Semitauonic b-hadron decays: A lepton flavor universality laboratory

Abstract

The study of lepton flavor universality violation (LFUV) in semitauonic b-hadron decays has become increasingly important in light of longstanding anomalies in their measured branching fractions, and the very large datasets anticipated from the LHC and Belle II. In this review, we undertake a comprehensive survey of the experimental environments and methodologies for semitauonic LFUV measurements at the B-factories and LHCb, along with a concise overview of the theoretical foundations and predictions for a wide range of semileptonic decay observables. We proceed to examine the future prospects to control systematic uncertainties down to the percent level, matching the precision of Standard Model (SM) predictions. Furthermore, we discuss new perspectives and caveats on combinations of the LFUV data and revisit the world averages for the {\cal R}(D^{(*)}) ratios. Here we demonstrate that different treatments for the correlations of uncertainties from D^{**} excited states can vary the current 3\sigma tension with the SM within a 1\sigma range. Prior experimental overestimates of D^{**}\tau\nu contributions may further exacerbate this. The precision of future measurements is also estimated; their power to exploit full differential information, and solutions to the inherent difficulties in self-consistent new physics interpretations of LFUV observables, are briefly explored.


Florian U. Bernlochner, Physikalisches Institut der Rheinischen Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany

Manuel Franco Sevilla, University of Maryland, College Park, MD, USA

Dean J. Robinson, Ernest Orlando Lawrence Berkeley National Laboratory, University of California, Berkeley, CA, USA

Guy Wormser, Laboratoire Irène Joliot-Curie, Université Paris-Saclay, CNRS/IN2P3, Orsay, France

(Dated: January 22, 2021)

To be submitted to Reviews of Modern Physics.

CONTENTS

I. Introduction
II. Theory of Semileptonic Decays
    A. SM operator and amplitudes
    B. Hadronic matrix elements and form factors
    C. Theoretical frameworks
        1. Dispersive bounds
        2. Heavy quark effective theory
        3. Quark models
        4. Lattice calculations
    D. Ground state observables and predictions
        1. Lepton universality ratios
        2. Longitudinal and polarization fractions
    E. Excited and other states
    F. $b \to u l \nu$ processes
    G. Inclusive processes
    H. New Physics operators
    I. Connection to other processes
III. Experimental Methods
    A. Production and detection of b-hadrons
        1. The B factories
        2. The LHCb experiment
    B. Particle reconstruction
        1. Charged particle reconstruction
        2. Neutral particle reconstruction
    C. Kinematic reconstruction: The b-hadron momentum
        1. B tagging at the B factories
        2. $\tau \to \pi^-\pi^+\pi^-\nu$ vertex reconstruction at LHCb
        3. Rest frame approximation with $\tau \to \mu\nu\nu$ at LHCb
IV. Experimental Tests of Lepton Flavor Universality
    A. B-factory measurements with hadronic tags
        1. ${\cal R}(D^{(*)})$ with $\tau \to \ell\nu\nu$
        $B \to \pi\tau\nu$ decays
    B. Belle measurements with semileptonic tags
        1. ${\cal R}(D^{(*)})$ with $\tau \to \ell\nu\nu$
        ${\cal R}(D^{*+})$ with $\tau \to \mu\nu\nu$
        ${\cal R}(D^{*+})$ with $\tau \to \pi^-\pi^+\pi^-\nu$
        ${\cal R}(J/\psi)$ with $\tau \to \mu\nu\nu$
        $\tau$ polarization with $\tau \to \pi\nu$ and $\tau \to \rho\nu$
        $D^*$ polarization with inclusive tagging
V. Common Systematic Uncertainties and Future Prospects
    A. Monte Carlo simulation samples
    B. Modeling of $B \to D^{(*)} l \nu$
    C. $B \to D^{**}\ell\nu$ and $B \to D^{**}\tau\nu$ backgrounds
        1. Systematic uncertainties evaluation and control
        2. $D^{**}$ branching fraction assumptions in ${\cal R}(D^{(*)})$ analyses
    D. Modeling other signal modes
    E. Other background contributions
    F. Other systematic uncertainties
VI. Combination and Interpretation of the Results
    A. Dissection of ${\cal R}(D^{(*)})$ results and SM tensions
    B. Revisiting ${\cal R}(D^{(*)})$ world averages
    C. Inclusive versus exclusive saturation
    D. New Physics interpretations
        1. Parametrization of SM tensions
        2. Sensitivity and biases in recovered observables
    E. Connection to FCNCs
VII. Prospects and Outlook
    A. Measurement of the ratios ${\cal R}(H_{c,u})$
        1. Prospects for ${\cal R}(H_{c,u})$ at LHCb
        2. Prospects for ${\cal R}(H_{c,u})$ at Belle II
    B. Exploiting full differential information
        1. Angular analyses and recovered observables
        2. Future strategies
    C. Outlook for future colliders
    D. Parting thoughts
Acknowledgments
References

I. INTRODUCTION

Over the past decade, collider experiments have provided ever-more precise measurements of Standard Model (SM) parameters, while direct collider searches for new interactions or particles have yielded ever-more stringent bounds on New Physics (NP) beyond the SM. This, in turn, has brought renewed attention to the NP discovery potential of indirect searches: measurements that compare the interactions of different species of elementary SM particles to SM expectations.

A key feature of the Standard Model is the universality of the electroweak gauge coupling to the three known fermion generations or families. In the lepton sector, this universality results in an accidental lepton flavor symmetry that is broken in the SM (without neutrino mass terms) only by the Higgs Yukawa interactions responsible for generating the charged lepton masses. A key prediction, then, of the Standard Model is that physical processes involving charged leptons should feature a lepton flavor universality: an approximate lepton flavor symmetry among physical observables, such as decay rates or scattering cross-sections, that is broken in the SM only by charged lepton mass terms in the amplitude and phase space. (Effects of additional Dirac or Majorana neutrino mass terms in extensions of the SM are negligible in all contexts we consider.) In the common parlance of the literature, testing for lepton flavor universality violation (LFUV) in any particular process thus refers to measuring deviations in the size of lepton flavor symmetry breaking versus SM predictions.

An observation of LFUV would clearly establish the presence of physics beyond the Standard Model, and could thus provide an indirect window into resolutions of the nature of dark matter, the origins of the matter-antimatter asymmetry, or the dynamics of the electroweak scale itself. Decades of LFUV measurements have yielded results predominantly in agreement with SM predictions.
Various strong constraints have been obtained from (semi)leptonic decays of light hadrons, gauge bosons, or leptonic $\tau$ decays (see e.g. (Zyla et al., 2020)), among a plethora of other measurements. A notable recent addition is the measurement of ${\cal B}(W \to \tau\nu)/{\cal B}(W \to \mu\nu)$ (Aad et al., 2020), resolving a long-standing LFUV anomaly from LEP that deviated from the SM prediction by more than $2\sigma$. Moreover, sources of LFUV that implicate NP interactions with the first two quark generations are typically strongly constrained by e.g. precision $K$-$\bar{K}$ and $D$-$\bar{D}$ mixing measurements. Such LFUV bounds involving third generation quarks, however, are typically much weaker (Cerri et al., 2019).

This review focuses on the rich experimental landscape for testing LFUV in semileptonic b-hadron decays. Not only do these decays provide a high statistics laboratory to measure LFUV that is (relatively) theoretically clean, but results from the last decade of measurements have indicated anomalously high rates for various semitauonic $b \to c\tau\nu$ decays compared to precision SM predictions. In particular, the ratios

$$ {\cal R}(D^{(*)}) = \frac{{\cal B}(B \to D^{(*)}\tau\nu)}{{\cal B}(B \to D^{(*)}\ell\nu)}, \qquad \ell = e, \mu, \tag{1} $$

where $D^{(*)}$ refers to both $D$ and $D^*$ mesons, deviate from SM predictions at the $3\sigma$ level when taken together (Amhis et al., 2019). (We revisit later the construction of these world averages and their degree of tension with the SM.) Apart from these results, there are additional measurements for various other $b \to c\tau\nu$ decays and other observables, including ${\cal R}(J/\psi)$, the $\tau$ polarization and $D^*$ longitudinal fractions (see Sec. IV). Some of these measurements presently agree with SM predictions only at the level of roughly $1\sigma$ or more, and when combined with ${\cal R}(D^{(*)})$ can mildly increase the degree of tension with the SM. Some tensions also currently exist in several $b \to see$ versus $b \to s\mu\mu$ transitions, each at a level above $2\sigma$ (Aaij et al., 2017b, 2019c).

Upcoming runs of the LHC, the high-luminosity (HL)-LHC, and Belle II will yield large new datasets for a wide range of $b \to c\tau\nu$ and $b \to u\tau\nu$ processes. Given this expected deluge of data, it is important to review and synthesize our understanding of the various strategies and channels through which LFUV might be discovered. To this end, we undertake this review along two different threads. First, in Sec. II we provide a compact yet comprehensive overview of the current theoretical state-of-the-art for the SM (and NP) description of semitauonic [...] b-hadron in Sec. III.C and provide a comparison between the two hadronic B tag measurements of ${\cal R}(D^{(*)})$ by BABAR and Belle in Sec. IV.A.1.

These two threads of the review are woven in Secs. V and VI into discussions of the main challenges arising from systematic uncertainties, and into discussions of current interpretations and combinations of the data, respectively. In particular, in Sec. V we provide an extended analysis of the main sources of systematic uncertainty in the LFUV measurements, and the prospects to control them in the future down to the percent level. This will be essential for establishing a conclusive tension with the Standard Model. We examine key challenges in computation, the modeling of b-hadron semileptonic decays in signal and background modes, and estimations of other important backgrounds. We also point out the potential sensitivity of ${\cal R}(D^{(*)})$ analyses to the assumptions used for the $B \to D^{**}\tau\nu$ branching fractions (Sec. V.C.2), which are presently overestimated compared to SM predictions.

Section VI begins by examining the ${\cal R}(D^{(*)})$ results and other SM tensions for different light-lepton normalization modes or isospin channels, before turning to revisit entirely the world average combinations of the ${\cal R}(D^{(*)})$ ratios.
We specifically analyze the sensitivity of these combinations to the treatment of the correlation structure assigned to the uncertainties from $B \to D^{**}\ell\nu$ decays across different measurements, and show that different treatments may vary the degree of the current $\sim 3\sigma$ tension with the SM over approximately a $1\sigma$ range. As an illustration, incorporating such correlations as a free fit parameter in the combination, we show that the resulting ${\cal R}(D^{(*)})$ world averages would feature a tension above $3\sigma$, higher than that of the current world average (Amhis et al., 2019). We further explore a comparison of inclusive versus exclusive measurements; caveats and challenges in establishing NP interpretations of the current ${\cal R}(D^{(*)})$ anomalies; and possible connections to anomalies in neutral current rare B decays.

Beyond the current state-of-the-art, in Sec. VII we proceed to explore the power of future LFUV ratio measurements for a variety of hadronic states, taking into account the discussed prospects for the evolution of the systematic uncertainties and the data samples that LHCb and Belle II are expected to collect over the next two decades (Sec. VII.A). The power of future analyses to exploit full differential information is briefly explored (Sec. VII.B), as well as the role of proposed future colliders (Sec. VII.C).

II. THEORY OF SEMILEPTONIC DECAYS

In this section we introduce the foundational theoretical concepts required to describe $b \to c l \nu$ semileptonic decays. Throughout this review, we adopt the notation

$$ l = \tau, \mu, e, \qquad \ell = \mu, e. \tag{2} $$

While our focus is the Standard Model (SM) description of $b \to c l \nu$, in some contexts we shall present a model-independent discussion, in order to accommodate discussion of beyond Standard Model (BSM) physics. We discuss first $B \to D^{(*)} l \nu$ decays, since they are of predominant experimental importance in current measurements, before turning to processes involving excited states, charm-strange mesons, charmonia, baryons, as well as $b \to u l \nu$ and inclusive processes. The LFUV observables (anticipating their definitions below) for which predictions are discussed, and their respective sections, comprise:

    ${\cal R}(D^{(*)})$ : Sec. II.D.1
    $F_L(D^*)$, $P_\tau(D^{(*)})$ : Sec. II.D.2
    ${\cal R}(D^{**})$ : Sec. II.E
    ${\cal R}(D_s^{(*)})$ : Sec. II.E
    ${\cal R}(J/\psi)$ : Sec. II.E
    ${\cal R}(\Lambda_c^{(*)})$ : Sec. II.E
    ${\cal R}(\pi)$ : Sec. II.F
    ${\cal R}(\rho)$, ${\cal R}(\omega)$ : Sec. II.F
    ${\cal R}(X_c)$ : Sec. II.G

A. SM operator and amplitudes

In the SM, $b \to c l \nu$ processes are mediated by the weak charged current, generating the usual $V-A$ four-Fermi operator

$$ {\cal O}_{\rm SM} = 2\sqrt{2}\, G_F V_{cb} \big(\bar{c}\gamma^\mu P_L b\big)\big(\bar{l}\gamma_\mu P_L \nu_l\big), \tag{3} $$

at leading electroweak order. Here the projectors $P_{L,R} = (1 \mp \gamma^5)/2$, and $G_F^{-1} = 8 m_W^2/(\sqrt{2} g_2^2) = \sqrt{2} v^2$, with $v \simeq 246.22$ GeV and $g_2$ denoting the $SU(2)$ weak coupling constant. In diagrammatic language, the corresponding amplitude for this charged current process is

[Eq. (4): Feynman diagram for ${\cal A}_{\rm SM}$, showing a $b \to c$ transition that emits a virtual $W^{-*}$ of momentum $q^\mu$, which decays to $l^- \bar{\nu}$; each weak vertex carries a coupling $g_2/\sqrt{2}$, with the CKM factor at the quark vertex.]

in which the quarks may be 'dressed' into various different hadrons. It is conventional to define the momentum $q = p - p' = p_l + p_\nu$, where $p$ ($p'$) is the beauty (charm) hadron momentum.

Figure 1. Left: Definition of the $\theta_l$ and $\phi_l$ helicity angles in the lepton pair rest frame. Center: Definition of the $\theta_v$ and $\phi_v$ helicity angles in the $D^*$ rest frame. Right: Definition of the $\theta_h$ and $\phi_h$ helicity angles in the $\tau$ rest frame, for $B \to D^{(*)}(\tau \to h\nu)\bar{\nu}$ decay.

The leptonic amplitude $W \to l\nu$ always takes the form of a Wigner-$D$ function $D^j_{m_1,m_2}(\theta_l, \phi_l)$, with $j = 0$ or $1$, and $|m_{1,2}| \le j$. The helicity angle $\theta_l$ is defined herein as in Fig. 1. We show also in Fig. 1 the definition of helicity angles for subsequent $D^* \to D\pi$ or $\tau \to h\nu$ decays, for example, where $h$ is any hadronic system or $\ell\nu$. The helicity angle definition also applies for the case of $D^* \to D\gamma$, though with a different fully differential rate. Some literature uses the definition $\theta_l \to \pi - \theta_l$, so caution must be used in adapting fits to fully differential measurements from one convention to the other. The phase $\phi_l$ is unphysical unless defined with reference to spin-polarizers of the charm or beauty hadronic system or the lepton, such as the subsequent decay kinematics of the $\tau$ or charm hadron, or the spin of the initial b-hadron. For example, in $B \to (D^* \to D\pi)\ell\nu$, the only physical phase is $\chi \equiv \phi_l - \phi_v$.

B. Hadronic matrix elements and form factors

The predominant theory uncertainty in $B \to D^{(*)} l \nu$ arises in the description of the hadronic matrix elements $\langle D^{(*)} | \bar{c}\,\Gamma\, b | \bar{B} \rangle$, where (anticipating a discussion of New Physics (NP) below) $\Gamma$ is any Dirac operator. More generally, one seeks a theoretical framework to describe the matrix elements $\langle {}^{2s_c+1}(L_c)_{J_c} | \bar{c}\,\Gamma\, b | {}^{2s_b+1}(L_b)_{J_b} \rangle$, using here the spectroscopic notation to describe the hadron in terms of its quark constituents' total spin $s$, their orbital angular momentum $L = S, P, D, \ldots$, and the total angular momentum of the hadron $J$. We focus first on the description for $B \to D^{(*)}$, i.e. ${}^1S_0 \to {}^1S_0$ or ${}^3S_1$: the ground state charmed mesons.

[Footnote: All definitions and sign conventions below apply to $b \to c$ transitions; they may be extended to $\bar{b} \to \bar{c}$ with appropriate sign changes. To emphasize this, while we do not typically distinguish between $\bar{B} \to D^{(*)}$ and $B \to \bar{D}^{(*)}$ in this discussion, we do retain such notation in the explicit definition of matrix elements or where charge assignments of other particles have been made explicit. Throughout the review, inclusion of charge-conjugate decay modes is implied, unless stated otherwise.]

For $B \to D^{(*)}$ SM transitions, the matrix elements are represented by two (four) independent form factors.
In terms of three common form factor bases,

$$ \langle D | \bar{c}\gamma^\mu b | \bar{B} \rangle = f_+ (p + p')^\mu + (f_0 - f_+)\, q^\mu\, \frac{m_B^2 - m_D^2}{q^2} = \sqrt{m_B m_D}\,\big[ h_+ (v + v')^\mu + h_- (v - v')^\mu \big], \tag{5a} $$

$$ \langle D^* | \bar{c}\gamma^\mu b | \bar{B} \rangle = 2i\,\tilde{g}\, \varepsilon^{\mu\nu\alpha\beta} \epsilon^*_\nu p'_\alpha p_\beta = i\sqrt{m_B m_{D^*}}\; h_V\, \varepsilon^{\mu\nu\alpha\beta} \epsilon^*_\nu v'_\alpha v_\beta = \frac{2iV}{m_B + m_{D^*}}\, \varepsilon^{\mu\nu\alpha\beta} \epsilon^*_\nu p'_\alpha p_\beta, \tag{5b} $$

$$ \langle D^* | \bar{c}\gamma^\mu\gamma^5 b | \bar{B} \rangle = f\, \epsilon^{*\mu} + a_+ (\epsilon^* \cdot p)(p + p')^\mu + a_- (\epsilon^* \cdot p)\, q^\mu $$
$$ \qquad = \sqrt{m_B m_{D^*}}\,\big[ h_{A_1}(w + 1)\, \epsilon^{*\mu} - h_{A_2}(\epsilon^* \cdot v)\, v^\mu - h_{A_3}(\epsilon^* \cdot v)\, v'^\mu \big] \tag{5c} $$
$$ \qquad = A_1 (m_B + m_{D^*})\, \epsilon^{*\mu} - A_2\, \frac{(\epsilon^* \cdot p)(p + p')^\mu}{m_B + m_{D^*}} + 2 m_{D^*}\, q^\mu\, (A_0 - A_3)\, \frac{\epsilon^* \cdot p}{q^2}, $$

noting $\langle D | \bar{c}\gamma^\mu\gamma^5 b | \bar{B} \rangle = 0$ because of angular momentum and parity conservation. Here we have used the spectroscopic basis $\{f_+, f_0, f, \tilde{g}, a_\pm\}$; the heavy quark symmetry (HQS) basis $\{h_\pm, h_V, h_{A_{1,2,3}}\}$; and the basis $\{V, A_{0,1,2,3}\}$, in which $2 m_{D^*} A_3 = A_1 (m_B + m_{D^*}) - A_2 (m_B - m_{D^*})$. The velocities are $v = p/m_B$ and $v' = p'/m_{D^{(*)}}$, and the recoil parameter is

$$ w = v \cdot v' = \frac{m_B^2 + m_{D^{(*)}}^2 - q^2}{2\, m_B\, m_{D^{(*)}}}. \tag{6} $$

The form factors are functions of $q^2$ or equivalently $w$.
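As a minimal numerical sketch of Eq. (6), the snippet below converts between $q^2$ and $w$ and evaluates the kinematic range of $w$ for $B \to D^*$. The meson and $\tau$ masses are approximate values assumed here for illustration, and the helper names `recoil_w` and `q2_from_w` are hypothetical.

```python
# Approximate masses in GeV: illustrative inputs assumed here, not quoted from the text.
M_B = 5.280
M_DSTAR = 2.010
M_TAU = 1.777


def recoil_w(q2, m_parent, m_daughter):
    """Eq. (6): w = (m_B^2 + m_D^2 - q^2) / (2 m_B m_D)."""
    return (m_parent**2 + m_daughter**2 - q2) / (2.0 * m_parent * m_daughter)


def q2_from_w(w, m_parent, m_daughter):
    """Inverse of Eq. (6): q^2 as a function of the recoil parameter w."""
    return m_parent**2 + m_daughter**2 - 2.0 * w * m_parent * m_daughter


# Zero recoil corresponds to the maximal momentum transfer q^2 = (m_B - m_D*)^2.
q2_max = (M_B - M_DSTAR) ** 2
w_zero_recoil = recoil_w(q2_max, M_B, M_DSTAR)

# Maximal recoil: q^2 -> 0 for (nearly) massless leptons, q^2 = m_tau^2 for the tau.
w_max_light = recoil_w(0.0, M_B, M_DSTAR)
w_max_tau = recoil_w(M_TAU**2, M_B, M_DSTAR)

print(f"w at zero recoil      : {w_zero_recoil:.4f}")
print(f"w_max (light leptons) : {w_max_light:.4f}")
print(f"w_max (tau)           : {w_max_tau:.4f}")
```

The reduced range of $w$ for the $\tau$ mode reflects the smaller phase space available to semitauonic decays.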
Their explicit forms may also involve the scheme-dependent parameters $m_b/m_c$ and $\alpha_s$, though any such scheme-dependency must vanish in physical quantities. In the HQS basis, $h_{A_1}$ and the three form factor ratios

$$ R_1(w) = \frac{h_V}{h_{A_1}}, \qquad R_2(w) = \frac{h_{A_3} + r^* h_{A_2}}{h_{A_1}}, \qquad R_0(w) = \frac{(w + 1)\, h_{A_1} - (w - r^*)\, h_{A_3} - (1 - w r^*)\, h_{A_2}}{(1 + r^*)\, h_{A_1}}, \tag{7} $$

where $r^{(*)} = m_{D^{(*)}}/m_B$, fully describe the $B \to D^*$ transition, noting $R_0$ enters only in terms proportional to $m_l$.

Particular care must be taken with sign conventions in Eqs. (5): For $B \to D^{(*)}$, the conventional choice in the literature, and here, is such that ${\rm Tr}[\gamma^\mu\gamma^\nu\gamma^\sigma\gamma^\rho\gamma^5] = +4i\,\varepsilon^{\mu\nu\rho\sigma}$, equivalent to fixing the identity $\sigma^{\mu\nu}\gamma^5 \equiv -(i/2)\,\varepsilon^{\mu\nu\rho\sigma}\sigma_{\rho\sigma}$, with $\sigma^{\mu\nu} = (i/2)[\gamma^\mu, \gamma^\nu]$. One may further choose either $\varepsilon^{0123} = +1$ or $-1$. In the $B \to D^{**}$ literature, as well as for $\Lambda_b \to \Lambda_c$, typically the choice is instead ${\rm Tr}[\gamma^\mu\gamma^\nu\gamma^\sigma\gamma^\rho\gamma^5] = -4i\,\varepsilon^{\mu\nu\rho\sigma}$, equivalent to $\sigma^{\mu\nu}\gamma^5 \equiv +(i/2)\,\varepsilon^{\mu\nu\rho\sigma}\sigma_{\rho\sigma}$. These sign choices affect the sign of $R_1$, but leave physical quantities unchanged provided they are used consistently both in the form factor definitions and in the calculation of the amplitudes. Care must be taken in adapting form factor fit results obtained in one convention to expressions defined in the other. In our sign conventions, the form factor ratio $R_1 > 0$.

[Footnote: The form factor $\tilde{g}$ is often written as $g$, but should not be confused with $g = 2\tilde{g}$ in the helicity basis below.]

Another basis commonly used for $B \to D^*$ decays is the helicity basis (cf. (Boyd et al., 1996, 1997)), with form factors $\{g, f, F_1, P_1\}$ that are particularly convenient for expressing the $B \to D^*$ helicity amplitudes. Explicit relations between the HQS and helicity bases are

$$ h_{A_1} = \frac{f}{m_B \sqrt{r^*}\,(w + 1)}, \qquad h_V = g\, m_B \sqrt{r^*}, \tag{8a} $$
$$ h_{A_1}\big( w - r^* - (w - 1) R_2 \big) = \frac{F_1}{m_B^2 \sqrt{r^*}\,(w + 1)}, \tag{8b} $$
$$ h_{A_1} R_0 = P_1. \tag{8c} $$

The SM differential rate can then be written compactly in terms of Legendre polynomials of $\cos\theta_l$,

$$ \frac{d^2\Gamma}{dw\, d\cos\theta_l} = 2\Gamma_0 \sqrt{w^2 - 1}\; r^{*3} \left[ \frac{\bar{q}^2 - r_l^2}{\bar{q}^2} \right]^2 \Bigg\{ \left( 1 + \frac{r_l^2}{2\bar{q}^2} \right)\big( H_+^2 + 2\bar{q}^2 H_1^2 \big) + \frac{3 r_l^2}{2\bar{q}^2}\, H_0^2 + \cos\theta_l\, H_{+0} + \frac{3\cos^2\theta_l - 1}{2} \left[ \frac{\bar{q}^2 - r_l^2}{\bar{q}^2} \right] \big( \bar{q}^2 H_1^2 - H_+^2 \big) \Bigg\}, \tag{9} $$

in which $\Gamma_0 \equiv G_F^2\, \eta_{\rm EW}^2\, |V_{cb}|^2\, m_B^5/(192\pi^3)$, $r_l = m_l/m_B$, $\bar{q}^2 = q^2/m_B^2 = 1 - 2 r^* w + r^{*2}$, $\eta_{\rm EW} \simeq 1 + (\alpha/\pi)\log(m_Z/m_B) \simeq 1.0066$, and

$$ H_1^2 = \frac{f^2}{r^* m_B^2} + g^2\, r^* m_B^2\, (w^2 - 1), \tag{10a} $$
$$ H_+^2 = \frac{F_1^2}{r^* m_B^4}, \tag{10b} $$
$$ H_0^2 = P_1^2\, (r^* + 1)^2\, (w^2 - 1), \tag{10c} $$
$$ H_{+0} = 6\,\bar{q}^2 f g \sqrt{w^2 - 1} - \frac{3 r_l^2}{\bar{q}^2} \sqrt{H_+^2 H_0^2}. \tag{10d} $$

The $\theta_l$-independent term in Eq. (9) is simply $\tfrac{1}{2}\, d\Gamma/dw$. The overall sign of the $\cos\theta_l$ term, and the relative sign of the $f g$ term in $H_{+0}$, are sensitive to sign conventions. In the massless lepton limit, it is common to express the differential rate $d\Gamma/dw$ in terms of the single form factor combination

$$ {\cal F}(w)^2 = \frac{H_+^2 + 2\bar{q}^2 H_1^2}{(1 - r^*)^2 (w + 1)^2 + 4 w (w + 1)\, \bar{q}^2}, \tag{11} $$

normalized such that ${\cal F}(1) = h_{A_1}(1)$.

The $B \to D$ rate may be expressed similarly.
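The $B \to D^*$ relations between the HQS and helicity bases can be cross-checked numerically. The sketch below assumes the standard relations $f = m_B\sqrt{r^*}(w+1)\,h_{A_1}$, $g = h_V/(m_B\sqrt{r^*})$ and $F_1 = m_B^2\sqrt{r^*}(w+1)\,h_{A_1}[w - r^* - (w-1)R_2]$, together with the usual massless-lepton combination ${\cal F}(w)^2$; the function names and all numerical inputs are hypothetical. It verifies that ${\cal F}(1) = h_{A_1}(1)$, and that ${\cal F}(w)$ reduces to $h_{A_1}(w)$ when $R_1 = R_2 = 1$, as expected in the heavy quark limit.

```python
import math


def helicity_form_factors(w, h_a1, r1, r2, m_b, r_star):
    """Return (f, g, F_1) from h_A1, R_1 = h_V/h_A1 and R_2 at recoil w."""
    root = math.sqrt(r_star)
    f = h_a1 * m_b * root * (w + 1.0)
    g = r1 * h_a1 / (m_b * root)
    f1 = h_a1 * (w - r_star - (w - 1.0) * r2) * m_b**2 * root * (w + 1.0)
    return f, g, f1


def curly_f_squared(w, h_a1, r1, r2, m_b, r_star):
    """Massless-lepton combination F(w)^2 built from H_+^2 and H_1^2."""
    f, g, f1 = helicity_form_factors(w, h_a1, r1, r2, m_b, r_star)
    qbar2 = 1.0 - 2.0 * r_star * w + r_star**2            # q^2 / m_B^2
    h1_sq = f**2 / (r_star * m_b**2) + g**2 * r_star * m_b**2 * (w**2 - 1.0)
    hplus_sq = f1**2 / (r_star * m_b**4)
    denom = (1.0 - r_star) ** 2 * (w + 1.0) ** 2 + 4.0 * w * (w + 1.0) * qbar2
    return (hplus_sq + 2.0 * qbar2 * h1_sq) / denom


# Hypothetical inputs chosen only to exercise the formulas.
M_B, R_STAR = 5.28, 0.38

# At zero recoil, F(1)^2 equals h_A1(1)^2 for any R_1, R_2.
zero_recoil = curly_f_squared(1.0, h_a1=0.9, r1=1.3, r2=0.85, m_b=M_B, r_star=R_STAR)

# With R_1 = R_2 = 1 (heavy quark limit), F(w)^2 equals h_A1(w)^2 at any w.
hq_limit = curly_f_squared(1.3, h_a1=0.7, r1=1.0, r2=1.0, m_b=M_B, r_star=R_STAR)

print(zero_recoil, hq_limit)
```

Checks of this kind are a cheap way to catch sign or normalization mismatches when porting fit results between form factor bases.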
In the form factor basis $\{{\cal G} \equiv V_1, S_1\}$, defined via

$$ {\cal G} \equiv V_1 = h_+ - \frac{1 - r}{1 + r}\, h_-, \tag{12a} $$
$$ S_1 = h_+ - \frac{1 + r}{1 - r}\, \frac{w - 1}{w + 1}\, h_-, \tag{12b} $$

the SM differential rate has the same form as Eq. (9) and Eqs. (10), but with $r^* \to r$,

$$ H_+^2 = V_1^2\, (1 + r)^2\, (w^2 - 1), \tag{13a} $$
$$ H_0^2 = S_1^2\, (1 - r)^2\, (w + 1)^2, \tag{13b} $$

and, by definition, no $f$ or $g$ terms, i.e. $H_1 = 0$ and $H_{+0} = -(3 r_l^2/\bar{q}^2)\sqrt{H_+^2 H_0^2}$.

Note that the expressions of this section apply similarly to any other ${}^1S_0 \to {}^1S_0$ or ${}^3S_1$ transition, such as $B \to \pi l \nu$ or $B \to \rho l \nu$ (with the additional replacement of $V_{cb} \to V_{ub}$).

C. Theoretical frameworks

Various theoretical approaches exist to parametrize the $B \to D^{(*)}$ or other exclusive decay form factors. Broadly, these fall into four overlapping categories:

1. Use of the functional properties of the hadronic matrix elements (analyticity, unitarity, and dispersion relations) to constrain the form factor structure;
2. Use of heavy quark effective theory (HQET) to generate order-by-order relations in $1/m_{c,b}$ and $\alpha_s$ between form factors;
3. Various quark models, including those that may approximately compute the form factors (in various regimes), such as QCD sum rule (QCDSR) and light cone sum rule (LCSR) approaches; and
4. Lattice QCD (LQCD) calculations, presently available only for a limited subset of form factors and kinematic regimes.

The details of the various approaches to the form factor parametrization are particularly important for measurements that are sensitive to the differential shape of exclusive semileptonic decays, such as the extraction of the CKM matrix element $|V_{cb}|$. Hadronic uncertainties, however, mostly factor out of observables that consider ratios of $|V_{cb}|$-dependent quantities, including e.g. measurements that probe lepton universality relations between $B \to D^{(*)}\ell\nu$ and $B \to D^{(*)}\tau\nu$ decays or other exclusive processes. Instead, in the latter context, the main role and importance of form factor parametrizations lies in their ability to generate predictions for lepton universality relations, and the precision thereof.

[Footnote to Eq. (12a): Some literature uses the notation $V_1$, while others use ${\cal G}$.]

1. Dispersive bounds

A dispersion relations-based approach does not alone generate lepton universality relations between the $B \to D^{(*)} l \nu$ rates or other exclusive processes, but does provide crucial underlying theoretical inputs to approaches that do. The dispersive approach (Boyd et al., 1996, 1997) begins with the observation that the matrix element $\langle H_c | J | H_b \rangle$ for a hadronic transition $H_b \to H_c$, mediated by current $J = \bar{c}\,\Gamma\, b$, may be analytically continued beyond the physical regime $q^2 < (m_{H_b} - m_{H_c})^2 \equiv q_-^2$ into the complex $q^2$ plane. For $q^2 > (m_{H_b} + m_{H_c})^2 \equiv q_+^2$, where $H_{c,b}$ denote the lightest pair of hadrons that couple to $J$, the matrix element features a branch cut from the crossed process $H_b H_c^\dagger$ pair production. For $B \to D^*$ processes, it is typical to take $q_+^2 \equiv (m_B + m_{D^*})^2$ for both vector and axial vector currents. For e.g. $B_c \to J/\psi$, the branch points are taken as $(m_{B^{(*)}} + m_D)^2$ for (axial)vector currents. A $\bar{b}c$ bound state that is created by $J$ but with mass $m^2 < q_+^2$ is a 'subthreshold' resonance.

The conformal transformation

$$ z(q^2, q_0^2) = \frac{\sqrt{q_+^2 - q^2} - \sqrt{q_+^2 - q_0^2}}{\sqrt{q_+^2 - q^2} + \sqrt{q_+^2 - q_0^2}} \tag{14} $$

maps $|q^2| < q_+^2$ ($|q^2| > q_+^2$) to the interior (exterior) of the unit circle $|z| = 1$, centered at $q^2 = q_0^2$. Two common choices of $q_0^2$ are $q_-^2$, in which case $z(w = 1) = 0$, or $q_+^2\big(1 - [1 - q_-^2/q_+^2]^{1/2}\big)$, which minimizes $|z(q^2 = 0)|$. This allows the matrix element to be written as an analytic function of $z$ on the unit disc $|z| \le 1$, up to simple poles that are expected at each 'subthreshold' resonance. These poles must fall on the interval $q_-^2 \le q^2 \le q_+^2 \Leftrightarrow (0 \ge)\, z_- \ge z \ge -1$.

The two-point function $\Pi_J = i \int d^4x\, e^{iqx} \langle 0 | T J^\dagger(x) J(0) | 0 \rangle$ obeys a once-subtracted dispersion relation

$$ \chi_J(q^2) \equiv \frac{\partial \Pi_J}{\partial q^2} = \frac{1}{\pi} \int dt\, \frac{{\rm Im}\,\Pi_J(t)}{(t - q^2)^2}. \tag{15} $$

The QCD correlator $\chi_J$ can be computed at one-loop in perturbative QCD for $q^2 > q_+^2$, and then analytically continued to $q^2 < q_-^2$. ${\rm Im}\,\Pi_J$ may be re-expressed as a phase-space-integrated sum over a complete set of $b$- and $c$-hadronic states, $\sim \sum_{X = H_b H_c^\dagger, \ldots} |\langle 0 | J | X \rangle|^2$, with appropriate parity and spin. E.g. for $J = \bar{c}\gamma^\mu b$, one may have $H_b H_c^\dagger = B D^\dagger$, $B D^{*\dagger}$ and so on. The positivity of each summand allows the dispersion relation to provide an upper bound, a so-called 'weak' unitarity bound, for any given hadron pair $H_b H_c^\dagger$. (A 'strong' unitarity bound would, by contrast, impose the upper bound on a finite sum of hadron pairs coupling to $J$.) Crossing symmetry permits these bounds to be applied to the transition matrix elements $\langle H_c | J | H_b \rangle$ of interest here.

Making use of the conformal transformation, the unitarity bound can be expressed in the form

$$ \frac{1}{2\pi i} \oint_{|z| = 1} \frac{dz}{z} \sum_i \big| P_i^J(z)\, \phi_i^J(z)\, F_i^J(z) \big|^2 \le 1, \tag{16} $$

in which $F_i^J$ is a basis of form factors and the 'outer' functions $\phi_i^J$ are analytic weight functions that encode their $q^2$-dependent prefactors arising in $\langle H_c | J | H_b \rangle$, as well as incorporating the $1/\sqrt{\pi \chi_J}$ prefactor. The additional Blaschke factors $P_i^J$ satisfy $|P_i^J(|z| = 1)| = 1$ by construction, and do not affect the integrand on the $|z| = 1$ contour. However, the choice $P_i^J = \prod_\alpha (z - z_{\alpha,i})/(1 - z\, z_{\alpha,i})$ explicitly cancels the (known) poles at $z = z_{\alpha,i}$ on the negative real axis. Each term in the sum must then be analytic, i.e. $P_i^J(z)\, \phi_i^J(z)\, F_i^J(z) = \sum_{n=0}^\infty a_{in}^J z^n$, so that Eq. (16) requires the $a_{in}^J$ coefficients to satisfy a unitarity bound $\sum_{i,n} |a_{in}^J|^2 \le 1$.

The BGL parametrization (Boyd et al., 1996, 1997) uses this approach to express the $f$, $g$, $F_1$ and $P_1$ form factors in terms of an analytic expansion in $z = z(q^2, q_-^2)$. In particular for the light lepton modes, with $F_A = f, F_1$,

$$ g(z) = \frac{1}{P_V(z)\, \phi_g(z)} \sum_n a_n^g z^n, \qquad \sum_n |a_n^g|^2 \le 1, $$
$$ F_A(z) = \frac{1}{P_A(z)\, \phi_{F_A}(z)} \sum_n a_n^{F_A} z^n, \qquad \sum_{F_A, n} |a_n^{F_A}|^2 \le 1, $$

noting $F_1(q_-^2)/\phi_{F_1}(q_-^2) = f(q_-^2)\, m_B (1 - r^*)/\phi_f(q_-^2)$ from Eq. (8b). This relatively unconstrained parametrization provides a hadronic model-independent approach to measuring $|V_{cb}|$ from light leptonic $B \to D^*\ell\nu$ modes, but does not relate $B \to D^*\tau\nu$ to $B \to D^*\ell\nu$: E.g. a fit to light lepton data, taking $m_\ell \to 0$, to determine $f$, $g$, $F_1$ provides no prediction for $P_1$, and hence no prediction for the $B \to D^*\tau\nu$ rate. (The general SM expectation remains, however, that the unitarity bound for $P_1$ should not be violated in a direct fit to $B \to D^*\tau\nu$ data.) Instead, additional theoretical inputs are required.
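The conformal map and the coefficient bound can be sketched in a few lines. The snippet below evaluates $z(q^2, q_0^2) = (\sqrt{q_+^2 - q^2} - \sqrt{q_+^2 - q_0^2})/(\sqrt{q_+^2 - q^2} + \sqrt{q_+^2 - q_0^2})$ for real $q^2 \le q_+^2$ and checks the bound $\sum_n |a_n|^2 \le 1$ on a truncated series. The masses are approximate values assumed for illustration, the coefficients are invented, and the outer functions and Blaschke factors are deliberately omitted, so this is not a usable form factor.

```python
import math

# Approximate masses in GeV: illustrative inputs assumed here.
M_B, M_DSTAR = 5.280, 2.010
Q2_PLUS = (M_B + M_DSTAR) ** 2    # pair-production threshold q_+^2
Q2_MINUS = (M_B - M_DSTAR) ** 2   # zero-recoil point q_-^2


def z_map(q2, q2_plus, q2_0):
    """Conformal variable z(q^2, q_0^2) for real q^2 <= q_+^2."""
    a = math.sqrt(q2_plus - q2)
    b = math.sqrt(q2_plus - q2_0)
    return (a - b) / (a + b)


def truncated_series(z, coeffs):
    """Evaluate sum_n a_n z^n for a finite list of coefficients."""
    return sum(a * z**n for n, a in enumerate(coeffs))


def satisfies_unitarity(coeffs):
    """Check the coefficient bound sum_n |a_n|^2 <= 1."""
    return sum(abs(a) ** 2 for a in coeffs) <= 1.0


# With q_0^2 = q_-^2 the zero-recoil point maps to z = 0.
z_zero_recoil = z_map(Q2_MINUS, Q2_PLUS, Q2_MINUS)
# The largest |z| in the light-lepton physical region occurs at q^2 = 0.
z_max = z_map(0.0, Q2_PLUS, Q2_MINUS)

hypothetical_coeffs = [0.01, -0.05, 0.10]   # invented values, for illustration only
print(z_zero_recoil, z_max, satisfies_unitarity(hypothetical_coeffs))
print(truncated_series(z_max, hypothetical_coeffs))
```

Because $|z|$ stays small throughout the physical region, a series truncated after a few terms is typically adequate, which is the practical appeal of this expansion.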

2. Heavy quark eﬀective theory

HQET inputs may be combined with the BGL ap-proach, in order to generate SM (or NP) predictionsfor lepton universality observables. A ‘heavy’ hadronis deﬁned as containing one heavy valence quark, i.e. m Q (cid:29) Λ QCD , dressed by light quark and gluon degrees offreedom—so-called ‘brown muck’—in a particular spin-state. An HQET (Eichten and Hill, 1990; Georgi, 1990;Isgur and Wise, 1989, 1990) (for a review, see e.g. (Neu-bert, 1994)) is an eﬀective ﬁeld theory of the brown

$/m_Q$. An apt analogy arises in atomic physics, in which the electronic states are insensitive to the nuclear spin state, up to hyperfine corrections. This provides a hadronic model-independent parametrization not only of the spectroscopy of heavy hadrons but also of order-by-order in $1/m_Q$ relations between their transition matrix elements. The form factors of $B \to D^{(*)}\ell\nu$ are then related to those of $B \to D^{(*)}\tau\nu$, allowing for lepton universality predictions.

In this language, the spectroscopic $^1S_0$ and $^3S_1$ states (e.g. the $D$ and $D^*$ or $B$ and $B^*$) may instead be considered to belong to a heavy quark (HQ) spin symmetry doublet of a pseudoscalar (P) and vector (V) meson, formed by the tensor product of the light degrees of freedom in a spin-parity $s_\ell^P = 1/2^-$ state, combined with the heavy quark spin: $\tfrac{1}{2}_{\rm HQ} \otimes \tfrac{1}{2}_{\rm light} = 0 \oplus 1$. Their masses can be expressed as

$$ m_{P,V} = m_Q + \bar\Lambda - \frac{\lambda_1}{2 m_Q} \mp (2J_{V,P}+1)\,\frac{\lambda_2}{2 m_Q} + \ldots\,, \tag{17} $$

where $m_Q$ is the heavy quark mass parameter of HQET, $\bar\Lambda = \mathcal{O}(\Lambda_{\rm QCD})$ is the brown muck kinetic energy for $m_Q \to \infty$, and $\lambda_{1,2} = \mathcal{O}(\Lambda_{\rm QCD}^2)$. Furthermore, one expects that in the limit $m_Q \to \infty$ (and $\alpha_s \to 0$), $B \to D^{(*)}$ transitions should be insensitive to, and therefore preserve, the spin of the underlying heavy quarks, while being sensitive to the change in heavy quark velocity.

Following this intuition, the QCD kinetic term $\bar Q\,(i\slashed{D} - m_Q)\,Q$ may itself be reorganized into an effective theory of brown muck, i.e. an HQET, parametrized by the heavy quark velocity $v = p_Q/m_Q$, featuring a $1/m_Q$ expansion in which the leading order terms conserve heavy quark spin, while higher order terms in $1/m_Q$ do not. A heavy quark flavor violating interaction like $J = \bar c\,\Gamma\, b$ can be similarly reorganized, such that at leading order the transition is sensitive only to the difference of the incoming and outgoing heavy hadron velocities $v$ and $v'$, respectively. It is then natural to express the matrix elements as in Eq. (5), with the natural form factor basis in the SM being $h_\pm$, $h_V$, $h_{A_{1,2,3}}$.

When organized in this way, the key result is that any $B \to D^{(*)}$ matrix element can be written as a spin-trace

$$ \frac{\langle D^{(*)} |\, \bar c\,\Gamma\, b\, | B \rangle}{\sqrt{m_{D^{(*)}}\, m_B}} = -\,\xi(w)\, {\rm Tr}\big[ \bar H^{(c)}_{v'}\, \Gamma\, H^{(b)}_{v} \big] + \mathcal{O}(\varepsilon_c, \varepsilon_b, \hat\alpha_s)\,, \tag{18} $$

where $H^{(c,b)}$ are HQET representations of the HQ doublet and $\xi(w)$ is a leading Isgur-Wise function. Higher order terms in $\varepsilon_{c,b} = \bar\Lambda/(2 m_{c,b})$ can be similarly systematically constructed in terms of universal subleading Isgur-Wise functions, while radiative corrections in $\hat\alpha_s = \alpha_s/\pi$ can be incorporated at arbitrary fixed order. Heavy quark flavor symmetry implies that $\xi(1) = 1$, preserved at order $\varepsilon_{c,b}$ by Luke's theorem.

The CLN parametrization (Caprini et al., 1998) applies dispersive bounds to the $B \to D$ form factor $V_1$, expanded up to cubic order as

$$ \frac{V_1(w)}{V_1(1)} = 1 - \rho_1^2\,(w-1) + c_1\,(w-1)^2 + d_1\,(w-1)^3 + \ldots\,. \tag{19} $$

It thus extracts approximate relations between the parameters $\rho_1^2$, $c_1$ and $d_1$, by saturating the dispersive bounds at (the then) $1\sigma$ uncertainty in the QCD correlators $\chi_J$. The parametrization then makes use of heavy quark symmetry to relate this form factor to all other form factors in the $B \to D^{(*)}$ system, incorporating additional quark model inputs from QCD sum rules (QCDSR, see below) to constrain the $1/m_{c,b}$ terms. In particular, predictions are obtained for a $z$ expansion of $h_{A_1}$, with coefficients dependent only on the slope parameter $\rho_{A_1}^2$, plus predictions for $R_{0,1,2}(w)$ up to a fixed order in $(w-1)$: $R_i(w) = R_i(1) + R_i'(1)(w-1) + \tfrac{1}{2} R_i''(1)(w-1)^2 + \ldots$.

The intercepts $R_i(1)$ are theoretically correlated order-by-order in the HQ expansion with the slopes and curvatures $R_i'(1)$ and $R_i''(1)$, and therefore must be determined simultaneously when measured. A common experimental fitting practice of floating $R_{1,2}(1)$ while keeping $R_{1,2}'(1)$ and $R_{1,2}''(1)$ fixed to their QCDSR predictions is inconsistent with HQET at subleading order when fits are performed to recent higher precision unfolded datasets, such as the 2017 Belle tagged analysis (Abdesselam et al., 2017). The BLPR parametrization (Bernlochner et al., 2017) removes this inconsistency, and exploits higher precision data-driven fits to the subleading IW functions to obviate the need for QCDSR inputs. It furthermore consistently incorporates the $1/m_{c,b}$ terms for NP currents, important for NP predictions of $B \to D^{(*)}\tau\nu$.

There has been long-standing debate about the size of the $1/m_c$ corrections, partly because quark model-based calculations predicted them to have coefficients somewhat larger than unity. Recent data-driven fits in the baryonic $\Lambda_b \to \Lambda_c$ system, however, provide good evidence that the $1/m_c$ corrections obey power counting expectations (Bernlochner et al., 2018b); see also (Bordone et al., 2020) with regard to $B_{(s)} \to D^{*}_{(s)}$.
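The kinematic variables used in the expansions above can be sketched numerically. The following is a minimal illustration, not the authors' fitting code: the meson masses are approximate values assumed here, the conformal variable is taken to be the commonly used zero-recoil-centred choice $z(w) = (\sqrt{w+1}-\sqrt{2})/(\sqrt{w+1}+\sqrt{2})$ (an assumption, since the definition of $z$ is not given in this passage), and the shape parameters passed to the functions are hypothetical placeholders.

```python
import math

# Approximate meson masses in GeV (assumed inputs, not taken from the text).
M_B = 5.279
M_D = 1.870


def recoil_w(q2, m_b_hadron, m_c_hadron):
    """Recoil parameter w = v.v' = (m_B^2 + m_D^2 - q^2) / (2 m_B m_D)."""
    return (m_b_hadron**2 + m_c_hadron**2 - q2) / (2.0 * m_b_hadron * m_c_hadron)


def w_max(m_b_hadron, m_c_hadron, m_lepton=0.0):
    """Largest kinematically allowed w, reached at q^2 = m_lepton^2."""
    return recoil_w(m_lepton**2, m_b_hadron, m_c_hadron)


def conformal_z(w):
    """Assumed conformal variable, vanishing at zero recoil (w = 1)."""
    s = math.sqrt(w + 1.0)
    return (s - math.sqrt(2.0)) / (s + math.sqrt(2.0))


def cubic_shape(w, rho2, c, d):
    """Cubic expansion of Eq. (19): 1 - rho2 (w-1) + c (w-1)^2 + d (w-1)^3."""
    x = w - 1.0
    return 1.0 - rho2 * x + c * x**2 + d * x**3


def ratio_expansion(w, r_at_1, slope, curvature):
    """Quadratic expansion R_i(w) = R_i(1) + R_i'(1)(w-1) + R_i''(1)(w-1)^2 / 2."""
    x = w - 1.0
    return r_at_1 + slope * x + 0.5 * curvature * x**2


if __name__ == "__main__":
    print("w_max (massless lepton):", w_max(M_B, M_D))
    print("w_max (tau):", w_max(M_B, M_D, 1.777))
    print("z at w_max:", conformal_z(w_max(M_B, M_D)))
```

The smallness of $z$ over the whole physical range is what makes a low-order expansion in $z$ useful; the $\tau$ mode simply probes a shorter interval in $w$.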

3. Quark models

Beyond dispersive bounds and HQET, quark model-based approaches have historically played an important role in descriptions of the form factors, and have provided useful constraints in generating lepton universality predictions. The ISGW2 parametrization (Isgur et al., 1989; Scora and Isgur, 1995) implements a non-relativistic constituent quark model, providing estimates of the form factors by expressing the transition matrix elements for

$\mathcal{O}(1/m_{c,b})$ constraints from heavy quark symmetry and higher-order hyperfine corrections.

The ISGW2 parametrization of the form factors is treated as fully predictive, being typically implemented without any undetermined parameters. This amounts to fixed choices for e.g. the heavy and light quark masses or the brown muck kinetic energy $\bar\Lambda$. It is therefore not considered to provide state-of-the-art form factors, compared to data-driven fits. Non-relativistic quark models may, however, be useful choices for double heavy hadron transitions such as $B_c \to J/\psi$ or $\eta_c$ (for a very recent example see e.g. (Penalva et al., 2020)), where heavy quark symmetry cannot be applied.

QCD sum rules (QCDSR) exploit the analytic properties of three-point correlators constructed by sandwiching an operator of interest with appropriate interpolating hadronic currents. This allows the expression of e.g. an Isgur-Wise function in terms of the Borel transform of the correlator, the latter of which can be computed in perturbation theory via an operator product expansion (OPE). One must further assume quark-hadron duality to estimate the spectral densities of relevant excited states. Renormalization improved results for the $1/m_{c,b}$ Isgur-Wise functions and their gradients at zero recoil are known (Ligeti et al., 1994; Neubert, 1994; Neubert et al., 1993a,b). While theoretical uncertainties associated with the perturbative calculations are well understood, there is no systematic approach to assessing uncertainties arising from quark-hadron duality and scale variations. Rough estimates of the uncertainties are large compared to the precision obtained by more recent data-driven methods.

Light cone sum rules (LCSR) operate in a similar spirit to QCDSR. They describe, however, the regime in which the outgoing hadron kinetic energy is large, by reorganizing the OPE such that one expands in the 'transverse distance' of partons from the light cone. LCSR have broad application in exclusive heavy-light quark transitions, such as for $b \to u$ transitions including $B \to \rho$, $\omega$, or $\pi$, in which the valence parton is highly boosted compared to the spectator.

4. Lattice calculations

Lattice QCD (LQCD) results are available for the form factors at zero recoil for both $B_{(s)} \to D_{(s)}$ and $B_{(s)} \to D^{*}_{(s)}$, with the most precise $B \to D^{(*)}$ results (Aoki et al., 2020)

$$ \mathcal{G}(1) \equiv V_1(1) = 1.\ldots_{\rm stat}(8)_{\rm sys}\,, \qquad \mathcal{F}(1) = 0.\ldots_{\rm stat}(12)_{\rm sys}\,. \tag{20} $$

LQCD results for both of the $B_{(s)} \to D_{(s)}$ form factors $f^{(s)}_{+,0}$ are available beyond zero recoil, with respect to the optimized expansion in $z = z(q^2, q^2_{\rm opt})$. Results for $B_{(s)} \to D^{*}_{(s)}$ form factors beyond zero recoil are still in progress.

The $B \to D$ LQCD data allow for lattice predictions for the differential rate of $B \to D\tau\nu$, and when combined with HQET relations plus QCD sum rule predictions, may also predict $B \to D^{*}\tau\nu$, but with slightly poorer precision compared to data-driven approaches (Bernlochner et al., 2017). Beyond zero-recoil LQCD results are also available for $B_c \to J/\psi\, l\nu$ (Harrison et al., 2020a) (see Sec. II.E), as well as for the baryonic $\Lambda_b \to \Lambda_c\, l\nu$ decays (Detmold et al., 2015), including NP matrix elements.

D. Ground state observables and predictions

1. Lepton universality ratios

Lepton universality in $b \to c\,l\nu$ may be probed by comparing the ratios of total rates for $l = e$, $\mu$ and $\tau$, in particular the ratio of the semitauonic to light semileptonic exclusive decays

$$ R(H_c) = \frac{\Gamma[H_b \to H_c \tau\nu]}{\Gamma[H_b \to H_c \ell\nu]}\,, \qquad \ell = e, \mu\,, \tag{21} $$

where $H_{c,b}$ are any allowed pair of $c$- and $b$-hadrons. (The ratios of the electron and muon modes are in agreement with SM predictions, i.e. near unity; see Sec. VI.A. One may also consider ratios $R(H_u)$ for $H_b \to H_u\tau\nu$ decays, in which the valence charm quark is replaced by a $u$ quark.) The ratios $R(H_c)$ should differ from unity not only because of the reduced phase space, as $m_\tau \gg m_{e,\mu}$, but also because of the mass-dependent coupling to the longitudinal $W$ mode. The theory uncertainties entering into the SM predictions for this quantity are then dominated by uncertainties in the form factor contributions coupling exclusively to the lepton mass, such as the form factor ratios $S_1/V_1$ and $R_0(w)$ in $B \to D$ and $D^*$, respectively.

In Table I we show a summary of various predictions as collated by the Heavy Flavor Averaging Group (HFLAV) (Amhis et al., 2019). Before 2017, $R(D^{(*)})$ predictions based on experimental data used the CLN parametrization, since this was the only experimentally implemented form factor parametrization. An unfolded analysis by Belle (Abdesselam et al., 2017) has since allowed use of other parametrizations, with the different (and more consistent) theoretical inputs as described in Table I. At present, given the different theoretical inputs and correlations in the results of these analyses, the HFLAV SM prediction is a naive arithmetic average of the $R(D)$ and $R(D^*)$ predictions and uncertainties, independently.

Table I. $R(D^{(*)})$ predictions as currently collated and arithmetically averaged by HFLAV. Predictions shown below the HFLAV line are not included in the arithmetic average.

Inputs | $R(D)$ | $R(D^*)$ | corr.
LQCD + Belle/BaBar data (a) | … ± 0.003 | n/a | n/a
LQCD + HQET $\mathcal{O}(\alpha_s, 1/m_{c,b})$ + Belle 2017 analysis (b, c) | … ± 0.003 | 0.… ± 0.003 | 0.…
… $\sim 1/m_c$ + Belle 2017 analysis (d) | n/a | 0.… ± 0.008 | n/a
BGL + BLPR + $\sim 1/m_c$ + Belle 2017 analysis (e) | … ± 0.004 | 0.… ± 0.005 | 0.…
HFLAV arithmetic average | … ± 0.003 | 0.… ± 0.005 | n/a
LQCD (f) | … ± 0.008 | n/a | n/a
CLN + Belle data (g) | n/a | 0.… ± 0.003 | n/a

(a) (Bigi and Gambino, 2016)
(b) (Abdesselam et al., 2017)
(c) The 'BLPR' parametrization (Bernlochner et al., 2017)
(d) Includes estimations of $1/m_c$ uncertainties (Bigi et al., 2017)
(e) Fits nuisance parameters for $1/m_c$ terms (Jaiswal et al., 2017)
(f) World average (Aoki et al., 2020)
(g) (Fajfer et al., 2012)

On occasion, the phase-space constrained ratio

$$ \widetilde{R}(H_c) = \frac{\int_{m_\tau^2}^{Q_-^2} dq^2\, \dfrac{d\Gamma[H_b \to H_c\tau\nu]}{dq^2}}{\int_{m_\tau^2}^{Q_-^2} dq^2\, \dfrac{d\Gamma[H_b \to H_c\ell\nu]}{dq^2}}\,, \qquad \ell = e, \mu\,, \tag{22} $$

is also considered, in which the relative phase-space suppression for the tauonic mode is factored out. For instance, the SM predictions are, using the fit results of (Bernlochner et al., 2017),

$$ \widetilde{R}(D) = 0.\ldots\,, \qquad \widetilde{R}(D^*) = 0.\ldots\,, \tag{23} $$

with a correlation coefficient of 0.…
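The difference between the full ratio of Eq. (21) and the phase-space constrained ratio of Eq. (22) lies only in the lower integration limit of the light-lepton mode. A minimal sketch follows, assuming the upper limit is $(m_{H_b} - m_{H_c})^2$, approximate masses, and purely illustrative toy shapes for $d\Gamma/dq^2$ (the toy shapes are hypothetical and carry no form factor information).

```python
# Approximate masses in GeV (assumed inputs, not taken from the text).
M_B, M_D, M_TAU = 5.279, 1.870, 1.777


def q2_limits(m_b_hadron, m_c_hadron, m_lepton):
    """Physical q^2 range: [m_lepton^2, (m_Hb - m_Hc)^2]."""
    return m_lepton**2, (m_b_hadron - m_c_hadron) ** 2


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def ratio_full(dgamma_tau, dgamma_light, m_b, m_c, m_tau, m_light=0.0):
    """Eq. (21): each mode is integrated over its own q^2 range."""
    lo_tau, hi = q2_limits(m_b, m_c, m_tau)
    lo_light, _ = q2_limits(m_b, m_c, m_light)
    return simpson(dgamma_tau, lo_tau, hi) / simpson(dgamma_light, lo_light, hi)


def ratio_constrained(dgamma_tau, dgamma_light, m_b, m_c, m_tau):
    """Eq. (22): both modes are integrated over the tau range only."""
    lo, hi = q2_limits(m_b, m_c, m_tau)
    return simpson(dgamma_tau, lo, hi) / simpson(dgamma_light, lo, hi)


def flat(q2):
    """Toy light-lepton spectrum (hypothetical, for illustration only)."""
    return 1.0


def toy_tau(q2):
    """Toy tau spectrum with a (1 - m_tau^2/q^2)^2 threshold factor (hypothetical)."""
    return (1.0 - M_TAU**2 / q2) ** 2


if __name__ == "__main__":
    print("toy R      :", ratio_full(toy_tau, flat, M_B, M_D, M_TAU))
    print("toy R-tilde:", ratio_constrained(toy_tau, flat, M_B, M_D, M_TAU))
```

With identical spectra in numerator and denominator the constrained ratio is exactly one, whereas the full ratio still shows the pure phase-space deficit; this is the sense in which Eq. (22) factors out the relative phase-space suppression.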

2. Longitudinal and polarization fractions

In the helicity basis for the $D^*$ polarization, the $D^* \to D\pi$ decay amplitudes within $B \to (D^* \to D\pi)\, l\nu$ decays are simply $L = 1$ spherical harmonics $e^{i\lambda\phi_v}\, Y_{1,\lambda}(\theta_v)$, with respect to the helicity angles defined in Fig. 1. That is, the $B \to (D^* \to D\pi)\, l\nu$ amplitudes may be expressed in the schematic form $\sum_\lambda A_\lambda[B \to D^* l\nu](\theta_l, \phi_l - \phi_v) \times Y_{1,\lambda}(\theta_v)$. The $D^*$ longitudinal polarization fraction

$$ F_{L,l}(D^*) = \frac{\Gamma_{\lambda=0}[B \to D^* l\nu]}{\Gamma[B \to D^* l\nu]}\,, \tag{24} $$

(another common notation is $F_{L,\tau}(D^*) = F_L^{D^*}$) thus arises as a physical quantity in $B \to (D^* \to D\pi)\, l\nu$ decays, via the marginal differential rate

$$ \frac{1}{\Gamma}\frac{d\Gamma[B \to (D^* \to D\pi)\, l\nu]}{d\cos\theta_v} = \frac{3}{2}\Big[ F_{L,l}\cos^2\theta_v + \frac{1}{2}\,(1 - F_{L,l})\sin^2\theta_v \Big]\,. \tag{25} $$

The interference terms between amplitudes with different $\lambda$ vanish under integration over $\phi_l - \phi_v$. Similar to $R(D^{(*)})$, theory uncertainties in $|V_{cb}|$ are factored out of $F_{L,l}$. Some recent (and new) SM predictions for $F_{L,\tau}(D^*)$ are provided in Table II, using a variety of theoretical inputs. We also include an SM prediction for $F_{L,\ell}(D^*)$.

Table II. SM predictions for the $D^*$ longitudinal fraction and the $\tau$ polarization in $B \to D^{(*)}$. We also show simple arithmetic averages of the predictions and uncertainties. The CLN-based prediction shown below the line is not included in the arithmetic average.
Columns: Inputs; $F_{L,\tau}(D^*)$; $F_{L,\ell}(D^*)$; $P_\tau(D^*)$; $P_\tau(D)$. Rows: BLPR, $\sim 1/m_c$, LCSR (a); … $\sim 1/m_c$ (b); … (c); arithmetic averages; … (d). (Numerical entries not recoverable.)
(a) (Huang et al., 2018), using the fit of (Jung and Straub, 2019)
(b) (Jaiswal et al., 2020), using Belle 2019 data
(c) Using the fit of (Bernlochner et al., 2017). The correlation between $P_\tau(D^*)$ and $P_\tau(D)$ is $\rho = 0.\ldots$
(d) (Alok et al., 2017)

A similar analysis may be applied to $\tau \to h\nu$ decay amplitudes within $B \to D^{(*)}(\tau \to h\nu)\bar\nu$. For example, in the helicity basis for the $\tau$, the $\tau \to \pi\nu$ amplitudes are the $j = 1/2$ Wigner-$D$ functions $e^{i\phi_h/2}\sin(\theta_h/2)$ or $e^{-i\phi_h/2}\cos(\theta_h/2)$ for $\lambda_\tau = \mp 1/2$, respectively, where the helicity angles $\theta_h$ and $\phi_h$ are defined in Fig. 1. The $\tau$ polarization

$$ P_\tau(D^{(*)}) = \frac{\big(\Gamma_{\lambda_\tau = +} - \Gamma_{\lambda_\tau = -}\big)[B \to D^{(*)}\tau\nu]}{\Gamma[B \to D^{(*)}\tau\nu]}\,, \tag{26} $$

is a physical quantity in $B \to D^{(*)}(\tau \to \pi\nu)\bar\nu$ decays, via the marginal differential rate

$$ \frac{1}{\Gamma}\frac{d\Gamma[B \to D^{(*)}(\tau \to \pi\nu)\bar\nu]}{d\cos\theta_h} = \frac{1}{2}\Big[ 1 + P_\tau(D^{(*)})\cos\theta_h \Big]\,. \tag{27} $$

The interference terms between amplitudes with different $\lambda_\tau$ vanish under integration over $\phi_\tau - \phi_h$. This generalizes to other final states, such as $h = \rho$, $3\pi$, as

$$ \frac{1}{\Gamma}\frac{d\Gamma[B \to D^{(*)}(\tau \to h\nu)\bar\nu]}{d\cos\theta_h} = \frac{1}{2}\Big[ 1 + \alpha_h\, P_\tau(D^{(*)})\cos\theta_h \Big]\,, \tag{28} $$

in which $\alpha_h$ is the analyzing power, which depends on the final state $h$. In particular the pion is a perfect polarizer, $\alpha_\pi = 1$, while $\alpha_\rho = (1 - 2 m_\rho^2/m_\tau^2)/(1 + 2 m_\rho^2/m_\tau^2)$. Just as for $F_{L,\tau}(D^*)$, some recent (and new) SM predictions for $P_\tau(D^{(*)})$ are provided in Table II, using a variety of different theoretical inputs. The missing energy in the $\tau$ decay means that $\theta_h$ is reconstructible only up to 2-fold ambiguities in present experimental frameworks.

E. Excited and other states
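As a sketch of how the angular distributions of the preceding subsection on longitudinal and polarization fractions encode $F_L$ and $P_\tau$, the normalized shapes $\tfrac{3}{2}[F_L\cos^2\theta_v + \tfrac12(1-F_L)\sin^2\theta_v]$ and $\tfrac12[1 + \alpha_h P_\tau\cos\theta_h]$ imply the moments $\langle\cos^2\theta_v\rangle = (1 + 2F_L)/5$ and $\langle\cos\theta_h\rangle = \alpha_h P_\tau/3$. The code below checks these relations numerically. The input values of $F_L$ and $P_\tau$ are hypothetical, the masses are approximate, and the $\rho$ analyzing power is assumed to be $(1 - 2m_\rho^2/m_\tau^2)/(1 + 2m_\rho^2/m_\tau^2)$; this is an illustration, not an experimental extraction procedure.

```python
# Approximate masses in GeV (assumed inputs).
M_TAU = 1.777
M_RHO = 0.775


def alpha_rho(m_rho=M_RHO, m_tau=M_TAU):
    """Assumed analyzing power for tau -> rho nu."""
    x = (m_rho / m_tau) ** 2
    return (1.0 - 2.0 * x) / (1.0 + 2.0 * x)


def cos_theta_v_pdf(c, f_l):
    """Normalized cos(theta_v) distribution for a longitudinal fraction f_l."""
    return 1.5 * (f_l * c * c + 0.5 * (1.0 - f_l) * (1.0 - c * c))


def cos_theta_h_pdf(c, p_tau, alpha_h=1.0):
    """Normalized cos(theta_h) distribution for tau polarization p_tau."""
    return 0.5 * (1.0 + alpha_h * p_tau * c)


def integrate(f, a=-1.0, b=1.0, n=20000):
    """Midpoint rule, adequate for these low-order polynomials."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f_l_from_mean_cos2(mean_cos2):
    """Invert <cos^2 theta_v> = (1 + 2 F_L) / 5."""
    return 0.5 * (5.0 * mean_cos2 - 1.0)


def p_tau_from_mean_cos(mean_cos, alpha_h=1.0):
    """Invert <cos theta_h> = alpha_h P_tau / 3."""
    return 3.0 * mean_cos / alpha_h


if __name__ == "__main__":
    f_l_true, p_true = 0.45, -0.5  # hypothetical values for illustration
    a_rho = alpha_rho()
    m2 = integrate(lambda c: c * c * cos_theta_v_pdf(c, f_l_true))
    m1 = integrate(lambda c: c * cos_theta_h_pdf(c, p_true, a_rho))
    print("alpha_rho:", a_rho)
    print("recovered F_L:", f_l_from_mean_cos2(m2))
    print("recovered P_tau:", p_tau_from_mean_cos(m1, a_rho))
```

The smaller analyzing power of the $\rho$ channel dilutes the slope in $\cos\theta_h$, so for equal statistics the $\pi$ channel constrains $P_\tau$ more tightly.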

So far we have mainly discussed the ground state meson transitions $B \to D^{(*)} l\nu$. However, much of the above discussion can be extended to excited charm states, baryons, charm-strange hadrons, or double heavy hadrons. Several of these processes exhibit fewer HQ symmetry constraints or greater theoretical cleanliness compared to the ground states. This may be exploited to gain higher sensitivity to NP effects or better insight into, or control over, theoretical uncertainties such as $1/m_c$ contributions.

Four orbitally excited charm mesons, collectively labelled as the $D^{**}$, comprise, in spectroscopic notation, the states $D_0^* \sim {}^3P_0$, $D_1' \sim {}^3P_1$, $D_2^* \sim {}^3P_2$ and the $D_1 \sim {}^1P_1$. (The $D_1'$ is also often denoted by $D_1^*$.) In the language of HQ symmetry, the $D_0^*$ and $D_1'$ ($D_1$ and $D_2^*$) furnish a heavy quark doublet whose dynamics is described by the $s_\ell^P = 1/2^+$ ($s_\ell^P = 3/2^+$) HQET. The $1/2^+$ doublet is quite broad, with widths $\sim$ …, while the $3/2^+$ states are an order of magnitude narrower. The $B \to D^{**} l\nu$ decays produce important feed-down backgrounds to $B \to D^{(*)} l\nu$.

Several of the $B \to D^{**}$ form factors vanish at leading order in the heavy-quark limit at zero recoil, so that the higher-order $\mathcal{O}(1/m_{c,b})$ corrections become important, as included in the LLSW parametrization (Leibovich et al., 1997, 1998). This can lead to higher sensitivities to various NP currents, compared to the ground states, such that these decays must be incorporated consistently, especially for LFUV analyses with beyond the SM contributions. The current SM predictions for all four modes, from fits to Belle data including NLO HQET contributions at $\mathcal{O}(\alpha_s, 1/m_{c,b})$, are (Bernlochner and Ligeti, 2017; Bernlochner et al., 2018a)

$$ R(D_0^*) = 0.\ldots\,, \quad R(D_1') = 0.\ldots\,, \quad R(D_1) = 0.\ldots\,, \quad R(D_2^*) = 0.\ldots\,. \tag{29} $$

These are smaller than $R(D^{(*)})$ because of the smaller phase space and reduced $w$ range. An additional useful quantity is the ratio for the sum of the four $D^{**}$ states

$$ R(D^{**}) = \frac{\sum_{X \in D^{**}} \Gamma[B \to X\tau\bar\nu]}{\sum_{X \in D^{**}} \Gamma[B \to X\ell\bar\nu]} = 0.\ldots\,, \tag{30} $$

taking into account correlations in the SM predictions.

An identical discussion proceeds for $B_s \to D_s^{(*,**)} l\nu$ decays, with the light spectator quark replaced by a strange quark. The typical size of flavor $SU(3)$ breaking suggests $\sim 20\%$ corrections compared to the predictions for $B \to D^{(*,**)}$. Lattice studies are available for $B_s \to D_s$ (McLean et al., 2020) beyond zero recoil, with the prediction

$$ R(D_s) = 0.\ldots\,, \tag{31} $$

and there is some evidence of relative insensitivity of the matrix elements to the (light) spectator quark (McLean et al., 2019). A recent analysis for $B_{(s)} \to D^{(*)}_{(s)}$ (Bordone et al., 2020) combines model-dependent QCDSR and LCSR inputs, even though the charm hadron is not ultra-relativistic. This analysis predicts

$$ R(D) = 0.\ldots\,, \quad R(D_s) = 0.\ldots\,, \quad R(D^*) = 0.\ldots\,, \quad R(D_s^*) = 0.\ldots\,, \tag{32} $$

though the resulting $R(D^*)$ prediction is notably in tension with the prior predictions in Table I. At the LHC, or on the $Z$ peak, non-negligible feed-downs to $R(D^*)$ arise from $B_s \to D_{s1}'\tau\nu$ decays, because of their subsequent decay to $D^{(*)}\tau\nu X$, and these must be taken into account. Likewise $B_s \to D_{s2}^{*}\tau\nu$ decays may feed down to $R(D)$: see Sec. IV.C.

The light degrees of freedom in the ground state baryons $\Lambda_{b,c}$ have spin-parity $s_\ell^P = 0^+$, corresponding to the simplest, and therefore most constrained, HQET. In particular, the $\Lambda_b \to \Lambda_c$ form factors receive hadronic corrections to the leading order IW function only at $1/m_{c,b}$. Beyond zero-recoil lattice data are available for both SM and NP form factors (Detmold et al., 2015). Predictions for $\Lambda_b \to \Lambda_c\tau\nu$, however, are at present more precise when LQCD results are combined with data-driven fits for $\Lambda_b \to \Lambda_c\ell\nu$ plus HQET relations. In particular, a data-driven HQET-based form factor parametrization, when combined with the lattice data, provides the currently most precise prediction (Bernlochner et al., 2018b)

$$ R(\Lambda_c) = 0.\ldots\,, \tag{33} $$

as well as the ability to directly extract or constrain the $1/m_c$ corrections. The latter are found to be consistent with HQ symmetry power counting expectations. Similar techniques will be applicable to the two $\Lambda_c^*$ excited states with $s_\ell^P = 1^-$ (Böer et al., 2018; Leibovich and Stewart, 1998), once data are available. At present, predictions for $R(\Lambda_c^*)$ may be derived using a constituent quark model approach (Pervin et al., 2005) similar to ISGW2, yielding $R(\Lambda_c^*(2595)) \simeq 0.16$ and $R(\Lambda_c^*(2625)) \simeq 0.\ldots$

… $B_c \to J/\psi(\to \ell\ell)\, l\nu$ provides an extremely clean signature to test LFUV. The … $B_c$ and $J/\psi$ (or the pseudoscalar $\eta_c$): they cannot be thought of as a single heavy quark dressed by brown muck, nor do we expect approximate conservation of the spin of the heavy quarks in the underlying $b \to c$ transition. Hence an HQET description cannot be used for these modes. A variety of quark model based analyses and predictions have been conducted, with wide-ranging predictions for $R(J/\psi) \sim 0.\ldots$ to $0.4$. A recent model-independent combined analysis for $B_{(s)} \to D^{(*)}_{(s)}$ and $B_c \to J/\psi$ and $\eta_c$, making use of a combination of dispersive bounds, lattice results and HQET where applicable, provided a prediction $R(J/\psi) = 0.\ldots$ (… et al., 2019). A subsequent LQCD result provides the high precision prediction (Harrison et al., 2020b)

$$ R(J/\psi) = 0.\ldots\,. \tag{34} $$

Preliminary lattice results for the $B_c \to \eta_c$ form factors beyond zero recoil are also available (Colquhoun et al., 2016).

F. $b \to u\,l\nu$ processes

The dispersive analysis used in Sec. II.C.1 to parametrize the form factors for $B \to D^{(*)}$ may also be employed for the light hadron $b \to u\,l\nu$ processes. For $B \to \pi l\nu$ in particular, significant simplifications arise because there is only a single possible subthreshold resonance, the $B^*$, for the $f_+$ form factor, and no subthreshold resonance for $f_0$. Combining this with general analyticity properties of the $B \to \pi$ matrix element leads to the BCL parametrization (Bourrely et al., 2009). Expanding in $z = z(q^2, q^2_{\rm opt})$,

$$ f_+(q^2) = \frac{1}{1 - q^2/m_{B^*}^2} \sum_{j=0}^{N} b_j^+ \Big[ z^j - (-1)^{j-N}\,\frac{j}{N}\, z^N \Big]\,, \qquad f_0(q^2) = \sum_{j=0}^{N} b_j^0\, z^j\,, \tag{35} $$

where $N$ is the truncation order. Lattice results beyond zero recoil are available for all $B \to \pi$ form factors (Bailey et al., 2015a,b), which can be incorporated into global fits to available experimental data. The SM prediction is (Bernlochner, 2015)

$$ R(\pi) = 0.\ldots\,. \tag{36} $$

Higher-twist LCSR results are available for the $B \to \rho$ and $B \to \omega$ SM and NP form factors, parametrized by the optimized $z = z(q^2, q^2_{\rm opt})$ expansion (Bharucha et al., 2016). These results may be applied to obtain a correlated, beyond zero recoil fit between the SM and NP form factors and the measured $q^2$ spectra of the corresponding light-lepton modes. The SM predictions from this fit are (Prim et al., 2020)

$$ R(\rho) = 0.\ldots\,, \qquad R(\omega) = 0.\ldots\,. \tag{37} $$

G. Inclusive processes

The inclusive process $B \to X_c l\nu$, where $X_c$ is a single-charm (multi)hadron final state of any invariant mass, admits a different, cleaner theoretical description compared with the above exclusive processes. For instance, in the limit $m_b \to \infty$, the inclusive process is described simply by the underlying $b \to c\,l\nu$ free quark decay, rather than in terms of an unknown Isgur-Wise function.

The square of the inclusive matrix element $|\langle X_c | J | B \rangle|^2$ can be re-expressed in terms of the time-ordered forward matrix element $\langle B |\, T(J^\dagger J)\, | B \rangle$. The latter can be computed via an OPE order-by-order in $1/m_b$ and $\alpha_s$, yielding theoretically clean predictions. State-of-the-art predictions include $1/m_b$ terms (Ligeti and Tackmann, 2014) and two-loop QCD corrections (Biswas and Melnikov, 2010), which may be combined to generate the precision prediction (Freytsis et al., 2015)

$$ R(X_c) = 0.\ldots\,, \tag{38} $$

as well as precision predictions for the dilepton invariant mass and lepton energy distributions. Because the theoretical uncertainties in $B \to X_c l\nu$ are of a different origin to those of the exclusive modes, the measurement of $B \to X_c\tau\nu$ would provide a hadronic-model independent cross-check of lepton flavor universality (see Sec. VI.C).

H. New Physics operators

New Physics (NP) may enter the $b \to c\tau\nu$ processes via a heavy mediator, such that the semileptonic decay is generated by four-Fermi operators of the form

$$ O_{XY} = \frac{c_{XY}}{\Lambda_{\rm eff}^2}\, \big(\bar c\, \Gamma_X\, b\big)\big(\bar\tau\, \Gamma_Y\, \nu_\tau\big)\,, \tag{39} $$

where $\Gamma_{X(Y)}$ is any Dirac matrix with $X$ ($Y$) labeling the chiral structure of the quark (lepton) current, and $c_{XY}$ is a Wilson coefficient defined at scale $\mu \sim m_{c,b}$. The Wilson coefficient is normalized against the SM such that $\Lambda_{\rm eff} = (2\sqrt{2}\, G_F V_{cb})^{-1/2} \simeq 870$ GeV. If we denote by $M$ the characteristic scale of an ultraviolet (UV) completion that matches onto the effective NP operators in Eq. (39), then order 10–20% variations in $R(D^{(*)})$ or other observables from SM predictions typically probe $M \sim \Lambda_{\rm eff}/\sqrt{c_{XY}} \sim$ few TeV. This is tantalizingly in range of direct collider measurements and near the natural scale for UV completions of electroweak dynamics.
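The quoted normalization scale can be checked with a few lines of arithmetic. This is a minimal sketch assuming $G_F \approx 1.166 \times 10^{-5}$ GeV$^{-2}$ and a representative $|V_{cb}| = 0.04$ (both are inputs assumed here, not values stated in this passage); the Wilson coefficient sizes used in the example are hypothetical.

```python
import math

G_F = 1.1663787e-5  # Fermi constant in GeV^-2 (assumed input)


def lambda_eff(v_cb=0.04):
    """Lambda_eff = (2 sqrt(2) G_F V_cb)^(-1/2), in GeV."""
    return (2.0 * math.sqrt(2.0) * G_F * v_cb) ** -0.5


def mediator_scale(c_xy, v_cb=0.04):
    """Rough UV scale M ~ Lambda_eff / sqrt(c_XY), in GeV."""
    return lambda_eff(v_cb) / math.sqrt(c_xy)


if __name__ == "__main__":
    print("Lambda_eff [GeV]:", lambda_eff())
    for c in (0.05, 0.1, 0.2):  # hypothetical Wilson coefficient sizes
        print("c_XY =", c, "-> M [TeV] ~", mediator_scale(c) / 1000.0)
```

For Wilson coefficients of order 0.05 to 0.2 this gives $M$ of roughly 2 to 4 TeV, consistent with the "few TeV" estimate above.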

A common basis choice for $\Gamma_X$ is the set of chiral scalar, vector and tensor currents: $P_{R,L}$, $\gamma^\mu P_{R,L}$, and $\sigma^{\mu\nu} P_{R,L}$, respectively. Assuming only SM left-handed neutrinos, the lepton current is always left-handed, and the tensor current may only be left-handed. It is common to write the five remaining Wilson coefficients as $c_{XY} = c_{SR}$, $c_{SL}$, $c_{VR}$, $c_{VL}$ and $c_T$. We use this notation for the Wilson coefficients hereafter. As for the SM, the NP leptonic amplitude still takes the form $D^j_{m_1,m_2}(\theta_l, \phi_l)$, with $j = 0$ or $1$ and $|m_{1,2}| \le j$, and the structure of the differential decay rate resembles Eq. (9), but with additional dependencies on NP Wilson coefficients, $w$, and $r$.

The (pseudo)scalar and tensor operators run under the Renormalization Group (RG) evolution of QCD, while the vector and axial vector operators correspond to conserved currents and do not (for this reason the normalization of Eq. (3) is well-defined). At one-loop order in the leading-log approximation, the running of $c_{SR,SL,T}$ is dominated by contributions below the top quark mass $m_t$, and only weakly affected by variations in $M \sim \Lambda_{\rm eff}$. Electroweak interactions, however, may induce mixing between $c_{SR,SL,T}$, which can become non-negligible for RG evolution above the weak scale (González-Alonso et al., 2017). RG evolution from $M \simeq \Lambda_{\rm eff} > m_t$ to $\mu \simeq \sqrt{m_c m_b}$ generates at leading-log order

$$ c_{SR,SL}(\mu)/c_{SR,SL}(M) \simeq \ldots\,, \qquad c_T(\mu)/c_T(M) \simeq \ldots\,. \tag{40} $$

These running effects are particularly important in translating the low scale effective field theory (EFT) implications of $b \to c\tau\nu$ measurements to collider measurements at high scales.

I. Connection to other processes

LFUV in $b \to c\,l\nu$ necessarily implies violation in the crossed process $B_c \to l\nu$. The latter decays are extremely theoretically clean: their tauonic versus leptonic LFUV ratios are simply the ratios of chiral suppression and 2-body phase space factors, i.e. $m_\tau^2 (1 - r_\tau^2)^2 / [m_\ell^2 (1 - r_\ell^2)^2]$, in which $r_l = m_l/m_{B_c}$. These ratios are precisely known. In the SM, the branching ratio is

$$ \mathcal{B}[B_c \to l\nu] = \tau_{B_c}\, \frac{G_F^2\, |V_{cb}|^2\, m_{B_c}^3\, f_{B_c}^2}{8\pi}\; r_l^2 \big(1 - r_l^2\big)^2\,, \tag{41} $$

in which the decay constant is $f_{B_c} \simeq 0.\ldots$ (… et al., 2015), and the $B_c$ lifetime, $\tau_{B_c} = 0.\ldots \times 10^{-\ldots}$ s, is well measured (Zyla et al., 2020). In particular, in the SM one predicts $\mathcal{B}[B_c \to \tau\nu] \simeq \ldots \times (|V_{cb}|/\ldots)^2$.

In the presence of NP, the NP Wilson coefficients generate an additional factor

$$ \mathcal{B}[B_c \to \tau\nu] = \mathcal{B}_{\rm SM}\, \bigg| 1 + c_{VL} - c_{VR} + \frac{m_{B_c}^2\, (c_{SR} - c_{SL})}{m_\tau\, (m_b + m_c)} \bigg|^2\,, \tag{42} $$

where $m_{c,b}$ are $\overline{\rm MS}$ quark masses arising from equations of motion. Because the NP pseudoscalar current induces a chiral flip, its chiral suppression is lifted by a factor of $m_{B_c}/m_\tau \sim \ldots$ … $V-A$ current. This leads to large tauonic branching ratio enhancements, which may then be in tension with naive expectations that the $B_c$ hadronic branching ratios $\sim \ldots$ (… et al., 2017; Bardhan and Ghosh, 2019; Li et al., 2016). A corollary is that a future measurement of, or bound on, $\mathcal{B}[B_c \to \tau\nu]$ alone would tightly constrain the NP pseudoscalar contributions.

In the absence of any NP below the electroweak scale, the NP effective operators in Eq. (39) must match onto an electroweak-consistent EFT constructed from SM quark and lepton doublets and singlets under $SU(2)_L \times U(1)_Y$. In particular, because the SM neutrino belongs to an electroweak lepton doublet, $L_L$, electroweak symmetry requires the presence of at least two electroweak doublets in any operator that generates the $b \to c\tau\nu$ decay. (An exception applies if right-handed sterile neutrinos are present.)
In any given NP scenario, this may generate relations between $b \to c\tau\nu$ and other processes, which arise when at least one of the four fermions is replaced by its electroweak partner. For example, various minimal NP models, depending on their flavor structure, may be subject to tight bounds from the rare $b \to s\nu\nu$ or $b \to s\tau\tau$ decays, or bounds on $Z \to \tau\tau$ or $W \to \tau\nu$ branching ratios (Freytsis et al., 2015; Sakaki et al., 2013), or the high-$p_T$ scattering $pp \to \tau\tau$ or $\tau\nu$ (Greljo et al., 2019; Greljo and Marzocca, 2017). Ultraviolet completions with non-trivial flavor structures may further generate relations to charm decay processes, or $b \to s\ell\ell$. The latter is particularly intriguing, because of an indication for light lepton universality violation in the ratios (Aaij et al., 2017b, 2019c)

$$ R_{K^{(*)}} \equiv \frac{\Gamma[B \to K^{(*)}\mu\mu]}{\Gamma[B \to K^{(*)} ee]}\,, \tag{43} $$

at approximately the $2.\ldots\sigma$ level in each mode (see Sec. VI.E). Extensive literature considers possible common origins of LFUV in semitauonic processes with LFUV in these rare decays. See e.g. (Bhattacharya et al., 2015; Buttazzo et al., 2017; Calibbi et al., 2015; Kumar et al., 2019), among many others, for extensive discussions of combined explanations for semileptonic and rare decay LFUV anomalies.

III. EXPERIMENTAL METHODS

A. Production and detection of $b$-hadrons

Since the discovery of the $b$ quark in 1977 (Herb et al., 1977), large samples of $b$-hadrons have been produced at colliders such as CESR, LEP, or Tevatron. However, it was not until the advent of the $B$ factories and the LHC, with their even larger samples and specialized detectors, that the study of third generation LFUV in $B$ mesons became feasible. This is because of the stringent analysis selections that are required to achieve adequate signal purity when reconstructing final states that include multiple unreconstructed neutrinos. The $B$ factories (Bevan et al., 2014), KEKB in Japan and PEP-II in the United States, took data from 1999 to 2010. Their detectors, Belle (Abashian et al., 2002) and BABAR (Aubert et al., 2013), recorded over a billion $B\bar B$ events originating from clean $e^+e^-$ collisions. The LHCb detector (Aaij et al., 2015a; Alves et al., 2008) at the CERN LHC, which started taking data in 2010, has recorded an unprecedented trillion $b\bar b$ pairs as of 2020, which allows it to compensate for the more challenging environment of $pp$ collisions. The recently commissioned Belle II experiment and the LHCb detector, to be upgraded in 2019–21 and 2031, are expected to continue taking data over the next decade and a half, surpassing the current data samples by more than an order of magnitude. In the following, we describe how $b$-hadrons are produced and detected at these facilities. Table III summarizes the number of $b$-hadrons produced and expected at the $B$ factories and at the LHCb experiment.

Table III. Approximate number of $b$-hadrons produced and expected at the $B$ factories (Altmannshofer et al., 2019; Bevan et al., 2014) and at the LHCb experiment (Albrecht et al., 2019), including some of the latest developments (Béjar Alonso et al., 2020). The LHCb numbers take into account an average geometrical acceptance of about 25% for $2 < \eta < 5$. Note that the overall $B$ reconstruction efficiencies are usually quite different between $B$ factories and LHCb (see text). The two values of integrated luminosities and center-of-mass energies shown for Belle and Belle II correspond to data taking at the $\Upsilon(4S)$ and $\Upsilon(5S)$ resonances, respectively. The $B$-factory experiments also recorded data sets at lower center-of-mass energies (below the open beauty threshold) that are not included in this table.

Experiment | BABAR | Belle | Belle II | LHCb Run 1 | LHCb Run 2 | LHCb Runs 3–4 | LHCb Runs 5–6
Completion date | 2008 | 2010 | 2031 | 2012 | 2018 | 2031 | 2041
Center-of-mass energy | 10.58 GeV | 10.58/10.87 GeV | 10.58/10.87 GeV | 7/8 TeV | 13 TeV | 14 TeV | 14 TeV
$b\bar b$ cross section [nb] | 1.05 | 1.05/0.34 | 1.05/0.34 | (3.0/3.4) × … | … | … | …
Integrated luminosity [fb$^{-1}$] | 424 | 711/121 | (40/…) × … | … | … | … | …
$B^0$ mesons [$10^9$] | 0.47 | 0.77 | 40 | 170 | 580 | 4,200 | 32,000
$B^+$ mesons [$10^9$] | 0.47 | 0.77 | 40 | 170 | 580 | 4,200 | 32,000
$B_s$ mesons [$10^9$] | - | 0.01 | 0.5 | 40 | 140 | 1,000 | 7,600
$\Lambda_b$ baryons [$10^9$] | - | - | - | 90 | 300 | 2,200 | 16,000
$B_c$ mesons [$10^9$] | - | - | - | 1.3 | 4.4 | 32 | 240

Other current experiments might also be able to make contributions to semitauonic LFUV measurements in the future.
For instance, the CMS experiment at the LHC recorded in 2018 a large (parked) sample of unbiased $b$-hadron decays, with the primary goal of measuring the $R_{K^{(*)}}$ ratios. This sample could conceivably also be used to measure semitauonic decays if, e.g., the challenges arising from the multiple neutrinos in the final state can be overcome.

1. The $B$ factories

KEKB and PEP-II produced $B$ mesons by colliding electron and positron beams at a center-of-mass energy of 10.579 GeV. At this energy, $e^+$ and $e^-$ annihilation produces $\Upsilon(4S)$ mesons in about 24% of the hadronic collision processes, with the production of $c\bar c$ and other light quark pairs accounting for the remaining 76%. Together with other processes producing pairs of fermions, the latter form the so-called continuum background. The $\Upsilon(4S)$ meson is a $b\bar b$ bound state which, as a result of having a mass only about 20 MeV above the $B\bar B$ production threshold, decays almost exclusively to $B^+B^-$ or $B^0\bar B^0$ pairs. Some limited running away from the $\Upsilon(4S)$ resonance was performed in order to study the continuum background and the properties of the bottomonium resonances $\Upsilon(1S)$ to $\Upsilon(5S)$. The largest data set produced by KEKB was used to study $B_s$ mesons obtained from $\Upsilon(5S)$ decays. However, the resulting $B_s^{(*)}\bar B_s^{(*)}$ data sample was small, about 3% of the total $B\bar B$ sample, as shown in Table III.

On the one hand, compared to hadron colliders, the $b\bar b$ production cross section in lepton colliders such as the $B$ factories is much smaller: even at the (so far) highest instantaneous luminosity of $2.\ldots \times 10^{\ldots}$ cm$^{-2}$ s$^{-1}$ achieved by SuperKEKB in the summer of 2020, $B\bar B$ pairs were produced only at a rate of about 25 Hz. On the other hand, one of the significant advantages of colliding fundamental particles like electrons and positrons is that the initial state is fully known, i.e., nearly 100% of the $e^+e^-$ energy is transferred to the $B\bar B$ pair. This feature can be exploited by tagging techniques (Sec. III.C.1) that reconstruct the full collision event and can determine the momenta of missing particles such as neutrinos, so long … that form the hadronic final state $X$. The other particles in the event also need to be reconstructed to infer the kinematics of the undetected neutrino either from the missing energy and momentum in the event or from the reconstruction of the second $B$ meson.
For this reason, a good hermeticity of the detectors is important.

Figure 5 shows schematics of the Belle and BABAR detectors. Both detectors have a similar overall design. They are laid out in a cylindrical geometry and feature the following subdetector components (from inside to outside):

TABLE II. Operating parameters of the $e^+e^-$ colliders running at the $\Upsilon(4S)$ resonance. For the asymmetric-energy colliders, LER and HER denote the low-energy $e^+$ ring and high-energy $e^-$ ring, respectively.

Parameter | KEKB | PEP-II | CESR
Beam energy (GeV) | LER: 3.5, HER: 8.0 | LER: 3.1, HER: 9.0 | 5.29
Lorentz boost $\beta\gamma$ | … | … | …
… (cm$^{-2}$ s$^{-1}$) | … × … | … × … | … × …

FIG. 5. Side views of (a) the Belle and (b) BABAR detectors. The acronyms used for the subdetector components of Belle are SVD = silicon vertex detector, CDC = central drift chamber, PID = particle identification system, TOF = time-of-flight counter, CsI = CsI crystal calorimeter, KLM = $K_L$/muon system, and EFC = extreme forward calorimeter. From Abashian et al., 2002 and Aubert et al., 2002.

REVIEW RESEARCH

8 J U N E 2 0 1 7 | V O L 5 4 6 | N A T U R E | 2 2 9 the process ϒ → → + − e e S BB (4 ) . These BB pairs can be tagged by the reconstruction of a hadronic or semileptonic decay of one of the two B mesons, referred to as B tag . If this decay is correctly reconstructed, all remaining particles in the event originate from the other B decay, either a signal leptonic or semileptonic B decay or another B decay passing the selection criteria.The BaBar and Belle collaborations have independently developed two sets of algorithms to tag BB events. The hadronic tag algorithms search for the best match between one of more than a thousand possible decay chains and a subset of all detected particles in the event. The efficiency for finding a correctly matched B tag is unfortunately quite small, 0.3%. The benefit of reconstructing all final state particles is that the total energy, E miss , and momentum vector, p miss , of all undetected particles of the other B decay can be inferred from energy and momentum conservation. The invariant mass squared of all undetected particles, = − p m E miss2 miss2 miss2 , is used to distinguish events with one neutrino ≈ m ( 0) miss2 from events with multiple neutrinos or other missing particles > m ( 0) miss2 . The semileptonic tag algorithms exploit the large branching fractions for B decays involving a charm meson, a charged lepton and associated neutrino, ν → ∗ + B D ℓ ℓ ( ) , with ℓ + = e + , µ + . The efficiency for finding these tag decays is about 1%. However, the presence of the neutrino leads to weaker constraints on the B tag and signal B decay.Measurements of τ ν → τ − − B decays are based on leptonic τ decays, τ ν ν → τ − − e e and τ µ ν ν → µ τ − − , and on semileptonic τ decays, τ − → π − ν τ and τ − → π − π ν τ , which together account for 70% of all τ − decays. 
Thus, the signature for signal events is a single charged particle, either a charged lepton, a π − , or a π − accompanied by a π , plus a B tag .The presence of multiple neutrinos precludes the use of kinematic con-straints to effectively suppress backgrounds from other B decays. A vari-able that is sensitive to backgrounds with additional photons or undetected charged particles due to efficiency and acceptance losses is E extra , the sum of the energy deposits in the calorimeter which are not associated with the tag or signal B decay. Figure 3 shows a E extra distribu-tion measured by the Belle collaboration for a subset of events with τ − → π − ν τ . Signal events have low values of E extra , while background events extend to higher values. The signal yield is determined from a fit to the data using signal and background distributions based on data control samples and Monte Carlo simulation. The sum of the fitted signal yields for the four subsamples of purely leptonic and semileptonic τ decays, corrected for the efficiency of the tag and signal B decays, is used to determine the τ ν → τ − − B branching fraction.As shown in Fig. 4, current measurements by the Belle and BaBar collaborations are of limited precision owing to very small signal samples and high backgrounds, and uncertainties in the B tag K – (cid:81) e (cid:83) + (cid:81) (cid:87) e – (cid:81) (cid:87) p – p (cid:83) + (cid:83) + K – (cid:80) – D B a b Figure 2 | Belle and LHCb single-event displays illustrating the reconstruction of semileptonic B -meson decays. Trajectories of charged particles are shown as coloured solid lines; energy deposits in the calorimeters are depicted by red bars. a , The Belle display is an end view perpendicular to the beam axis, with the silicon detector in the centre (small orange circle) and the device measuring the particle velocity (dark purple polygon). 
as the detectors are capable of reliably reconstructing all of the visible particles. The BABAR and Belle detectors managed to cover close to 90% of the total solid angle by placing a series of cylindrical subdetectors around the interaction point and complementing them by endcaps that reconstructed the particles that were ejected almost parallel to the beam pipe. This is sketched in Fig. 2.

Figure 2 Left: Side view of the Belle detector. See (Abashian et al., 2002) for further detail on the subdetectors and their acronyms. The BABAR detector has a similar configuration. Right: view perpendicular to the beam axis. The displayed event is reconstructed as a $\Upsilon(4S)\to B^+B^-$ candidate, with $B^-\to D^0\tau^-\bar\nu_\tau$, $D^0\to K^-\pi^+$ and $\tau^-\to e^-\nu_\tau\bar\nu_e$, and the $B^+$ decaying to five charged particles (white solid lines) and two photons. The directions of undetected neutrinos are indicated as dashed yellow lines. From (Abashian et al., 2002; Ciezarek et al., 2017).

The specific technologies employed in both B-factory detectors have been described in detail in (Bevan et al., 2014). Four or five layers of precision silicon sensors placed close to the interaction point reconstruct the decay vertices of long-lived particles, as well as the first ≈10 cm of the tracks left by charged particles. Forty to fifty layers of low-material drift chambers measure the trajectories and ionization energy loss as a function of distance ($dE/dx$) of charged particles. Time-of-flight and Cherenkov systems provide particle identification (PID) that allows kaon/pion discrimination. Crystal calorimeters measure the electromagnetic showers created by electrons and photons. A solenoid magnet generates the 1.[…] $K_L$ PID.

Between 1998 and 2008–10, the BABAR and Belle detectors recorded a total of 471 and 772 million $B\bar B$ pairs, respectively. These large samples, still being analyzed at the time of this writing, allowed for the first measurement of CP violation in the B system, the observation of B mixing, as well as many other novel results (Bevan et al., 2014). These further included the first observations of $B\to D^{(*)}\tau\nu$ decays (see Sec. IV), which in turn began the study of third generation LFUV: the focus of this review. The success of the B factories has led to the upgrade of the accelerator facilities at KEKB, so-called SuperKEKB (Akai et al., 2018), such that it will be capable of delivering instantaneous luminosities 30 times higher than before. The upgraded Belle detector, Belle II (Abe, 2010), started taking data in 2018 with the aim of recording a total of over 40 billion $B\bar B$ pairs. The LFUV prospects for Belle II are discussed in Sec. VII.A.2.
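As a rough cross-check of the $B\bar B$ production rate of about 25 Hz quoted earlier in this subsection, the event rate is the product of instantaneous luminosity and production cross section, $R = \mathcal{L}\,\sigma$. The sketch below assumes an $e^+e^-\to\Upsilon(4S)$ cross section of about 1.1 nb and an instantaneous luminosity of $2.4\times10^{34}$ cm$^{-2}$ s$^{-1}$; both inputs, and the function name, are illustrative assumptions rather than values taken from this review.

```python
NB_TO_CM2 = 1e-33  # 1 nb = 1e-33 cm^2


def bb_pair_rate_hz(luminosity_cm2_s, cross_section_nb):
    """Event rate R = L * sigma, returned in Hz."""
    return luminosity_cm2_s * cross_section_nb * NB_TO_CM2


# Assumed illustrative inputs (not taken from the text):
LUMI = 2.4e34        # instantaneous luminosity in cm^-2 s^-1
SIGMA_Y4S_NB = 1.1   # approximate e+e- -> Upsilon(4S) cross section in nb

rate = bb_pair_rate_hz(LUMI, SIGMA_Y4S_NB)
print(f"BB pair rate ~ {rate:.0f} Hz")
```

Under these assumptions the rate comes out at a few tens of hertz, the same order as the figure quoted in the text.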

2. The LHCb experiment

At hadron colliders such as the LHC, b quarks are predominantly pair-produced in pp collisions via the gluon fusion process $gg\to b\bar b$ plus subleading quark fusion contributions, with an approximate production cross-section $\sigma(b\bar b)\sim$ […] $\mu$b at $\sqrt{s}=13$ TeV, scaling approximately linearly in $\sqrt{s}$ (Aaij et al., 2017a). Electroweak production cross-sections for single or pairs of b quarks via Drell-Yan processes, Higgs or top quark decays are five or more orders of magnitude smaller, with the largest such cross-section $\sigma(Z\to b\bar b)\sim10$ nb. As a result, b quarks are effectively always accompanied in LHC collisions by a companion $\bar b$ quark. This feature is extremely important for unbiased trigger strategies and for CP violation studies, as it can be exploited to establish the beauty content at the production vertex of neutral B mesons.

At leading order, the hadronization of a b quark at the LHC is quite similar to the one observed in detail by the LEP experiments. For instance, the momentum distribution of the non-b-hadron fragments, which is relevant for same-side tagging studies, is well described by LEP-inspired Monte Carlo simulations (Sjöstrand et al., 2015). More important is the relative production of the various b-hadron species: the main features (dominant production of $B^0$ and $B^+$ mesons, and sizeable production fraction of $B_s$ and $\Lambda_b$) are the same, except that a much larger $\Lambda_b$ production fraction is observed for $p_T$ (momentum transverse to the beam axis) below 10 GeV (Aaij et al., 2019a). LHCb can also study the decays of $B_c$ mesons, in spite of their very low production rate, approximately 0.6% of the $B^+$ production cross-section (Aaij et al., 2015b). As discussed in Secs. II.E and II.I, $B_c$ mesons provide a very interesting laboratory for testing LFUV in $B_c\to J/\psi\,\tau\nu$ or $B_c\to\tau\nu$ decays.

Figure 3 Left: Side view of the LHCb detector. See (Aaij et al., 2015a; Augusto Alves Jr et al., 2008) for further details on the subdetectors and their acronyms. Right: side view of an event display for a $\bar B^0\to D^{*+}\tau^-\bar\nu_\tau$ decay. The area around the interaction point is enlarged in the inset at the top. The trajectory of the B meson is indicated with a dotted orange line, and the trajectories of the particles from the subsequent $D^{*+}\to D^0\pi^+$, $D^0\to K^-\pi^+$, and $\tau^-\to\mu^-\bar\nu_\mu\nu_\tau$ decays are illustrated with thick colored lines. Adapted from (Aaij et al., 2015a; Ciezarek et al., 2017).

The parton center-of-mass energy required to produce a b-hadron pair at threshold is far smaller than the total available collision energy in the pp system, leading to the production of a significant fraction of $b\bar b$ pairs with very large forward or backward boosts. This characteristic is the basis of the LHCb experimental concept, which studies the $b\bar b$ pairs produced within a 400 mrad cone covering the forward region, corresponding to a pseudorapidity $2\le\eta\le5$. Despite this very small solid angle, the LHCb detector captures ∼15% of the full $b\bar b$ cross-section. Within this acceptance, the b-hadrons have a typical transverse momentum, $p_T$, of 10 GeV, corresponding to an overall energy of ∼200 GeV. This in turn corresponds to a typical boost factor of about 50, resulting in a mean flight distance of over 2 cm for each electroweakly-decaying ground-state b-hadron: namely $B^{0,+}$, $B_s$, $B_c$, or $\Lambda_b$. The sophisticated silicon trackers used in the LHCb detector provide a typical position resolution of 300 μm for the B vertex along its flight direction, resulting in flight distance significances between the b-hadron decay vertex and its primary vertex (PV) of over 100σ. This precision leads to extremely clean signals even for high-multiplicity decay channels, provided the primary production vertex can be identified.

The LHCb luminosity is kept low enough so that the mean number of primary vertices per event until 2018 was between 1 and 2. This number is expected to rise to about 5 after the 2019–21 upgrade and possibly 50 after the 2031 upgrade. The longitudinal size of the LHCb luminous region is 20 cm, so that with only a handful of pp interactions in a given event, the primary vertex misreconstruction is kept to a very low level. The ATLAS and CMS experiments typically accumulate 50 primary vertices in a given event (rising to 200 after 2027) and therefore face a very different challenge. Nevertheless, they are capable of cleanly reconstructing low-multiplicity b-hadron decays thanks to their large coverage and high-granularity subdetectors. It should be stressed, however, that for semitauonic b-hadron decays the goal is not just to isolate a decay vertex from a primary vertex, but rather to identify a chain of vertices comprising the PV, the b-hadron decay, and, in the case of hadronic-τ measurements, the τ decay. At the LHC, this is currently only feasible at LHCb.

As in the B factories, PID capabilities are critical to properly identify b-hadron decays. For instance, at a hadron collider, misidentifying a pion as a kaon could lead to confusing a $B_s$ meson for a B meson, and identifying a pion as a proton can lead to a $\Lambda_b$ baryon impersonating a B meson. PID information is provided by the two ring-imaging Cherenkov (RICH) detectors shown in Fig. 3 left.

Table III lists the known production rates for all ground-state b-hadron species, at both LHCb and B factories. While the geometrical acceptance is included for […] B factories. These requirements limit the useful yield to about 0.1% or less of the available sample. As an example, LHCb can reconstruct 5 × […] events per fb$^{-1}$ for modes having a total branching fraction into stable hadrons of ∼ […].

Another feature of LHCb physics is the large production rate of excited b-hadron states: $B^{**}$, $B_s^{**}$, $\Lambda_b^{**}$ can be studied in detail, as well as baryons containing both b and s quarks, such as $\Xi_b$, $\Omega_b$, and their excited states. These can be useful to study semitauonic decays because, as described in Sec. III.C.3, the decay $B_s^{**}\to BK$ can provide access to kinematic variables in the B center-of-mass frame via B tagging.

B. Particle reconstruction

Ground-state b-hadrons, i.e. hadrons decaying only through flavor-changing electroweak currents, have lifetimes of the order of one picosecond. Thus, they decay fast enough that they must all be reconstructed from their more stable decay products. At the same time, they live and fly long enough so that their decay vertices can be separated from the vertex of the primary collision ($e^+e^-$ in the case of the B factories and pp in the case of LHCb). The reconstruction of these stable decay products proceeds in a similar fashion for the B factories and the LHCb experiment, with some key differences.
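The separation between production and decay vertices can be made quantitative with the mean flight distance $L=\beta\gamma\,c\tau$. The sketch below evaluates it for an assumed representative lifetime of 1.5 ps (consistent with "of the order of one picosecond"), using the boost of about 50 quoted for LHCb and an assumed $\beta\gamma\approx0.5$ as typical of the asymmetric-energy B factories; the function name, the lifetime value, and the B-factory boost are illustrative assumptions.

```python
C_M_PER_S = 299_792_458.0  # speed of light in m/s


def mean_flight_distance_m(beta_gamma, lifetime_s):
    """Mean lab-frame flight distance L = beta*gamma * c * tau, in meters."""
    return beta_gamma * C_M_PER_S * lifetime_s


TAU_B = 1.5e-12  # assumed representative b-hadron lifetime in seconds

lhcb = mean_flight_distance_m(50.0, TAU_B)  # boost of about 50 (LHCb)
bfac = mean_flight_distance_m(0.5, TAU_B)   # assumed B-factory boost

print(f"LHCb: {lhcb * 100:.1f} cm, B factory: {bfac * 1e6:.0f} um")
```

With these inputs the LHCb value comes out slightly above 2 cm, in line with the mean flight distance quoted for LHCb, while the B-factory value is a few hundred microns.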

1. Charged particle reconstruction

The trajectories of charged particles ("tracks") are reconstructed based on the energy deposits left in the trackers ("hits"). The momenta of these particles are determined based on the bending of these trajectories induced by the magnetic fields in each detector. As shown in Figs. 2 and 3, charged particles follow helical trajectories in the B factories due to their solenoidal magnetic fields, while in LHCb the particles are simply deflected by the dipole magnet. In either case, charged track reconstruction proceeds with efficiencies of over 95% (for $p>300$ MeV at the B factories (Bevan et al., 2014) and $p>$ […] et al., 2015a)) and the momentum determination is achieved with a typical resolution of 0.[…]

[…] b-hadron secondary vertices is of primary importance to distinguish signal from background decays, especially in LHCb. In the B factories (Bevan et al., 2014), the decay vertices of the short-lived B and D mesons were reconstructed with a resolution of 60–[…] μm when they decayed inside the vertex trackers (about 80% of the time), and 100–[…] μm when decaying outside. LHCb reconstructs the impact parameter of the tracks, that is, their distance to the primary vertex in the plane transverse to the beam line, with an impressive resolution of 45 μm for $p_T=1$ GeV, and down to 15 μm for very high momentum tracks. As discussed in Sec. III.A.2, the vertex resolution along the beam line is of the order of 250 μm which, given the large boost of most particles at LHCb, is sufficient to suppress prompt background processes by multiple orders of magnitude (Sec. IV.C.2).

For both the B factories and LHCb, charged leptons have generically clean signatures that can be differentiated from other types of particles with high efficiency. Electrons are reconstructed from tracks that match a cluster in the electromagnetic calorimeter with the appropriate shape and energy; muons are generally identified as tracks that leave hits in the outer muon detectors, with some additional inputs from the other subdetectors. However, the performance of the two kinds of experiments diverges substantially in the details.

At the B factories, both electrons and muons are reconstructed with efficiencies over 90% and with low misidentification rates, though the performance is generally better for electrons (see Fig. 4). For instance, a typical 2 GeV electron is reconstructed with 96% efficiency and 0.3% pion misidentification probability, whereas a 2 GeV muon would have 92% efficiency and 2.5% pion misidentification probability. In contrast, at LHCb the electron reconstruction is much more challenging because of the lower granularity of the electromagnetic calorimeter and the larger amount of material before it, compared to the B factories. A 20 GeV electron is reconstructed with about 90% efficiency for a misidentification rate of 2.[…] B factories. The right panels of Figs. 4 and 5 show the separation achieved for several species of charged hadrons in some of the Cherenkov detectors for BABAR and LHCb, respectively.
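The hadron separation provided by Cherenkov detectors can be illustrated with the standard relation $\cos\theta_C = 1/(n\beta)$, where $\beta = p/\sqrt{p^2+m^2}$. The sketch below assumes a refractive index $n=1.473$ (fused silica, as in a DIRC-type detector) and rounded particle masses; these inputs and the function name are assumptions for illustration only.

```python
import math

# Rounded particle masses in GeV (assumed illustrative inputs).
MASSES_GEV = {"pi": 0.13957, "K": 0.49368, "p": 0.93827}


def cherenkov_angle_mrad(p_gev, mass_gev, n=1.473):
    """Cherenkov angle in mrad from cos(theta) = 1/(n*beta).

    Returns None when the particle is below the Cherenkov threshold.
    The default n = 1.473 is an assumed value for fused silica.
    """
    beta = p_gev / math.hypot(p_gev, mass_gev)
    cos_theta = 1.0 / (n * beta)
    if cos_theta > 1.0:
        return None
    return 1e3 * math.acos(cos_theta)


for name, mass in MASSES_GEV.items():
    print(name, cherenkov_angle_mrad(2.0, mass))
```

At fixed momentum the lighter species radiate at larger angles, so pions, kaons, and protons populate distinct bands in the angle-versus-momentum plane until the angles saturate at high momentum.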

2. Neutral particle reconstruction

Another key difference between B factories and LHCb lies in the ability to efficiently reconstruct neutral particles: primarily photons in the case of LFUV measurements. The low material in front of the B-factory calorimeters, as well as their good resolution and granularities, allows them to fully reconstruct final states that contain $\pi^0$ mesons decaying to two photons (present, for instance, via the copious $D^0\to K^-\pi^+\pi^0$ decay) as well as photons, such as those coming from $D^*\to D\gamma$ decays. At LHCb, the granularity and detector material challenges discussed above, as well as the high number of b-hadrons, have led its LFUV measurements to (so far) avoid the reconstruction of final states with $\pi^0$ mesons or photons.

Figure 4 Examples of particle reconstruction performance for the BABAR detector; the performance for the Belle detector is similar. Left: electron reconstruction efficiency. Middle: muon reconstruction efficiency. Right: Cherenkov angle measurement for different particle species at BABAR's Detector of internally reflected Cherenkov light (DIRC). Adapted from (Aubert et al., 2013; Franco Sevilla, 2012).

Figure 5 Examples of particle reconstruction performance for the LHCb detector. Left: electron reconstruction efficiency. Middle: muon reconstruction efficiency. Right: Cherenkov angle measurement for different particle species in the C$_4$F$_{10}$ […] et al., 2015a).

C. Kinematic reconstruction: The b-hadron momentum

One of the major challenges in the reconstruction of semitauonic $H_b\to H_c\tau\nu$ decays is the determination of the parent b-hadron momentum. This momentum is necessary to measure important kinematic variables such as the momentum transfer $q^2 = (p_{H_b}-p_{H_c})^2 \equiv (p_\tau+p_\nu)^2$, which is not directly accessible because of the undetected neutrinos in the final state. In measurements involving the $\tau\to\ell\nu\nu$ decay, the momentum of the parent b-hadron is further employed to reconstruct other invariants, such as the invariant mass of the unreconstructed particles

$$m^2_{\rm miss} = \left(p_{H_b}-p_{H_c}-p_\ell\right)^2, \qquad (44)$$

or the energy of the charged lepton in the $H_b$ rest frame,

$$E^*_\ell = (p_\ell\cdot p_{H_b})/m_{H_b}. \qquad (45)$$

In these leptonic-τ measurements, the signal and normalization modes ($H_b\to H_c\tau\nu$ and $H_b\to H_c\ell\nu$, respectively) are reconstructed in the same exact final state, differing only in the number of undetected neutrinos.
Since normalization events only have one neutrino, their reconstructed $m^2_{\rm miss}$ distribution is sharply peaked at zero, in contrast to the broad $m^2_{\rm miss}$ distribution of signal events. Additionally, charged leptons in the signal events are generated in the secondary τ decay and thus have a lower maximum $E^*_\ell$ than those arising from normalization $H_b \to H_c\ell\nu$ decays.

In Sec. III.C.1 we describe how the B factories take advantage of their precisely known $e^+e^-$ beam energies to determine the momentum of the signal B in a $B\bar{B}$ event by reconstructing the accompanying tag B. This procedure is not available in the busier hadronic environment of pp collisions. Instead, LHCb employs the untagged methods detailed in Secs. III.C.2 and III.C.3. These methods have much higher efficiency than B tagging, but at the cost of significantly worse $p_{H_b}$ resolution.

Table IV Reconstruction efficiencies of some of the B tagging algorithms employed by the B factories. FEI stands for “Full event interpretation”, FR for “Full reconstruction”, and SER for “Semi-exclusive reconstruction”. The numbers are extracted from (Keck et al., 2019; Lees et al., 2013). Digits lost in extraction are shown as “…”.

B tagging      Experiment   Algorithm           B±        B0
Hadronic       Belle II     FEI                 0.76%     0.…%
               Belle        FEI (FR channels)   0.53%     0.…%
               Belle        FR                  0.28%     0.…%
               BABAR        SER                 0.4%      0.…%
Semileptonic   Belle II     FEI                 ….80%     2.…%
               Belle        FR                  ….31%     0.…%
               BABAR        SER                 ….3%      0.…%

1. B tagging at the B factories

As described in Sec. III.A.1, the B factories produce B mesons via $e^+e^- \to \Upsilon(4S) \to B\bar{B}$ decays. Since the momenta of the colliding electron-positron beams are known with high precision, the complete reconstruction of one of the two B mesons (the tag B or $B_{\rm tag}$) can be used to fully determine the momentum of the other B meson (the signal B or $B_{\rm sig}$), simply via $p_{B_{\rm sig}} = p_{e^+e^-} - p_{B_{\rm tag}}$. This “tagging” has been implemented by the B factories (Bevan et al., 2014) in the following ways:

• Hadronic B tagging: the $B_{\rm tag}$ is fully reconstructed in final states that contain a charm hadron plus a number of pions and kaons. The full reconstruction of the decay results in the best possible $p_{B_{\rm sig}}$ resolution (an 11% resolution on $q^2$, as shown in Fig. 6) at the price of a lower efficiency of 0.2–0.8% (Table IV).

• Semileptonic B tagging: the $B_{\rm tag}$ is reconstructed in its $B_{\rm tag} \to D^{(*)}\ell\nu$ decays. This leads to efficiencies as high as 2% thanks to the large values of the semileptonic branching fractions. The presence of an unreconstructed neutrino, however, results in a poor resolution of $p_{B_{\rm sig}}$. To mitigate this effect, analyses employing this technique exploit the full reconstruction of the collision event and require that no unassigned charged or neutral particles be present. They further avoid the direct use of $p_{B_{\rm sig}}$.

• Inclusive B tagging: no attempt is made to explicitly reconstruct the B decay chain. Instead, a specific $B_{\rm sig}$ candidate is first reconstructed. The tag side is then reconstructed using all remaining charged and neutral particles. This leads to a high efficiency, but also poor resolution of the tag-side momentum.
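The tagging relation $p_{B_{\rm sig}} = p_{e^+e^-} - p_{B_{\rm tag}}$ and the invariants of Eqs. (44) and (45) can be illustrated with a few lines of four-vector arithmetic. The following is a minimal sketch rather than any experiment's analysis code: four-vectors are plain tuples (E, px, py, pz) in GeV with metric (+, −, −, −), and all function names are hypothetical choices made for this illustration.

```python
import math


def on_shell(mass, px, py, pz):
    """Build a four-vector (E, px, py, pz) for a particle of the given mass."""
    return (math.sqrt(mass * mass + px * px + py * py + pz * pz), px, py, pz)


def minus(a, b):
    """Component-wise difference of two four-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def signal_b_momentum(p_beams, p_tag):
    """Tagging relation: p_Bsig = p_(e+e-) - p_Btag."""
    return minus(p_beams, p_tag)


def q2(p_sig, p_hc):
    """Momentum transfer squared, q^2 = (p_Hb - p_Hc)^2."""
    d = minus(p_sig, p_hc)
    return dot(d, d)


def m2_miss(p_sig, p_hc, p_lep):
    """Missing mass squared, Eq. (44): (p_Hb - p_Hc - p_lep)^2."""
    d = minus(minus(p_sig, p_hc), p_lep)
    return dot(d, d)


def e_lep_star(p_sig, p_lep):
    """Lepton energy in the Hb rest frame, Eq. (45): (p_lep . p_Hb) / m_Hb.

    The parent mass is taken from the invariant mass of p_sig itself.
    """
    m_parent = math.sqrt(dot(p_sig, p_sig))
    return dot(p_lep, p_sig) / m_parent
```

For a decay with a single massless neutrino, `m2_miss` returns zero up to rounding, which is the sharp peak of the normalization mode described above; any additional neutrinos push it to positive values.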

Figure 6 Resolution on the $q^2$ reconstruction in simulated $B \to D^*\tau\nu$ decays for the different methods of estimating the $p_{B_{\rm sig}}$ momentum. The values in parentheses correspond to the RMS of each distribution. The various curves are extracted from (Aaij et al., 2015c, 2018b; Lees et al., 2013).

Table IV summarizes the performance of the most efficient algorithms employed by BABAR, Belle, and Belle II. The Belle II numbers are based on simulations.

The hadronic B tagging algorithm of BABAR is based on the reconstruction of a charmed seed-state of a $B \to H_c X$ cascade. Here $H_c$ can either be a charmed meson or a $J/\psi$ particle and X is a number of charged and neutral pions or a single kaon. Combinations of seed mesons with different X constituents are selected based on the purity obtained from simulated samples.

Belle uses a similar Ansatz, but relies on multivariate methods (either neural networks or boosted decision trees) to distinguish correctly reconstructed from wrongly reconstructed tag candidates in a staged approach. Figure 7 illustrates this procedure for the Full Event Interpretation (FEI) algorithm described in (Keck et al., 2019). This algorithm reconstructs one of the B mesons produced in the collision event using either hadronic or semileptonic decay channels. Instead of attempting to reconstruct as many B meson decay cascades as possible, the FEI algorithm employs a hierarchical reconstruction Ansatz in several stages: at the initial stage, boosted decision trees are trained to identify charged tracks and neutral energy depositions as detector-stable particles ($e^+$, $\mu^+$, $K^+$, $\pi^+$, $K^0_L$, $\gamma$). At the following stages, these candidate particles are combined into composite particles ($\pi^0$, $K^0_S$) and later into heavier meson candidates ($J/\psi$, $D^0$, $D^+$, $D_s$). For each target final state, a boosted decision tree is trained to identify probable candidates. The input features are the classifier outputs of the previous stages, vertex fit probabilities, and the four-momenta. Candidates for $D^{*0}$, $D^{*+}$, and $D^*_s$ mesons are formed similarly. At the final stage, all the information of the previous stages is combined to assess the viability of a $B_{\rm tag}$ candidate. The FR algorithm uses a very similar approach with neural networks. A more detailed description can be found in (Feindt et al., 2011). The performance of the FEI algorithm on early Belle II data is discussed in (Abudinén et al., 2020).

In the future, deep learning or graph-based network approaches might allow further increases in the reconstruction efficiency of algorithms like the FEI at Belle II (Boeckh, 2020; Keck, 2017).

Figure 7 Schematic illustration of the FEI algorithm. From (Keck et al., 2019).

2. $\tau \to \pi^-\pi^+\pi^-\nu$ vertex reconstruction at LHCb

At the LHC, the energies of the partons whose collisions produce the $b\bar{b}$ pairs are not known, so B tagging cannot be employed. However, by taking advantage of the excellent vertexing capabilities of LHCb, in the case that the τ lepton decays to at least three charged particles, the momentum of the parent b-hadron in $H_b \to H_{c,u}\tau\nu$ events can still be precisely determined up to a discrete ambiguity. This procedure was established in 2018 by the hadronic-τ measurement of ${\cal R}(D^{*+})$ with $\tau \to \pi^+\pi^+\pi^-\nu$ (Aaij et al., 2018b). (The channel $\tau \to \pi^-\pi^+\pi^-\nu$ always includes contributions from the $\tau \to \pi^-\pi^+\pi^-\pi^0(\pi^0)\nu$ channels, unless specified otherwise.)

In general, about 100 tracks arise from a primary vertex (PV) within a pp collision at LHCb, such that the location of this vertex can be measured to an excellent precision of around 10 µm. In $B \to D^{*+}\tau^-\nu_\tau$ events with the $D^{*+}$ meson decaying promptly via the $D^{*+} \to D^0\pi^+$ strong decay, the $D^0$ vertex can be reconstructed as the intersection of its kaon and pion daughters with a 150 µm precision along the z direction (see Fig. 8 top). The vertex for the $\tau \to \pi^-\pi^+\pi^-\nu$ decay can be measured to a 200 µm precision. Due to the very small angle between the directions of the bachelor pion produced in the $D^{*+}$ decay and the reconstructed $D^0$, their intersection has poor precision and is not used in the determination of the position of the B vertex. Instead, this position is estimated with a ∼… precision from the $D^{*+}$ and τ trajectories, where the τ line of flight is approximated by the $\pi^-\pi^+\pi^-$ direction. Thanks to the large boost of b-hadrons at LHCb, $\beta\gamma \sim 50$, these three vertices are well separated and determine the directions of flight of the B meson and τ lepton momenta (the unit vectors $\hat{p}_B$ and $\hat{p}_\tau$, respectively) with fairly good precision.

Figure 8 Reconstructed topologies for the $B \to D^*\tau\nu$ decays in the hadronic-τ (top) and muonic-τ (bottom) measurements of ${\cal R}(D^{*+})$ at LHCb (Aaij et al., 2015c, 2018b). The filled circles correspond to the reconstructed vertices, and solid lines to reconstructed particles. “PV” refers to the primary vertex, $\Delta z$ to the distance in the z-direction between the B (or $D^{*+}$) and the $\tau^-$ vertices, and α to the angle between the beam axis and the momentum of the B meson.

With $\hat{p}_\tau$ known and the $\pi^-\pi^+\pi^-$ hadronic state fully reconstructed, the τ energy can be determined up to a two-fold ambiguity, arising from the solution of the quadratic relation $(p_\tau - p_{\pi\pi\pi})^2 = 0$. This result, when further combined with $\hat{p}_B$ and the full reconstruction of the $D^{*+}$, in turn allows the determination of the B momentum up to a four-fold ambiguity from the quadratic $(p_B - p_{D^*} - p_\tau)^2 = 0$. The resulting overall $q^2$ resolution is around 19%.
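The two-fold ambiguity can be made explicit by solving the massless-neutrino constraint $(p_\tau - p_{\pi\pi\pi})^2 = 0$ for $|p_\tau|$. Writing $\theta$ for the angle between the $3\pi$ momentum and $\hat{p}_\tau$, squaring the constraint gives $|p_\tau| = \left[(m_\tau^2 + m_{3\pi}^2)\,|p_{3\pi}|\cos\theta \pm E_{3\pi}\sqrt{(m_\tau^2 - m_{3\pi}^2)^2 - 4 m_\tau^2 |p_{3\pi}|^2 \sin^2\theta}\,\right] / \left[2\,(E_{3\pi}^2 - |p_{3\pi}|^2\cos^2\theta)\right]$. The sketch below implements this closed form. It is an illustration under stated assumptions, not the LHCb implementation: the function name and the tuple layout (E, px, py, pz) are hypothetical, and the τ mass value is quoted from memory.

```python
import math

M_TAU = 1.77686  # GeV; tau mass, quoted from memory (assumption)


def tau_momentum_solutions(p3pi, tau_dir, m_tau=M_TAU):
    """Solve (p_tau - p_3pi)^2 = 0 for |p_tau| along a known flight direction.

    p3pi    : (E, px, py, pz) of the reconstructed three-pion system, in GeV
    tau_dir : direction of flight of the tau (need not be normalised)
    Returns a list of the positive roots of the quadratic (zero, one or two).
    """
    e3, px, py, pz = p3pi
    norm = math.sqrt(sum(c * c for c in tau_dir))
    ux, uy, uz = (c / norm for c in tau_dir)

    # Components of the 3pi momentum parallel and transverse to the tau line of flight
    p_par = px * ux + py * uy + pz * uz
    p_sq = px * px + py * py + pz * pz
    p_perp_sq = max(p_sq - p_par * p_par, 0.0)
    m3_sq = e3 * e3 - p_sq

    # Discriminant of the quadratic in |p_tau|; negative means no real solution,
    # which can happen once detector resolution smears the measured direction
    disc = (m_tau**2 - m3_sq) ** 2 - 4.0 * m_tau**2 * p_perp_sq
    if disc < 0.0:
        return []

    a = (m_tau**2 + m3_sq) * p_par
    b = e3 * math.sqrt(disc)
    den = 2.0 * (e3 * e3 - p_par * p_par)
    return [p for p in ((a - b) / den, (a + b) / den) if p > 0.0]
```

Each root fixes a τ four-momentum, and repeating the same procedure for $(p_B - p_{D^*} - p_\tau)^2 = 0$ along $\hat{p}_B$ doubles the number of candidates, which is the origin of the four-fold ambiguity quoted above.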

3. Rest frame approximation with $\tau \to \mu\nu\nu$ at LHCb

It is not possible to reconstruct the τ vertex when the τ lepton is identified by its 1-prong $\tau \to \mu\nu\nu$ decay (Fig. 8 bottom). Thus, semitauonic measurements at LHCb that make use of this decay mode estimate the momentum of the b-hadron via the rest frame approximation (RFA) instead. This procedure assumes that the proper velocity of the $H_b$ hadron along the z-axis (the beam axis) is the same as for the reconstructed charm-muon system, $\mu H_c$. This leads to the relationship $(p_{H_b})_z / m_{H_b} = (p_{\mu H_c})_z / m_{\mu H_c}$. Since the direction of the b-hadron can be determined by the displacement of the $H_b$ decay vertex from the primary vertex, the $H_b$ momentum can then be estimated via

$|p_{H_b}| = \frac{m_{H_b}}{m_{\mu H_c}} \, (p_{\mu H_c})_z \, \sqrt{1 + \tan^2\alpha}$, (46)

where α is the angle between the $H_b$ direction of flight and the z axis, as shown in Fig. 8.

In the highly boosted regime of LHCb, the RFA is a fairly good approximation that leads to an adequate overall $q^2$ resolution of about 22% (see Fig. 6), albeit with a long tail on the positive side and some bias. It is worth noting that this resolution is highly $q^2$-dependent, varying between 34% for $q^2 <$ … to 7% at $q^2 >$ ….

In general, semitauonic measurements at LHCb that make use of the hadronic-τ reconstruction (Aaij et al., 2018b) will have better precision for the reconstruction of kinematic distributions than muonic-τ measurements (Aaij et al., 2015c). In contrast, the latter may have a better ultimate precision in the determination of the ratios ${\cal R}(H_c)$ because they do not depend on external branching fractions in the normalization of the signal $H_b \to H_c\tau\nu$ decays, such as those used in Eq. (53) below.

In the future, LHCb may be able to improve the precision on the b-hadron momentum reconstruction by taking advantage of the large samples of b-hadrons that will be collected over the next decade and a half.
For instance, the reconstruction of $B^+$ mesons arising from $B^{*0}_{s2} \to B^+K^-$ decays allows for a higher-precision determination of the $B^+$ kinematics by constraining the invariant mass of the $B^+K^-$ system to the known $B^{*0}_{s2}$ mass, but it comes at the price of a reconstruction efficiency below 1%. This technique has already been successfully employed to reconstruct $B^- \to D^{(*,**)0}\mu^-\bar{\nu}_\mu$ decays (Aaij et al., 2019b), and could be applied to semitauonic decays in the future as well.

IV. EXPERIMENTAL TESTS OF LEPTON FLAVOR UNIVERSALITY

The decay $B \to D^*\tau\nu$ was first observed in 2007 by the Belle collaboration (Matyja et al., 2007), and subsequent measurements by BABAR (Aubert et al., 2008) and Belle (Adachi et al., 2009; Bozek et al., 2010) found evidence for $B \to D\tau\nu$ decays as well. These measurements all saw values of ${\cal R}(D^{(*)})$ that exceeded the SM expectations, but the significance of these excesses was low due to the large uncertainties involved in these early results: above 20% for ${\cal R}(D^*)$ and over 30% for ${\cal R}(D)$. All of these measurements have now been superseded, so they will not be further discussed in this review.

The first evidence for an excess of $B \to D^{(*)}\tau\nu$ decays was reported by BABAR in 2012 (Lees et al., 2012), a measurement that also included the first observation of $B \to D\tau\nu$ decays. Similar excesses have been reported since by the Belle (Caria et al., 2020; Hirose et al., 2018; Huschle et al., 2015; Sato et al., 2016) and LHCb experiments (Aaij et al., 2015c, 2018b), and the first measurements of the polarization of some of the decay products have been reported by Belle (Abdesselam et al., 2019; Sato et al., 2016) as well. The persistent nature of these anomalies has spurred wide interest in semitauonic decays and, as a result, other channels that proceed via $b \to u\tau\nu$ or different $b \to c\tau\nu$ transitions are being studied. Two such results have been published so far: Belle’s search for $B \to \pi\tau\nu$ decays (Hamer et al., 2016) and LHCb’s measurement of ${\cal R}(J/\psi)$ (Aaij et al., 2018a).

Table V Summary of the different results covered by this review, classified by the measured observable and the deployed method. The references for each experiment are given at the bottom of the table; the relevant sections of this review are listed alongside each result. Digits and entries lost in extraction are shown as “…”.

Obs.              Method         Result           Ref.   Section
R(D)              Hadronic tag   0.…              Ba12   IV.A.1
R(D)              Hadronic tag   0.…              B15a   IV.A.1
R(D)              Semilep. tag   ….…              B20    IV.B.1
R(D*)             Hadronic tag   0.…              Ba12   IV.A.1
R(D*)             Hadronic tag   0.…              B15a   IV.A.1
R(D*)             Hadronic tag   0.…(+28)(−…)     …      IV.A.1
R(D*)             Semilep. tag   ….…              B16b   IV.B.1
R(D*)             Semilep. tag   ….…              B20    IV.B.1
R(D*)             Untagged       ….…              L15    IV.C.1
R(D*)             Untagged       ….…              L18b   IV.C.2
P_τ(D*)           Hadronic tag   −….…(21)(16)     B17    IV.D.1
F_{L,τ}(D*)       …              0.…              B19    IV.D.2
R(J/ψ)            Untagged       0.…              L18a   IV.C.3
R(π)              Hadronic tag   1.…              B16a   IV.A.2

Ba12: BABAR (Lees et al., 2012, 2013), with ρ = −….
B15a: Belle (Huschle et al., 2015), with ρ = −….
B16a: Belle (Hamer et al., 2016), when combined with world-averaged Br($B \to \pi\ell\nu$).
L15: LHCb (Aaij et al., 2015c).
B16b: Belle (Sato et al., 2016).
B17: Belle (Hirose et al., 2017, 2018), with single-prong τ hadronic decays.
L18a: LHCb (Aaij et al., 2018a).
L18b: LHCb (Aaij et al., 2018b), with $\tau \to \pi^+\pi^+\pi^-\nu$, updated taking into account the latest HFLAV average of ${\cal B}(B \to D^{*+}\ell\nu)$ = 5.… ± … ± ….
B19: Belle (Abdesselam et al., 2019), using inclusive tagging.
B20: Belle (Caria et al., 2020), with ρ = −….

In this section we describe the key features of all of these measurements regarding their event selection, background determination, main uncertainties, and signal extraction. The following subsections group the various results according to their b-hadron tagging method which, as we saw in Sec. III.C, can be employed to determine the momentum of the parent b-hadron and has a substantial […] ${\cal R}(D^{(*)})$ results and comparisons of all the observables with their respective SM predictions.

There exist, in addition, measurements of the inclusive $B \to X_c\tau\nu$ rate that we will not cover in this review. These comprise LEP measurements of $b \to X\tau\nu$ (Abbiendi et al., 2001; Abreu et al., 2000; Acciarri et al., 1994, 1996; Barate et al., 2001), which require assumptions about the cancellation of hadronization effects in order to be interpreted as $B \to X\tau\nu$ measurements, and a recent result (Hasenbusch, 2018) that is unpublished.

A. B-factory measurements with hadronic tags

This section describes some of the most recent semitauonic results involving hadronic B tags: the measurements of $B \to D^{(*)}\tau\nu$ decays by BABAR (Lees et al., 2012, 2013) and Belle (Huschle et al., 2015) in Sec. IV.A.1, as well as a 2015 search for $B \to \pi\tau\nu$ decays by Belle (Hamer et al., 2016) in Sec. IV.A.2. An additional measurement of $B \to D^{(*)}\tau\nu$ decays by Belle involving hadronic tags focused on the polarization of the τ lepton (Hirose et al., 2017, 2018), and is described in Sec. IV.D.

1. ${\cal R}(D^{(*)})$ with $\tau \to \ell\nu\nu$

The BABAR experiment published the first high-precision measurement of ${\cal R}(D^{(*)})$ based on their full dataset of $471 \times 10^6$ $B\bar{B}$ pairs in 2012 (Lees et al., 2012, 2013). The Belle experiment followed in 2015 with an analysis of their $772 \times 10^6$ $B\bar{B}$ pair dataset (Huschle et al., 2015), employing a similar strategy. In both cases, signal $B \to D^{(*)}\tau\nu$ and normalization $B \to D^{(*)}\ell\nu$ decays are selected by the same particles in the final state: a D or $D^*$ meson, and a charged light lepton $\ell = e$ or $\mu$. In the case of signal events, the light lepton ℓ comes from the secondary $\tau \to \ell\nu\nu$ decay, which leads to two additional neutrinos in the final state and a typically lower lepton momentum. The D mesons are reconstructed from combinations of $K^+$, $K^0_S$, $\pi^+$, and $\pi^0$ mesons with invariant masses close to the nominal $D^0$ and $D^+$ masses, covering 25–35% of the total D branching fractions. The heavier $D^*$ mesons are identified by the $D^{*+} \to D^0\pi^+, D^+\pi^0$ and $D^{*0} \to D^0\pi^0, D^0\gamma$ decays.

In order to separate signal from normalization decays as well as to reduce background contributions, the event is also required to have a fully reconstructed hadronic $B_{\rm tag}$ and no additional tracks; see Sec. III.C.1. As described there, the reconstruction efficiency of the $B_{\rm tag}$ is only ≈ … […] B, which in turn is used to calculate the momentum transfer $q^2 = (p_{B_{\rm sig}} - p_{D^{(*)}})^2$ and the missing momentum of the unreconstructed neutrinos, $p_{\rm miss} = p_{B_{\rm sig}} - p_{D^{(*)}} - p_\ell = p_{e^+e^-} - p_{B_{\rm tag}} - p_{D^{(*)}} - p_\ell$.
The invariant missing mass $m^2_{\rm miss} = p^2_{\rm miss}$ peaks at zero for the one-neutrino normalization events, but has a broad distribution at positive values for signal events with three neutrinos in the final state.

A key variable to further reduce background contributions is $E_{\rm ECL}$: the sum of the energy deposits in the calorimeter which are not associated with the tag or signal B decays. Events involving signal and normalization decays have all their visible final state particles reconstructed, but background decays to $D^{**}$ mesons (among others) can enter the signal selection when their daughter $\pi^0$ mesons or photons are unassigned. Both BABAR and Belle feed $E_{\rm ECL}$ to multivariate classifiers that are trained to reject these background contributions. In the case of BABAR, the output of the classifier, a boosted decision tree, is required to have a minimum value for the event to be selected. As we describe below, Belle fits the output distributions of the classifier (from a neural network) directly. Finally, only events with $q^2 >$ … are selected, a requirement that takes advantage of the momentum transfer of signal events being kinematically constrained to lie above $m^2_\tau = 3.16$ GeV$^2$.

The number of signal, normalization, and background events in each of the $D^0\ell$, $D^+\ell$, $D^{*0}\ell$, and $D^{*+}\ell$ data samples is determined by maximum likelihood fits to the observed data distributions. The ratios of yields for the isospin-related contributions (e.g., $D^0\ell$ versus $D^+\ell$ or $D^{*0}\ell$ versus $D^{*+}\ell$) are constrained by the known branching fractions and simulated relative efficiencies. BABAR employs an additional fit without these constraints that checks the consistency with the expected percent-level degree of isospin breaking. The probability distribution functions (PDFs) that describe each of the contributions are taken from Monte Carlo simulations that make use of the CLN form factor parametrization (Sec. II.C.2) for the signal and normalization modes, LLSW (Leibovich et al., 1997) for $B \to D^{**}l\nu$ decays, and other (phase-space based) models augmented with corrections from data control samples for the rest of the background contributions. (As a reminder, throughout this review l stands for e, µ, or τ, and ℓ for e or µ.) Additional assumptions on the $D^{**}$ branching fractions are described in Sec. V.C.2.

The BABAR analysis employs a two-dimensional fit to the $m^2_{\rm miss}$ and the charged lepton energy in the B


Belle (e)(f) D + ℓD *+ ℓ E v e n t s / ( . ) The simultaneous fit over all four data samples hastwelve free parameters: the lepton normalization yield persample, the lepton cross-feed yield per D l − sample, the D !! background yield per sample, and the branching-fraction ratios R ð D Þ and R ð D ! Þ . Here, we assume isospinsymmetry and use the same R ð D Þ and R ð D ! Þ parametersfor the ¯ B and B − samples. VII. CROSS-CHECKS

The implementation of the fit procedure is tested byapplying the same procedure to multiple subsets of theavailable simulated data. The fit accuracies are evaluatedusing sets of 500 pseudoexperiments and show no signifi-cant bias in any measured quantity. These are used also totest the influence on the fit result of the value of M miss ¼ . GeV =c that is used to partition the samples:variation of this value reduces the precision of the fit resultbut does not introduce any bias.Further tests address the compatibility of the simulatedand recorded data. To test resolution modelling, we use asample of events with q < . GeV =c , dominated by ¯ B → D ð!Þ l − ¯ ν l decays. As the D !! background is one of themost important components — with a large potential for flaws in its modeling — we evaluate its distributions in moredepth by reconstructing a data sample with enriched ¯ B → D !! l − ¯ ν l content by requiring a signal-like event but withan additional π . The background-enriched data samplesare fit individually in four dimensions separately: M miss , M miss ; no π , E ECL , and p ! l , where M miss ; no π is the missingmass of the candidate, calculated without the additional π .The shapes of the components are extracted from simulateddata. In each of the four D ð!Þ l − π samples, consistentyields are obtained from the fits to all four variables,indicating that the simulation describes faithfully thedistribution in all tested dimensions. VIII. RESULTS

The fit to the entire data sample gives R ð D Þ ¼ . % . ð Þ R ð D ! Þ ¼ . % . ; ð Þ corresponding to a yield of 320 ¯ B → D τ − ¯ ν τ and 503 ¯ B → D ! τ − ¯ ν τ events; the errors are statistical. Projections of thefit are shown in Figs. 1 and 2. The high- M miss distributions ) /c (GeV M1 2 3 4 5 6 7 8 E v en t s ντ D* → B ντ D → B ν D*l → B ν Dl → Bother BG ν D**l → B ) /c (GeV M1 2 3 4 5 6 7 8 E v en t s /c (GeV M1 2 3 4 5 6 7 8 E v en t s /c (GeV M1 2 3 4 5 6 7 8 E v en t s FIG. 3 (color online). Projections of the fit results and data points with statistical uncertainties for the high M miss region. Top left: D þ l − ; top right: D !þ l − ; bottom left: D l − ; bottom right: D ! l − .MEASUREMENT OF THE BRANCHING RATIO OF … PHYSICAL REVIEW D o NB ≡ log o NB − o min o max − o NB ; ð Þ where the parameters o min and o max are the minimum andmaximum network output values, respectively, in theelected data sample. The o NB distributions have smoothershapes and can be described well with bifurcated Gaussianfunctions, which makes their parameterizations morerobust.For each fit component within a selected data sample,two PDFs are determined: in M miss for M miss < . GeV =c and in o NB for M miss > . GeV =c .The PDFs of M miss are represented by smoothed histogramsand are constructed by applying a smoothing algorithm[30] to the respective MC distributions. Each bifurcated-Gaussian PDF in o NB is parameterized by the mean, leftwidth and right width, which are determined by anunbinned maximum likelihood fit to the MC distribution.In the fit, each component has a total yield, defined inTable I, with partial yields in the lower- and upper- M miss regions that are fixed MC-determined fractions of thetotal yield. We maximize the extended likelihood function L ¼ Y i ! 
Q ð N i ; K i Þ Y K i k i ¼ P i ð x k i Þ " ; ð Þ where i ∈ f D þ l − ; D l − ; D %þ l − ; D % l − g is the data-sample index, Q ð N i ; K i Þ is the Poisson probability toobserve K i events for an expectation value of N i ¼ P j Y i;j events (with Y i;j being the yield of component j in data sample i ), and the vector x k i holds the values for M miss and o NB of candidate k i . The PDF P i of data sample i is given by P i ð M miss ; o NB Þ ¼ N i · X j Y i;j ½ f i;j; low P i;j; low ð M miss Þþ ð − f i;j; low Þ P i;j; high ð o NB Þ’ : ð Þ The index j runs over the components and f i;j; low is thefraction of events of the component j that are in the lower M miss range. The one-dimensional probability densityfunction P i;j; low ( P i;j; high ) represents the M miss ( o NB ) dis-tribution in the low- (high-) M miss region. ) /c (GeV M0.2 − E v en t s ντ D* → B ν D*l → Bother BG ν D**l → B ' NB o8 − − − − E v en t s /c (GeV M0.2 − E v en t s NB o8 − − − − E v en t s FIG. 2 (color online). Fit projections and data points with statistical uncertainties in the D %þ l − (top) and D % l − (bottom) data samples.Left: M miss distribution for M miss < . GeV =c ; right: o NB distribution for M miss > . GeV =c .M. HUSCHLE et al. PHYSICAL REVIEW D

Belle (g)(h) D ℓD *0 ℓ E v e n t s / ( . ) (b) (d) IV.B - Hadronic tag fits (c)

BaBar BaBar

Figure 9 Projections of the signal fits for the BABAR (Lees et al., 2012) and Belle (Huschle et al., 2015) measurements of $\mathcal{R}(D^{(*)})$ with hadronic tagging. (a-b) Full $m^2_{\rm miss}$ projections of the BABAR fit showing the normalization components for the $D\ell$ and $D^*\ell$ samples (combination of $D^{(*)0}\ell$ and $D^{(*)+}\ell$). (c-d) $m^2_{\rm miss}$ projections of the BABAR fit focusing on the signal contributions at high $m^2_{\rm miss}$. (e-h) Full projections of the fit to the neural network output $o'_{\rm NB}$ by Belle in the region $m^2_{\rm miss} > 0.85$ GeV$^2$ for the four $D^{(*)}\ell$ samples.

Table VI Comparison of the total yields extracted by the isospin-constrained fits from BABAR (Lees et al., 2012) and Belle (Huschle, 2015). The "$\epsilon$ ratio" column corresponds to the ratio of the Belle to the BABAR fitted yields normalized by the datasets, 471 million $B\bar{B}$ pairs for BABAR and 772 million for Belle.

Sample       Contribution          BABAR    Belle    $\epsilon$ ratio
$D\ell$      $B \to D\tau\nu$      489      320      0.40
             $B \to D\ell\nu$
             $B \to D^{**}l\nu$    506      239      0.29
             Other bkg.            1033     2005     1.18
$D^*\ell$    $B \to D^*\tau\nu$    888      503      0.35
             $B \to D^*\ell\nu$
             $B \to D^{**}l\nu$    261      153      0.36
             Other bkg.            404      2477     3.74

rest frame, $E^*_\ell$, while Belle fits the $m^2_{\rm miss}$ distribution for $m^2_{\rm miss} < 0.85$ GeV$^2$ and the output of the classifier at high $m^2_{\rm miss}$. Figure 9 shows some of the relevant projections for both fits. The narrow peaks in Fig. 9(a-b), including that of the feed-down $B \to D^*\ell\nu$ decays reconstructed in the $D\ell$ sample with a broader $m^2_{\rm miss}$ distribution, illustrate the power of hadronic tagging in discriminating signal from normalization decays. Table VI shows a comparison of their fitted yields. Although the Belle dataset is 64% larger, the signal yields are about 40% smaller due to the lower reconstruction efficiency. The differences in the background yields are primarily due to BABAR placing a requirement on the multivariate classifier and Belle fitting its output instead.

The most challenging background contribution arises from $B \to D^{**}\ell\nu$ and $B \to D^{**}\tau\nu$ decays. The $B \to D^{**}\ell\nu$ processes are estimated in control samples with the same selection as the signal samples, except for the addition of a $\pi^0$ meson. In these control samples, decays of the form $B \to D^{(*)}\pi^0\ell^-\nu_\ell$ have values of $m^2_{\rm miss}$ close to zero, so that their yields are easily determined with fits to this variable. This fit is performed simultaneously with the fits to the signal samples, and the $B \to D^{**}l\nu$ contribution to both is linked by the ratio of expected yields taken from the simulation. Additional backgrounds from continuum and combinatorial $B$ processes are estimated from data control samples, and are fixed in the fits.

Table VII summarizes all the sources of uncertainty in the measured $\mathcal{R}(D^{(*)})$ ratios by both analyses. The largest uncertainties come from the $B \to D^{**}l\nu$ contributions and the limited size of the simulated samples ("MC stats"). The latter uncertainty affects primarily the PDFs describing the kinematic distributions of all the components in the fit.
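As a quick cross-check of Table VI, the $\epsilon$ ratio can be recomputed from the quoted yields and dataset sizes. The sketch below is only illustrative: the dictionary layout and the function name `efficiency_ratio` are assumptions introduced here, while the numbers are those listed in Table VI and its caption.

```python
# Dataset sizes quoted in the caption of Table VI (number of BB pairs).
N_BB_BABAR = 471e6
N_BB_BELLE = 772e6

# (sample, contribution) -> (BABAR yield, Belle yield), as listed in Table VI.
TABLE_VI = {
    ("Dl", "B -> D tau nu"): (489, 320),
    ("Dl", "B -> D** l nu"): (506, 239),
    ("Dl", "Other bkg."): (1033, 2005),
    ("D*l", "B -> D* tau nu"): (888, 503),
    ("D*l", "B -> D** l nu"): (261, 153),
    ("D*l", "Other bkg."): (404, 2477),
}


def efficiency_ratio(babar_yield, belle_yield):
    """Belle-to-BABAR yield ratio, each yield normalized by its dataset size."""
    return (belle_yield / N_BB_BELLE) / (babar_yield / N_BB_BABAR)


# Round to two decimals to compare with the "epsilon ratio" column.
ratios = {
    key: round(efficiency_ratio(babar, belle), 2)
    for key, (babar, belle) in TABLE_VI.items()
}

for (sample, contribution), value in ratios.items():
    print(f"{sample:4s} {contribution:16s} {value:.2f}")
```

The same normalization factor, 772/471, is the origin of the statement that the Belle dataset is 64% larger.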
The branching fraction ratios are calculated as

$$\mathcal{R}(D^{(*)}) = \frac{N_{\rm sig}}{N_{\rm norm}} \, \frac{\epsilon_{\rm norm}}{\epsilon_{\rm sig}}, \tag{47}$$

where $N_{\rm sig}$ and $N_{\rm norm}$ are the number of signal and normalization events determined by the fit, respectively, and $\epsilon_{\rm sig}/\epsilon_{\rm norm}$ is the ratio of efficiencies taken from simulation. Since the signal and normalization decays are reconstructed with the same particles in the final state, many uncertainties cancel in the ratio, leading to a relatively small 2–3% overall uncertainty on this quantity.

Table VIII shows the results from the BABAR and Belle analyses, which are compatible within uncertainties. The isospin-unconstrained results from BABAR (Table XIX in Sec. VI.A) show good agreement with the expected percent-level degree of isospin breaking. The total uncertainty on $\mathcal{R}(D^{(*)})$ in these measurements is dominated by


Figure 10 Checks on the kinematic distributions for events in the signal-enhanced high $m^2_{\rm miss}$ region (the $m^2_{\rm miss}$ threshold differs between panels (a-d) and (e-h)). The solid histograms correspond to the simulation scaled to the fit results. Adapted from (Huschle et al., 2015; Lees et al., 2013).

Table VII Summary of the relative uncertainties for the BABAR (Lees et al., 2012) and Belle (Huschle et al., 2015) measurements of $\mathcal{R}(D^{(*)})$ with hadronic tagging.

                                          Uncertainty [%]
Result                 Contribution          BABAR            Belle            Ratio
                                             Sys.    Stat.    Sys.    Stat.
$\mathcal{R}(D)$       $B \to D^{**}l\nu$
                       $B \to Dl\nu$
                       Total systematic      9.6              7.1              0.74
                       Total statistical             13.1             17.1     1.31
                       Total                     16.2             18.5         1.14
$\mathcal{R}(D^*)$     $B \to D^{**}l\nu$
                       $B \to D^*l\nu$
                       Total systematic      5.6              5.2              0.93
                       Total statistical             7.1              13.0     1.83
                       Total                     9.0              14.0         1.56
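The total uncertainties in Table VII are consistent with adding the total systematic and total statistical components in quadrature. The following sketch checks this under the assumption that the two components are uncorrelated; the names `TABLE_VII_TOTALS` and `quadrature_sum` are illustrative and are not taken from either analysis.

```python
import math

# (result, experiment) -> (total systematic %, total statistical %, quoted total %),
# copied from the "Total" rows of Table VII.
TABLE_VII_TOTALS = {
    ("R(D)", "BABAR"): (9.6, 13.1, 16.2),
    ("R(D)", "Belle"): (7.1, 17.1, 18.5),
    ("R(D*)", "BABAR"): (5.6, 7.1, 9.0),
    ("R(D*)", "Belle"): (5.2, 13.0, 14.0),
}


def quadrature_sum(*components):
    """Combine uncorrelated uncertainty components in quadrature."""
    return math.sqrt(sum(c * c for c in components))


# Recompute each total from its systematic and statistical parts.
combined = {
    key: quadrature_sum(syst, stat)
    for key, (syst, stat, _quoted) in TABLE_VII_TOTALS.items()
}

for key, (syst, stat, quoted) in TABLE_VII_TOTALS.items():
    print(f"{key[0]:6s} {key[1]:6s} computed {combined[key]:5.1f}%  quoted {quoted:5.1f}%")
```

Each recomputed total agrees with the quoted one to within the rounding of the table entries.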

Table VIII Results of the BABAR (Lees et al., 2012) and Belle (Huschle et al., 2015) measurements of $\mathcal{R}(D^{(*)})$ with hadronic tagging. The first uncertainty is statistical and the second systematic.

Result                BABAR                        Belle
$\mathcal{R}(D)$      0.440 ± 0.058 ± 0.042        0.375 ± 0.064 ± 0.026
$\mathcal{R}(D^*)$    0.332 ± 0.024 ± 0.018        0.293 ± 0.038 ± 0.015

the statistical uncertainty, so the much larger data samples expected to be collected by Belle II should improve these results significantly.

Thorough checks of the stability of these results were performed, including separate fits to the muon and electron samples, to the various running periods, and to samples with modified selection requirements that vary the signal over background ratio, $S/B$, from 1.27 to 0.27. In all cases, results were compatible with the nominal result. Additionally, a number of kinematic distributions of signal-enriched samples were compared with the fitted SM signal plus background model, and good agreement was found overall. Figure 10 shows the distributions for the energy substituted mass $m_{\rm ES} = \sqrt{E^2_{\rm beam} - \mathbf{p}^2_{\rm tag}}$, which peaks at the $B$ mass for correctly reconstructed events, and $E_{\rm ECL}$. In both cases, the distributions are consistent with the fitted signal events coming from $B$ mesons with no additional unreconstructed particles in the event.

Finally, Fig. 11 shows the measured efficiency-corrected $q^2$ distributions for $B \to D^{(*)}\tau\nu$ decays, which are in good agreement with the SM expectations. The measured distributions are also compared in panels (e-f) with the expectations from the Type-II two-Higgs doublet model (2HDM) with $\tan\beta/m_{H^\pm} = 0.45$ GeV$^{-1}$, which proceeds primarily via a scalar mediator. The BABAR analysis recalculates the signal PDFs, reweighting the light lepton momentum to approximately account for the changes in helicity, for each value of $\tan\beta/m_{H^\pm}$ and fits the data again, so the data points in Fig. 11(c-d) are somewhat different from those in panels (e-f) due to the slightly different background and signal cross-feed subtraction. Including systematic uncertainties, this benchmark model is excluded at greater than 95% confidence level.
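For reference, the double ratio of Eq. (47) can be written as a small helper. This is a minimal sketch: the function name and the input values are hypothetical, chosen only to show how the fitted yields and the simulated efficiencies enter, and they are not results from BABAR or Belle.

```python
def branching_fraction_ratio(n_sig, n_norm, eff_sig, eff_norm):
    """R(D(*)) = (N_sig / N_norm) * (eps_norm / eps_sig), following Eq. (47)."""
    if n_norm <= 0 or eff_sig <= 0:
        raise ValueError("normalization yield and signal efficiency must be positive")
    return (n_sig / n_norm) * (eff_norm / eff_sig)


# Hypothetical inputs for illustration only: 500 signal and 3000 normalization
# events, with a signal efficiency half as large as the normalization efficiency.
example_ratio = branching_fraction_ratio(
    n_sig=500.0, n_norm=3000.0, eff_sig=0.5e-4, eff_norm=1.0e-4
)
print(f"R = {example_ratio:.3f}")
```

Only the efficiency ratio enters the result, which is why uncertainties common to signal and normalization decays largely cancel.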

As the signal yields are not extracted from fits toindividual q bins, the data distribution depends slightly onthe signal model; the signal model can affect the back-ground yields in the fit to uncorrected data, which are thensubtracted. A χ test shows that both hypotheses arecompatible with our data with p -values for the SMdistribution of 64% ( D τ − ¯ ν τ ) and 11% ( D ! τ − ¯ ν τ ), and forthe NP distribution of 53% ( D τ − ¯ ν τ ) and 49% ( D ! τ − ¯ ν τ ). XI. CONCLUSION

ACKNOWLEDGMENTS

We thank the KEKB group for the excellent operation ofthe accelerator; the KEK cryogenics group for the efficientoperation of the solenoid; and the KEK computer group, theNational Institute of Informatics, and the PNNL/EMSLcomputing group for valuable computing and SINET4 net-worksupport. We acknowledgesupport from the MinistryofEducation, Culture, Sports, Science, and Technology(MEXT) of Japan, the Japan Society for the Promotion ofScience(JSPS),andtheTau-LeptonPhysicsResearchCenterof Nagoya University; the Australian Research Counciland the Australian Department of Industry, Innovation,Science and Research; Austrian Science Fund underGrants No. P 22742-N16 and No. P 26794-N20; the ) /c (GeV q4 5 6 7 8 9 10 11 12 E v en t s ( a r b i t r a r y un i t s ) /c (GeV q4 5 6 7 8 9 10 11 12 E v en t s ( a r b i t r a r y un i t s ) /c (GeV q4 5 6 7 8 9 10 11 12 E v en t s ( a r b i t r a r y un i t s ) − /c (GeV q4 5 6 7 8 9 10 11 12 E v en t s ( a r b i t r a r y un i t s ) − FIG. 8 (color online). Background-subtracted q distributions of the τ signal in the region of M miss > . GeV =c . The distributionsare efficiency corrected and normalized to the fitted yield. The error bars show the statistical uncertainties. The histogram is therespective expected distribution from signal MC. Left: Standard Model result, right: Type-II 2HDM result withtan β =m H þ ¼ . c = GeV, top: ¯ B → D τ − ¯ ν τ , bottom: ¯ B → D ! τ − ¯ ν τ MEASUREMENT OF THE BRANCHING RATIO OF … PHYSICAL REVIEW D spectrum of ! B ! D ! ! " ! " ! decays is largely independentof tan =m H .The measured q spectra agree with the SM expecta-tions within the statistical uncertainties. For ! B ! D ! " ! " ! decays, there might be a small shift to lower values,which is indicated by the increase in the p value for tan =m H ¼ :

30 GeV " . As we showed in Sec. II B,the average q for tan =m H ¼ :

30 GeV " . Right: tan =m H ¼ :

30 GeV " tan =m H ¼ :

45 GeV " et al. PHYSICAL REVIEW D W e i gh t e d e v e n t s / ( . G e V ) W e i gh t e d e v e n t s / ( . G e V ) W e i gh t e d e v e n t s / ( . G e V ) DℓD * ℓ Belle − − − − − − − − ) E v e n t s / ( . G e V ] [GeV miss2 m Data ντ D → B ντ * D → B ν Dl → B ν l * D → B ν ) τ (l/ ** D → B Bkg. B A B AR (a)(b) (c)(d) (e)(f) Dℓ − − − − − − − − ) E v e n t s / ( . G e V ] [GeV miss2 m Data ντ D → B ντ * D → B ν Dl → B ν l * D → B ν ) τ (l/ ** D → B Bkg. B A B AR Dℓ BaBar BaBar q [GeV] q [GeV] q [GeV] SM SM 2HDM

Figure 11 Efficiency corrected $q^2$ distributions for $B \to D\tau\nu$ (top) and $B \to D^{*}\tau\nu$ (bottom) events with $m^2_{\rm miss} > 0.85$ GeV$^2$ (a-b) and $m^2_{\rm miss} > 1.5$ GeV$^2$ (c-f). The shaded distributions correspond to the SM expectations in (a-d) and a Type-II 2HDM with $\tan\beta/m_{H^\pm} = 0.45$ GeV$^{-1}$ in (e-f). The $\chi^2$ values are calculated based on the statistical uncertainties only. Adapted from (Huschle et al., 2015; Lees et al., 2013).
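The kind of $\chi^2$ comparison quoted in the panels of Fig. 11 can be illustrated with a short sketch. It assumes a simple per-bin $\chi^2$ built from statistical uncertainties only, with the number of degrees of freedom supplied by the caller. The function names and the toy bin contents are hypothetical and are not taken from either analysis.

```python
import math


def chi2_statistic(observed, expected, sigma):
    """Sum of squared residuals, each normalised by its statistical uncertainty."""
    return sum(((o - e) / s) ** 2 for o, e, s in zip(observed, expected, sigma))


def chi2_sf(x, ndf):
    """Survival function P(chi2 >= x) for ndf degrees of freedom.

    Uses the power series of the regularised lower incomplete gamma function,
    which is adequate for the moderate x and ndf typical of binned spectra.
    """
    if x <= 0:
        return 1.0
    a, z = ndf / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of sum_n z^n / (a (a+1) ... (a+n))
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


if __name__ == "__main__":
    # Toy q^2 spectrum (hypothetical numbers): data, expectation, stat. uncertainty.
    data = [12.0, 30.0, 41.0, 38.0, 22.0, 9.0]
    model = [10.0, 28.0, 45.0, 36.0, 25.0, 8.0]
    stat = [3.5, 5.5, 6.4, 6.2, 4.7, 3.0]
    chi2 = chi2_statistic(data, model, stat)
    ndf = len(data)  # assumption: no fitted parameters are subtracted
    print(f"chi2/ndf = {chi2:.2f}/{ndf}, p = {chi2_sf(chi2, ndf):.3f}")
```

Because only statistical uncertainties enter, a p-value obtained this way is a conservative measure of agreement: including systematic variations can only raise it.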

2. Search for $B \to \pi\tau\nu$ decays

Charmless semitauonic decays offer an interesting, independent probe of LFUV to complement the excesses observed in various $\mathcal{R}(D^{(*)})$ measurements. Although they involve different four-Fermi operators, and are CKM suppressed, they also offer access to third generation semileptonic decays in an experimental setting with very different background composition. The most promising candidate for a first observation is the $B \to \pi\tau\nu$ channel. Further, even modest precision could already strongly constrain new physics models involving scalar mediators such as the Type-II 2HDM (Bernlochner, 2015).

A first limit on the branching fraction of this decay was obtained by Belle in 2015 (Hamer et al., 2016), which followed a similar strategy to that employed by Belle's hadronic tag measurement of $\mathcal{R}(D^{(*)})$. For the $B \to \pi\tau\nu$ analysis, $B_{\rm tag}$ mesons are selected only when the best candidate is compatible with the decay of a neutral $B$ meson. In order to boost the reconstructed number of $B \to \pi\tau\nu$ signal decays, both electronic $\tau \to e\nu\nu$ as well as hadronic one-prong $\tau \to \pi\nu$ and $\tau \to \rho\nu$ decays were included in the reconstruction. The signal side is thus required to have at most two oppositely charged tracks, with one of those tracks having a particle identification compatible with an electron in the case of $\tau \to e\nu\nu$ decays. For the $\rho^+ \to \pi^+\pi^0$ reconstruction, neutral pion candidates, which are not used in the tag-reconstruction, are constructed from neutral energy depositions in the calorimeter. If multiple $\rho$ candidates exist, the one with a mass closest to the nominal $\rho$ mass is kept. In order to reduce background from $B \to X_c\ell\nu$ decays, events with $K_L$ candidates are vetoed. Such candidates are identified as a cluster in the outer $K_L$ and muon detector with no energy depositions in the electromagnetic calorimeter near the flight path of the $K_L$ candidate.

With all particles assigned to either the tag or signal side, $E_{\rm ECL}$ can be reconstructed from the remaining neutral clusters in the collision event. To further reduce backgrounds, three boosted decision trees are trained: one for each probed $\tau$ decay mode. The input variables are:
• The four-momenta of all signal particles.
• $q^2$ as calculated from the tag-side $B$ meson four-momentum and the signal-side pion with the highest momentum; for signal decays $q^2 \geq m_\tau^2$, whereas for backgrounds lower values are possible.
• $m^2_{\rm miss}$; for signal decays we expect a higher missing mass because of the additional neutrinos in the final state.

Requirements on the classifier outputs are chosen to select signal events such that each channel has an optimal statistical sensitivity. The resulting number of signal events is then extracted via a simultaneous fit of the respective $E_{\rm ECL}$ distributions. The post-fit distributions are shown in Fig. 12. The measurement quotes an upper limit of $\mathcal{B}(B \to \pi\tau\nu) < 2.5 \times 10^{-4}$ at 90% CL. This can be converted to a value of

$$\mathcal{R}(\pi) = 1.05 \pm 0.51, \tag{48}$$

which can be compared to the SM expectation of $\mathcal{R}(\pi)_{\rm SM} = 0.641 \pm 0.016$ (Bernlochner, 2015).

Table IX shows an overview of the systematic uncertainties of the result. The largest systematic uncertainties stem from the tagging calibration, as the measurement was not carried out as a ratio with respect to the light-lepton mode. The $K_L$ veto, used to reduce the background from CKM favored semileptonic decays, introduces a large uncertainty due to the poorly known $K_L$ reconstruction efficiency.

Figure 12 Signal fit for the Belle measurement of $B \to \pi\tau\nu$ decays, adapted from (Hamer et al., 2016). The $E_{\rm ECL}$ distributions for the three reconstructed $\tau$ decay modes are shown: (left) $\tau \to e\nu\nu$, (middle) $\tau \to \pi\nu$, and (right) $\tau \to \rho\nu$.

Table IX Summary of the relative uncertainties for the measurement of $B \to \pi\tau\nu$ decays by Belle (Hamer et al., 2016).

| Contribution | Sys. [%] | Stat. [%] | Total [%] |
|---|---|---|---|
| $B \to X_c\ell\nu$ | | | |
| $K_L$ veto | 3.2 | | |
| Particle ID | 2.4 | | |
| Bkg. modeling | 4.4 | | |
| Other | 3.2 | | |
| Total systematic | 8.3 | | |
| Total statistical | | 48 | |
| Total | | | 49 |

B. Belle measurements with semileptonic tags

1. $\mathcal{R}(D^{(*)})$ with $\tau \to \ell\nu\nu$

The first measurement of $\mathcal{R}(D^{*})$ using semileptonic tagging was performed by Belle (Sato et al., 2016), a result that was subsequently superseded by Belle's combined measurement of $\mathcal{R}(D)$ and $\mathcal{R}(D^{*})$ in 2020 (Caria et al., 2020). This analysis employs the FEI algorithm (described in Sec. III.C.1) to efficiently identify semileptonic $B$ meson decays of the second $B$ meson ($B_{\rm tag}$) in the event. This allows for the full identification of all particles and decay cascades in the collision event and the reliable reconstruction of $E_{\rm ECL}$, the unassigned energy in the calorimeter, as already defined in Sec. IV.A.

Tag-side $B \to D^{(*)}\ell\nu$ decays are selected by exploiting the observable

$$\cos\theta_{B,D^{(*)}\ell} \equiv \frac{2E_{\rm beam}E_{D^{(*)}\ell} - m_B^2 - m_{D^{(*)}\ell}^2}{2\,|\boldsymbol{p}_B|\,|\boldsymbol{p}_{D^{(*)}\ell}|}, \tag{49}$$

in which the energies and momenta, $E$ and $\boldsymbol{p}$, are all defined in the centre-of-mass (CM) frame of the colliding beams, i.e. the $\Upsilon(4S)$ rest frame. In particular, note that $E_{D^{(*)}\ell}$ and $\boldsymbol{p}_{D^{(*)}\ell}$ are the energy and momentum of the $D^{(*)}\ell$ system, respectively, and that in this frame $E_{\rm beam} = E_B$. For $B \to D^{(*)}\ell\nu$ decays with a single final state neutrino, which satisfy $(p_B - p_{D^{(*)}\ell})^2 = m_\nu^2 \simeq 0$, $\cos\theta_{B,D^{(*)}\ell}$ corresponds to the cosine of the angle between the tag $B$ meson and the $D^{(*)}\ell$ system in the CM frame. Thus, for correctly reconstructed tag-side $B \to D^{(*)}\ell\nu$ decays, the right hand side of Eq. (49) falls in the physical region such that $-1 \leq \cos\theta_{B,D^{(*)}\ell} \leq 1$. For $B \to D^{**}\ell\nu$ or semitauonic $B \to D^{(*)}\tau(\to \ell\nu\nu)\nu$ decays, by contrast, the right hand side of Eq. (49) will typically produce large negative values due to the absent term $(p_B - p_{D^{(*)}\ell})^2 / (2\,|\boldsymbol{p}_B|\,|\boldsymbol{p}_{D^{(*)}\ell}|) > 0$, needed for $\cos\theta_{B,D^{(*)}\ell}$ to represent a physical cosine. Including finite resolution effects, a requirement of $\cos\theta_{B,D^{(*)}\ell} \in [-2, 1]$ thus captures most tag $B \to D^{(*)}\ell\nu$ decays, while strongly suppressing $B \to D^{**}\ell\nu$ and $B \to D^{(*)}\tau(\to \ell\nu\nu)\nu$ decays.

On the signal side, lepton candidates are combined with $D$ and $D^{*}$ meson candidates. The decay modes used for the $D^0$ and $D^+$ account for about 30% and 22%, respectively, of the overall decay branching fractions. To further improve the reconstruction, a decay vertex fit of the $D$ daughter particles is carried out. The $D^{*+}$ is reconstructed using both charged and neutral slow pion candidates, and for the $D^{*0}$ neutral slow pion candidates and photons are used. The selection is refined by applying requirements on the masses of these candidates and other variables that are optimized to maximize the statistical significance of the final result. In case several tag and signal-side candidates can be reconstructed, the candidate combination with the highest tagging classifier output from the FEI, and on the signal side with the best $D$ vertex fit probability, is selected. Events with additional unassigned charged particles or displaced tracks are rejected. At this stage, all signal and tag-side particles are assigned, and $E_{\rm ECL}$ can be reconstructed. Here, only clusters in the barrel, forward region, and backward region with energies greater than 50, 100, and 150 MeV, respectively, are included.
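As an illustration of the tag-side observable of Eq. (49), the following sketch evaluates $\cos\theta_{B,D^{(*)}\ell}$, i.e. $(2E_{\rm beam}E_{D^{(*)}\ell} - m_B^2 - m_{D^{(*)}\ell}^2)/(2|\boldsymbol{p}_B||\boldsymbol{p}_{D^{(*)}\ell}|)$, from the CM-frame four-momentum of a candidate $D^{(*)}\ell$ system. The beam energy and $B$ mass are approximate values, and the function names and the default selection window are illustrative assumptions rather than the analysis code.

```python
import math

E_BEAM = 5.2897  # GeV, approximately half the Upsilon(4S) mass (assumed value)
M_B = 5.2797     # GeV, approximate B meson mass (assumed value)


def cos_theta_b_dl(four_mom_dl, e_beam=E_BEAM, m_b=M_B):
    """Evaluate cos(theta_{B,D(*)l}) for a D(*)l system given as (E, px, py, pz).

    All quantities are in the CM frame, where E_B = E_beam.
    """
    e, px, py, pz = four_mom_dl
    p_dl = math.sqrt(px * px + py * py + pz * pz)
    m_dl_sq = e * e - p_dl * p_dl                 # invariant mass squared of D(*)l
    p_b = math.sqrt(e_beam * e_beam - m_b * m_b)  # |p_B| fixed by the beam energy
    return (2.0 * e_beam * e - m_b * m_b - m_dl_sq) / (2.0 * p_b * p_dl)


def passes_tag_selection(cos_theta, low=-2.0, high=1.0):
    """Window cut on the observable; the edges are parameters, not fixed physics."""
    return low <= cos_theta <= high


if __name__ == "__main__":
    # Hypothetical D l candidate in the CM frame (GeV).
    candidate = (4.08, 1.2, 0.0, 0.5)
    c = cos_theta_b_dl(candidate)
    print(f"cos(theta) = {c:.3f}, selected: {passes_tag_selection(c)}")
```

For a decay with exactly one massless neutrino the function returns the true cosine of the angle between the $B$ and the $D^{(*)}\ell$ system; any additional missing mass lowers the returned value, which is why a lower edge below $-1$ still rejects most $D^{**}$ and semitauonic tags.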
For correctly reconstructed normalization and signal decays, one expects no unassigned neutral depositions in the detector and that $E_{\rm ECL}$ peaks at zero with a tail towards positive values due to reconstruction mistakes on the tag-side, and to a lesser extent due to beam-background depositions and noise in the calorimeter.

To separate signal and normalization mode decays, a boosted decision tree is trained with the following distinguishing features ranked in order of importance:
• Signal side $\cos\theta_{B,D^{(*)}\ell}$: for normalization mode decays this variable will be in the physical range of $[-1, 1]$.
• Approximate missing mass squared, $m^2_{\rm miss}$ (more details in Sec. III.C): the additional two neutrinos from the $\tau$ decay will produce on average a larger missing invariant mass than the normalization mode.
• The total visible energy $E_{\rm vis} = \sum_i E_i$ of all reconstructed particles $i$ in the event: the two additional neutrinos from the signal mode also will reduce the visible energy observed in the detector in contrast to the normalization mode.

The classifier output $O_{\rm sig}$ is then directly fitted along with the $E_{\rm ECL}$ of the event to disentangle signal, normalization, and background contributions. This is done by exploiting the isospin relations between the charged and neutral final states for the normalization and signal contributions, i.e. fixing $\mathcal{R}(D^{(*)0}) = \mathcal{R}(D^{(*)+})$. The free parameters of the fit are the yields for the signal, normalization, $B \to D^{**}\ell\nu$, and feed-down from $D^{(*)}\ell$ components. The yields of other background contributions from continuum and $B$ meson decays are kept fixed to their expectation values.

Figure 13 shows the full post-fit projections of $E_{\rm ECL}$ as well as those in the signal enriched region of $O_{\rm sig} > 0.9$. The measured values are

$$\mathcal{R}(D) = 0.307 \pm 0.037\,({\rm stat}) \pm 0.016\,({\rm syst}), \tag{50}$$

$$\mathcal{R}(D^{*}) = 0.283 \pm 0.018\,({\rm stat}) \pm 0.014\,({\rm syst}), \tag{51}$$

with the first error being statistical and the second from systematic uncertainties, and an anti-correlation of $\rho = -0.52$ between both values. The measurement is the most precise determination of these ratios to date and shows a good compatibility with the SM expectation.

Table X summarizes the relative systematic and statistical uncertainties on $\mathcal{R}(D)$ and $\mathcal{R}(D^{*})$. The limited size of the simulated sample, used to define the fit templates and to train the multivariate selection, results in the dominant systematic uncertainty. Uncertainties from lepton efficiencies and fake rates cancel only to some extent in the measured ratios because of the large differences in the momentum spectra of signal and normalization decays. This leads to a sizeable uncertainty of the efficiency ratios $\epsilon_{\rm sig}/\epsilon_{\rm norm}$. Uncertainties from the $B \to D^{**}\ell\nu$ background are less dominant.

Table X Summary of the relative uncertainties for the Belle measurement of $\mathcal{R}(D^{(*)})$ using semileptonic tagging (Caria et al., 2020).

| Result | Contribution | Sys. [%] | Stat. [%] | Total [%] |
|---|---|---|---|---|
| $\mathcal{R}(D)$ | $B \to D^{**}\ell\bar{\nu}_\ell$ | | | |
| | $\epsilon_{\rm sig}/\epsilon_{\rm norm}$ | | | |
| | Total systematic | 5.2 | | |
| | Total statistical | | 12.1 | |
| | Total | | | 13.1 |
| $\mathcal{R}(D^{*})$ | $B \to D^{**}\ell\bar{\nu}_\ell$ | | | |
| | $\epsilon_{\rm sig}/\epsilon_{\rm norm}$ | | | |
| | Total systematic | 4.9 | | |
| | Total statistical | | 6.4 | |
| | Total | | | 8.1 |

C. LHCb untagged measurements

The measurement of decays with multiple neutrinos in the final state is especially challenging at hadron colliders given the typically smaller signal-to-background ratios compared to the $B$-factories and the inability to effectively reconstruct a tag $b$-hadron to constrain the kinematics of the signal decay. These difficulties have been overcome by taking advantage of the large data samples of $b$-hadrons produced in high-energy $pp$ collisions and by cleverly estimating the kinematics of the signal $b$-hadron based on the particles that can be reconstructed. The measurements described in Secs. IV.C.1 and IV.C.3 make use of the relatively clean muonic decays of the $\tau$ lepton to limit the background contributions and estimate the $B$ or $B_c$ kinematics with the so-called rest frame approximation (see Sec. III.C.3). The measurement detailed in Sec. IV.C.2 takes advantage of the additional vertex that can be reconstructed from $\tau \to \pi^-\pi^+\pi^-\nu$ hadronic decays to not only reduce hadronic backgrounds by four orders of magnitude, but also to estimate the momentum of the signal $B$ meson relatively precisely (see Sec. III.C.2).
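A rough sketch of the rest frame approximation may help fix ideas; the precise definition is the one given in Sec. III.C.3. The sketch assumes the commonly used form in which the $B$ momentum along the beam axis is the visible longitudinal momentum scaled by $m_B/m_{\rm reco}$, and the $B$ flight direction is taken from the primary and decay vertices. All function names and numerical inputs are hypothetical.

```python
import math

M_B0 = 5.2797  # GeV, approximate B0 mass (assumed value)


def rest_frame_approx_momentum(p_reco, m_reco, primary_vtx, decay_vtx, m_b=M_B0):
    """Approximate B three-momentum from the visible decay products.

    p_reco      : (px, py, pz) of the reconstructed system, z along the beam
    m_reco      : invariant mass of the reconstructed system
    primary_vtx : (x, y, z) of the pp primary vertex
    decay_vtx   : (x, y, z) of the B decay vertex
    """
    # Unit vector of the B flight direction, from primary to decay vertex.
    d = [s - p for s, p in zip(decay_vtx, primary_vtx)]
    norm = math.sqrt(sum(c * c for c in d))
    u = [c / norm for c in d]
    if u[2] <= 0.0:
        raise ValueError("flight direction must point forward along the beam axis")
    # Longitudinal component: visible pz scaled by the mass ratio.
    pz_b = (m_b / m_reco) * p_reco[2]
    # |p_B| = pz_B * sqrt(1 + tan^2(alpha)) = pz_B / cos(alpha), with cos(alpha) = u_z.
    p_mag = pz_b / u[2]
    return tuple(p_mag * c for c in u)


if __name__ == "__main__":
    # Hypothetical candidate: momenta in GeV, vertex positions in mm.
    p_b = rest_frame_approx_momentum(
        p_reco=(1.0, 0.1, 40.0), m_reco=4.0,
        primary_vtx=(0.0, 0.0, 0.0), decay_vtx=(0.3, 0.0, 10.0),
    )
    print("approximate B momentum (GeV):", tuple(round(c, 3) for c in p_b))
```

With the $B$ momentum fixed in this way, the missing four-momentum and hence quantities such as $q^2$ and $m^2_{\rm miss}$ can be computed, at the price of a resolution limited by the approximation itself.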


83% for D , and ± D ⇤ , while the uncertainties on each of the D ⇤⇤ de-cay branching fractions are conservatively assumed to be ± D ( ⇤ ) events, B ! D ⇤⇤ `⌫ ` , feed-down, and other back-grounds by generating toy MC samples from the nominalPDFs according to Poisson statistics, and then repeatingthe ﬁt with the new PDFs. Secondly, the reconstruc-tion eciency of feed-down events, together with the ef-ﬁciency ratio of signal to normalization events, are variedwithin their uncertainties, which are limited by the sizeof the MC samples as well.The eciency factors for the fake D ( ⇤ ) and B tag re-construction are calibrated using collision data. The un-certainties on these factors are a↵ected by the size ofthe samples used in the calibration. We vary the factorswithin their errors and extract associated systematic un-certainties.The e↵ect of the lepton eciency and fake rate, aswell as that due to the slow pion eciency, do not can-cel out in the R ( D ( ⇤ ) ) ratios. This is due to the dif- Belle D ℓ Belle D + ℓ D *+ ℓD *0 ℓ Figure 13 Projection of the signal ﬁt for the Belle measurement of R ( D ( ∗ ) ) using semileptonic tagging, adapted from (Caria et al. , 2020). The four panels correspond to the four reconstruction categories: (top left) D + (cid:96) , (top right) D (cid:96) , (bottom left) D ∗ + (cid:96) , (bottom right) D ∗ (cid:96) . The signal enriched regions, obtained by a cut on a multivariate classiﬁer, are shown in the insetﬁgures. The uncertainties are only statistical. R ( D ∗ + ) with τ → µνν The LHCb experiment published the ﬁrst measure-ment of a b → cτ ν transition in a hadron collider environ-ment in 2015 (Aaij et al. , 2015c). This result was basedon a 3 fb − sample of pp collision data and measured R ( D ∗ + ), which under isospin symmetry has the samevalue as R ( D ∗ ) to a very good approximation. 
This ﬁrstanalysis chose to focus on R ( D ∗ ) over R ( D ) because thelower B → Dτ ν branching fraction, the lack of the D ∗ mass constraint, and the larger contributions from feed-down processes make R ( D ) a signiﬁcantly more challeng-ing observable to measure at a hadron collider. A com-bined R ( D )– R ( D ∗ ) measurement from LHCb is expectedin 2021.Signal B → D ∗ + τ − ν τ and normalization B → D ∗ + µ − ν µ decays are selected by requiring that the tra-jectories of a µ − and an oppositely charged D ∗ + can-didate, reconstructed exclusively via the decay chain D ∗ + → D ( → K − π + ) π + , are consistent with a com-mon vertex that is separated from the pp primary vertex(PV). Events with an electron in the ﬁnal state are not in-cluded because of the trigger and calorimeter limitationsdescribed in Sec. III.B. Compared to the B -factories, thereduction in signal reconstruction eﬃciency due to theexclusive use of muons and a single D decay chain iscompensated by the far larger production cross-sectionfor B mesons at LHCb.An isolation boosted decision tree (BDT) is trainedto reject events arising from partially reconstructed B decays. For each additional track in the event this algo-rithm evaluates the possibility that the track originatesfrom the same vertex as the D ∗ + µ − candidate based onquantities such as the track separation from the decayvertex and the angle between the track and the candi-date momentum vector. The signal sample is made up ofevents where the D ∗ + µ − candidate is found to be isolatedfrom all other tracks in the event.The isolation BDT is employed to further select threedata control samples: a D ∗ + µ − K ± sample that includesan additional kaon coming from the D ∗ + µ − vertex, aswell as the D ∗ + µ − π − and D ∗ + µ − π − π + samples withan additional pion and pion pair, respectively. 
The D^{*+}μ^−K^± sample is enriched in double-charm decays of the type B → D^{*+}H_cX, where H_c is a charmed hadron that decays semileptonically and X refers to unreconstructed particles, while the samples with additional pions are enriched in B → D^{**}lν decays. Additional data control samples based on wrong charge combinations of the D^{*+}, D^{*+} decay products and muon are used to measure the combinatorial background. The misidentified muon background is estimated in a D^{*+}h^± sample where h^± is a track that fails the muon identification requirements.

A three-dimensional binned maximum likelihood fit to the q^2, m_miss^2 (Eq. (44)), and E_ℓ^* (Eq. (45)) variables is performed to determine the signal, normalization, and background yields, as well as several parameters describing the shapes of the different distributions. The momentum of the B meson, necessary to calculate the three fit variables, is estimated via the rest frame approximation, detailed in Sec. III.C.3.

[Figure 14: plot panels not recoverable from the extraction. The surviving labels indicate LHCb control-sample fits shown as projections in m_miss^2 (GeV^2/c^4) and E_μ^* (MeV) for each q^2 bin, for data enriched in B → D^{*+}H_c(→ μνX')X and in B → D^{**}μν decays. Legend: Data, B → D^*τν, B → D^*H_c(→ lνX')X, B → D^{**}lν, B → D^*μν, Combinatorial, Misidentified μ.]

Figure 15 Projections of the signal fit for the LHCb measurement of R(D^{*+}) involving muonic τ decays (Aaij et al., 2015c). Left: full q^2 projection; Middle: m_miss^2 projection in the highest q^2 bin; and Right: E_ℓ^* projection in the highest q^2 bin.

The templates for the combinatorial and misidentified muon backgrounds are taken directly from the data control samples described above, while the templates for the B → D^{*+}H_cX and B → D^{**}lν backgrounds are based on Monte Carlo simulations with corrections extracted from a fit to the D^{*+}μ^−K^± and D^{*+}μ^−π^−(π^+) samples. Figure 14 shows the excellent agreement between the data and the resulting background model that is achieved.

The templates for the signal and normalization contributions are parameterized by CLN form factors (Sec. II.C.2) extracted from the fit to the signal sample. Figure 15 shows the fit projection of the q^2 variable in the full range, as well as the m_miss^2 and E_ℓ^* projections in the q^2 bin with the highest signal-to-background ratio.

As Table XI shows, the limited size of the simulated samples is the main source of systematic uncertainty in [...] B → D^*lν templates. The overall systematic uncertainty is slightly larger than the statistical uncertainty but, as discussed in Sec. V, many of the systematic uncertainties are expected to decrease commensurately with larger data samples. The result of this measurement is

R(D^{*+}) = 0.336 \pm 0.027\,\text{(stat)} \pm 0.030\,\text{(syst)} , \qquad (52)

in good agreement with the previous measurements by the B-factories.

Table XI Summary of the relative uncertainties for the LHCb measurement of R(D^{*+}) involving muonic τ decays (Aaij et al., 2015c).

Contribution               Uncertainty [%]
                           Sys.    Stat.
Simulated sample size      6.2
Misidentified μ bkg.       4.8
B → D^{**}lν bkg.          2.1
B → D^*lν FFs              1.9
Hardware trigger           1.8
Double-charm bkg.          1.5
MC/data correction         1.2
Combinatorial bkg.         0.9
Particle ID                0.9
Total systematic           8.9
Total statistical                  8.0
Total                      12.0

R(D^{*+}) with τ → π^−π^+π^−ν

Instead of a leptonic τ decay, the 2018 measurement of R(D^{*+}) by LHCb (Aaij et al., 2018b) employed the 3-prong τ^− → π^−π^+π^−ν_τ decay. This channel is interesting a priori because it is presently the only τ decay for which it is practical to reconstruct the τ decay vertex. This in turn provides good precision on the reconstruction of the B momentum as described in Sec. III.C. Moreover, when aggregated with the τ^− → π^−π^+π^−π^0ν_τ channel, the 3-prong decays have a total branching fraction of 13.5%, comparable to that of the muonic decay channel, and the pion-triplet dynamics provides very useful discrimination against the largest background contributions.

In this measurement, signal B → D^{*+}τ^−ν_τ decays are selected by requiring that the trajectories of a τ^− lepton and an oppositely charged D^{*+} candidate, reconstructed exclusively via the decay chain D^{*+} → D^0(→ K^−π^+)π^+, are consistent with a common vertex separated from the PV. The τ lepton is reconstructed by requiring that the tracks of three pions with the appropriate charges share a common vertex (Fig. 8 top). Since the final state does not contain any charged lepton, fully hadronic B → D^{*+}π^−π^+π^−X decays initially dominate the selected event sample. However, this background contribution may be reduced by four orders of magnitude by taking advantage of the long τ lifetime: the πππ vertex in a signal decay is typically displaced downstream of the B vertex. This allows one to distinguish signal decays from the prompt topology of B → D^{*+}π^−π^+π^−X decays, in which the πππ and the B vertices overlap, by requiring that the distance between the τ and the B vertex positions along the beam axis is larger than four times its reconstructed uncertainty (Fig. 16).
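The detachment requirement on the τ vertex can be written down compactly. The sketch below is a minimal illustration, assuming that the uncertainty on the vertex separation is the quadrature sum of two uncorrelated vertex-position uncertainties; the function and argument names are hypothetical, and a real vertex fit would use the full covariance matrices.

```python
import math


def flight_significance(z_tau: float, z_b: float,
                        sigma_z_tau: float, sigma_z_b: float) -> float:
    """Signed separation along the beam axis divided by its uncertainty.

    Assumes uncorrelated vertex uncertainties (an assumption of this sketch).
    """
    sigma = math.hypot(sigma_z_tau, sigma_z_b)
    if sigma <= 0.0:
        raise ValueError("vertex uncertainties must be positive")
    return (z_tau - z_b) / sigma


def is_detached(z_tau: float, z_b: float,
                sigma_z_tau: float, sigma_z_b: float,
                min_significance: float = 4.0) -> bool:
    """True if the tau vertex lies downstream of the B vertex by more than 4 sigma."""
    return flight_significance(z_tau, z_b, sigma_z_tau, sigma_z_b) > min_significance
```

Because the significance is signed, a three-pion vertex reconstructed upstream of the B vertex fails the requirement regardless of how well it is measured.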
Additionally, strict isolation from other charged particles is required to reject charm decays with more than three charged daughters, as well as fake detached vertices where the D^* meson and the three pions come from other b-hadrons present in the event.

Figure 16 Distribution of the distance between the B vertex and the τ vertex along the beam direction (Fig. 8 top) divided by its uncertainty in simulated events for the LHCb measurement of R(D^{*+}) involving τ → π^−π^+π^−ν decays (Aaij et al., 2018b). The vertical line shows the 4σ requirement used in the analysis to separate signal events in red from the prompt background component in gray.

One of the major challenges in hadronic-τ measurements is that the normalization B → D^{*+}μ^−ν_μ decays are not measured simultaneously with the signal B → D^{*+}τ^−ν_τ decays. Since absolute branching fraction measurements are exceedingly difficult at LHCb, this analysis normalizes the signal yield against that of the prompt B → D^{*+}π^−π^+π^− decay, which has the same particle content as the signal, and then relies on two external branching fractions to calculate R(D^*) via

R(D^*) = \left.\frac{{\cal B}(\bar{B} \to D^* \tau \nu_\tau)}{{\cal B}(\bar{B} \to D^* \pi\pi\pi)}\right|_{\rm fit} \times \left.\frac{{\cal B}(\bar{B} \to D^* \pi\pi\pi)}{{\cal B}(\bar{B} \to D^* \mu \nu_\mu)}\right|_{\rm ext} . \qquad (53)

After selecting events with large τ flight significance as described above, the dominant remaining background contributions consist of double-charm B → D^{*+}D^{(*,**)}_{(s)} decays. These decays were also the largest background contributions to the muonic-τ measurement of R(D^{*+}), but their relative composition in D^0 and D_s^+ mesons is very different. Due to the large inclusive branching fraction of the D_s^+ meson to final states with three pions (about 30%) and small rate to semileptonic final states, the double-charm background in the hadronic-τ sample contains ten times more D_s^+ mesons than that for the muonic-τ sample. Interestingly, the D_s^+ inclusive three-pion modes proceed mainly from two-body and quasi two-body decay channels involving η, η′, ω, and φ mesons, which leads to very different three-pion kinematics with respect to those of the signal. That is, the τ → π^−π^+π^−ν decay is well-described within resonance chiral theory (Ecker et al., 1989a,b), featuring chiral terms as well as single-resonance ρ and double-resonance a_1 → ρ contributions (Nugent et al., 2013; Shekhovtsova et al., 2012), leading to prominent ρ peaks in the distributions of both the minimum and the maximum of the two π^+π^− mass combinations, min(m_{π+π−}) and max(m_{π+π−}), respectively.

These kinematic differences are effectively exploited by a BDT that also includes other variables such as the energy measured in the electromagnetic calorimeter in a cone whose axis is defined by the three-pion momentum. The kinematics of the three-pion system in background D^0 and D^+ decays is more similar to that in signal decays because the inclusive πππ final state from these two mesons is dominated by the K a_1 channel (Zyla et al., 2020). Some discrimination is still possible, however, due to the restricted phase space of this virtual a_1 meson.

[Figure 17: plot panels not recoverable from the extraction. Surviving axis labels: min[m(π^+π^−)], max[m(π^+π^−)], m(π^+π^+), m(π^+π^−π^+), and m(D^{*−}π^+π^−π^+) in MeV/c^2, and q^2 in GeV^2/c^4.]

Figure 17 Control sample fits for the LHCb measurement of R(D^{*+}) involving τ → π^−π^+π^−ν decays (Aaij et al., 2018b) employed to evaluate the composition of the various double-charm background contributions. (a-d) low-BDT sample and (e-f) B → D^{*+}D_s^−(→ π^−π^+π^−)X sample.

Many of the B branching fractions to double-charm final states are known with poor precision or have not been measured yet. The following data control samples are used to reduce the uncertainty due to the composition of these background contributions:

• A low-BDT sample enriched with inclusive D_s^+ decays constrains the composition of B → D^{*+}D_s^−X decays. The simulation is reweighted to match a fit to the min(m_{π+π−}), max(m_{π+π−}), m_{π+π−π+}, and m_{π+π+} distributions. These variables capture the combined dynamics of the various inclusive D_s^+ decay channels to three pions (Fig. 17 a-d).

• A highly pure B → D^{*+}D_s^−(→ π^−π^+π^−)X sample selected by imposing a requirement on m_{π+π−π+} around the D_s^+ mass. A template fit to the m_{π+π−π+} distribution is used to measure the relative fractions of D_s^+ mesons produced directly and from D_s^* or D_s^{**} decays. The shape of the D_s^* broad peak depends on the degree of longitudinal polarization of the D_s^* and was adjusted in the simulation to reproduce the data. These measurements are important since the q^2 distributions of these decays are very different from each other, as shown in Fig. 17 (f).
• Clean B → D^{*+}D^0(→ K^−π^+π^−π^+)X and B → D^{*+}D^−(→ K^−π^+π^−)X samples selected by explicitly reconstructing the D^0 and D^− mesons. These samples are used to monitor and understand the non-D_s^+ background composition.

A three-dimensional binned maximum likelihood fit to q^2, the BDT output, and the decay time of the reconstructed τ is performed to determine the signal and background yields. The calculation of q^2 relies on the B momentum determination described in Sec. III.C.2. The decay time of the reconstructed τ, t_τ, is computed from its flight distance and momentum obtained by the partial kinematic reconstruction. This variable is useful to separate τ from D^− decays, since the lifetime of the D^− meson is about 3.6 times longer than that of the τ lepton. The fit results for the LHC Run 1 data sample, corresponding to a luminosity of 3 fb^{-1}, are displayed in Fig. 18. An interesting feature of this method compared to the muonic-τ measurement is that the highest BDT output bin provides a fairly clean sample of signal decays with a purity of about 40%.

[Figure 18: plot panels not recoverable from the extraction. Surviving axis labels: t_τ [ps] and q^2 [GeV^2/c^4]. Legend: Data, Total model, B → D^{*−}τ^+ν_τ, B → D^{**}τ^+ν_τ, B → D^{*−}D_s^+(X), B → D^{*−}D^+(X), B → D^{*−}3πX, B → D^{*−}D^0(X), Comb. bkg.]

Figure 18 Projections of the signal fit for the LHCb measurement of R(D^{*+}) involving τ → π^−π^+π^−ν decays (Aaij et al., 2018b). The four rows correspond to the four BDT bins for increasing values of the BDT response.

As shown in Table XII, the uncertainties related to the double-charm background and the limited size of the simulated samples are the dominant systematic uncertainties in this measurement. The uncertainties due to the limited knowledge of external branching fractions in Eq. (53), currently 4.6%, are worth mentioning because, unlike many of the other systematic uncertainties, these will not be reduced with the increasing LHCb data samples that will be collected. Instead, additional measurements from Belle II will be needed (Sec. V.E).

Table XII Summary of the relative uncertainties for the LHCb measurement of R(D^{*+}) involving τ → π^−π^+π^−ν decays (Aaij et al., 2018b).

Contribution                         Uncertainty [%]
                                     Sys.    Ext.    Stat.
Double-charm bkg.                    5.4
Simulated sample size                4.9
Corrections to simulation            3.0
B → D^{**}lν bkg.                    2.7
Normalization yield                  2.2
Trigger                              1.6
PID                                  1.3
Signal FFs                           1.2
Combinatorial bkg.                   0.7
Modeling of τ decay                  0.4
Total systematic                     9.1
B(B → D^*πππ)                                3.9
B(B → D^*ℓν)                                 2.3
B(τ^+ → 3πν)/B(τ^+ → 3ππ^0ν)                 0.7
Total external                               4.6
Total statistical                                    6.5
Total                                12.0

The result of this measurement was reported as R(D^{*+}) = 0.291 ± 0.019 ± 0.026 ± 0.013 in 2018. Taking into account the latest HFLAV average of B(B → D^{*+}ℓν) = 5.[...] ± [...] ± [...] (Amhis et al., 2019), the result is

R(D^{*+}) = 0.280 \pm 0.018\,\text{(stat)} \pm 0.025\,\text{(syst)} \pm 0.013 , \qquad (54)

where the third uncertainty is due to the external branching fractions described above.

R(J/ψ) with τ → μνν

The ratio R(J/ψ) was measured for the first time in 2018 by the LHCb experiment (Aaij et al., 2018a), thus opening the possibility for the exploration of LFUV in decays subject to very different sources of both experimental and theoretical uncertainties compared to those in R(D^{(*)}). This measurement leverages two of the key techniques developed for the muonic R(D^{*+}) analysis described in Sec. IV.C.1: the isolation BDT and the rest frame approximation. Just as for the R(D^{*+}) measurement, the τ lepton is reconstructed via τ → μνν, so that signal B_c → J/ψ τν and normalization B_c → J/ψ μν decays share the same final state. The event is selected if the only additional tracks close to the muon coming from the τ decay are a pair of oppositely charged muons that form a vertex separated from the PV and whose invariant mass is compatible with the J/ψ → μμ decay.

The signal and normalization yields are extracted from a four-dimensional binned maximum likelihood fit to q^2, m_miss^2, E_ℓ^*, and the proper time elapsed between the production and decay of the B_c meson: the decay time. The first three variables are calculated with the same techniques as used in the muonic R(D^{*+}) analysis (Sec. IV.C.1). The inclusion of the decay time among the fit variables improves the separation of B_c decays from B_{u,d,s} decays, because the B_c lifetime is almost three times shorter than that of B_{u,d,s} mesons.

A key difference with respect to the R(D^{(*)}) measurements is that background contributions from partially reconstructed B_c decays are significantly reduced thanks to the narrow invariant mass peak of the J/ψ meson and its clean dimuon final state.
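The discriminating power of the decay time used in the four-dimensional fit can be illustrated with two ideal exponential distributions. This is a rough sketch only: the lifetime values below are approximate inputs assumed for illustration, the function names are hypothetical, and detector resolution and acceptance effects, which matter in the real analysis, are ignored.

```python
import math

TAU_BC_PS = 0.51  # approximate B_c lifetime in ps (assumed input)
TAU_B_PS = 1.52   # approximate B0 lifetime in ps (assumed input)


def decay_time_pdf(t_ps: float, lifetime_ps: float) -> float:
    """Normalized exponential decay-time density for t >= 0 (no resolution effects)."""
    if t_ps < 0.0:
        raise ValueError("decay time must be non-negative in this idealized model")
    return math.exp(-t_ps / lifetime_ps) / lifetime_ps


def bc_over_b_ratio(t_ps: float,
                    tau_bc: float = TAU_BC_PS,
                    tau_b: float = TAU_B_PS) -> float:
    """Ratio of the B_c density to the lighter-B density at a given decay time."""
    return decay_time_pdf(t_ps, tau_bc) / decay_time_pdf(t_ps, tau_b)
```

With these inputs the ratio starts near three at zero decay time and falls below one after a few picoseconds, which is why short decay times favor the B_c hypothesis.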
Because of this reduced partially reconstructed background and the overall small B_c production rate, the main sources of background in the R(J/ψ) analysis are misidentified H_b → J/ψ h^+ decays, where H_b is a more abundant b-hadron and h^+ is a hadron incorrectly identified as a muon, as well as random combinations of muons.

[Figure residue: projections of the LHCb R(J/ψ) fit onto m_miss^2 [GeV^2/c^4], decay time [ps], and Z(q^2, E_μ^*), with pulls, including the projections in bins 4–7 of Z.]

The template for the J/ψ h^+ contribution is estimated by applying the misidentification probabilities for different hadron species, as determined in high-purity samples of identified hadrons, to a control sample with a

Data µ + µ J/ +c B Mis-ID bkg. comb. bkg. µ +J/ comb. bkg. J/ +c H J/ +c B l+ l(1P) c+c B l+ l(2S) +c B + J/ +c B FIG. 1. Distributions of (top) m miss , (middle) decay time, and(bottom) Z of the signal data overlaid with projections of the fitmodel with all normalization and shape parameters at their best-fit values. Below each panel, differences between the data and fitare shown, normalized by the Poisson uncertainty in the data; thedashed lines are at the values ’ . PHYSICAL REVIEW LETTERS J / ψ h þ candidates, where h þ stands for a chargedhadron, is selected following similar criteria to those of thesignal sample but with the h þ failing the muon identi-fication criteria. This control sample is enriched in varioushadron species (primarily, pions, kaons, and protons) andelectrons. Using several high-purity control samples ofidentified hadrons, weights are computed that represent theprobability that a hadron with particular kinematic proper-ties would pass the muon criteria. These weights areapplied to the J / ψ h þ sample to generate binned templatesrepresenting these background components. The normali-zation of each of these components is allowed to vary in thefit to the data.A binned maximum likelihood fit is performed using thetemplates representing the various components. The num-ber of candidates from each component, with the exceptionof the combinatorial J / ψ background, are allowed to vary inthe fit, as are the shape parameters corresponding to the B þ c lifetime and the A ð q Þ form factor. The contributionsof the feed-down processes involving the decays ofhigher-mass charmonium states B þ c → ψ ð S Þ μ þ ν μ , B þ c → χ c ð ; ; Þ ð P Þ μ þ ν μ are allowed to vary in the fit, whereas theratio of the branching fractions R ½ ψ ð S Þ% ¼ B ½ B þ c → ψ ð S Þ τ þ ν τ % / B ½ B þ c → ψ ð S Þ μ þ ν μ % is fixed to the predictedSM value of 8.5% [18]. 
This is later varied for theevaluation of a systematic uncertainty.Extensive studies of the fit procedure are carried out toidentify potential sources of bias in the fit. Simulated signalis added to the data histograms, and the resulting changes inthe value of R ð J / ψ Þ from the fit are found to be consistentwith the injected signal increments. The procedure is alsoapplied to the mis-ID background, which shows no bias inthe fitted number of events as a function of injected events.Another important consideration for this measurement isthe disparate properties of the various templates. Sometemplates are populated in all kinematically allowedbins, such as the mis-ID background that is derived fromlarge data samples. Others are sparsely populated andcontain empty bins, e.g., for modes with low efficiencyand yields that are obtained from simulated events.Pseudoexperiments with template compositions similarto those in this analysis reveal a possible bias of the fitresults. Hence, the binning scheme for this analysis ischosen to minimize the number of empty bins in thesparsely populated templates, while retaining the discrimi-nating power of the distributions. Kernel density estimation(KDE) [36] is used to derive continuous distributionsrepresentative of the nominal fit templates. Simulatedpseudoexperiments using histogram templates sampledfrom these continuous distributions are then used toevaluate any remaining bias that results. Based on thesestudies, a Bayesian procedure is implemented for cor-recting the raw R ð J / ψ Þ value after unblinding. The results of the fit are presented in Fig. 1 showing theprojections of the nominal fit result onto the quantities m miss , decay time, and Z . The fit yields ’ signaland ’ normalization decays, where the errorsare statistical and correlated. Accounting for the τ þ → μ þ ν μ ¯ ν τ branching fraction and the ratio of efficiencies[ ð . ’ . Þ % ] gives an uncorrected value of 0.79 for R ð J / ψ Þ . 
Correcting for the mean expected bias at this ) / c C a nd i d a t e s / ( . G e V ] /c [GeV miss m P u ll s LHCb C a nd i d a t e s / ( . p s ) decay time [ps] P u ll s LHCb C a nd i d a t e s p e r b i n ) * µ ,E Z(q P u ll s LHCb

$J/\psi$ and an additional track that fails the muon identification. This template is treated as free-floating in the signal fit. The combinatorial backgrounds are estimated in the sidebands of the $B_c$ and $J/\psi$ masses, $m(J/\psi\,\mu) > [...]$ and $[...] < m(\mu^+\mu^-) < [...]$. The contributions from $B_c^- \to \psi(2S)\ell^-\bar\nu_\ell$ and $B_c^- \to \chi_c(1P)\ell^-\bar\nu_\ell$ are extracted from the fit with templates taken from MC simulation.

Figure 19 Projections of the signal fit for the LHCb muonic measurement of $R(J/\psi)$ (Aaij et al., 2018a). Left: full $m^2_{\rm miss}$ projection; Middle: $m^2_{\rm miss}$ projection in the highest $q^2$ and lowest $E^*_\ell$ bins; Right: decay time projection in the highest $q^2$ and lowest $E^*_\ell$ bins.

Figure 19 shows the fit projections for $m^2_{\rm miss}$ over the full range, as well as $m^2_{\rm miss}$ and the $B_c$ decay time in the $E^*_\ell$ and $q^2$ ranges with the highest signal-to-background ratio. The agreement is good overall, and a small but significant signal contribution at high $m^2_{\rm miss}$ and low decay times can be observed.

Table XIII Summary of the relative uncertainties for the LHCb muonic measurement of $R(J/\psi)$ (Aaij et al., 2018a).

Contribution                        Uncertainty [%]
                                    Sys.     Stat.
Signal/norm. FFs                    17.0
Simulated sample size               11.3
Fit model                           11.2
Misidentified $\mu$ bkg.             7.9
Partial $B_c$ bkg.                   6.9
Combinatorial bkg.                   6.5
$\epsilon_{\rm sig}/\epsilon_{\rm norm}$    [...]
Total systematic                    25.4
Total statistical                            23.9
Total                               [...]

Table XIII summarizes the sources of uncertainty in this measurement. The leading contribution comes from the $B_c \to J/\psi\,l\nu$ decay form factors, which have not been measured yet and had to be determined in the signal fit itself. As discussed in Sec. II.E, HQET cannot be used to describe a decay with a heavy spectator quark, so that at the time of publication of this measurement only quark model predictions, untested by experiment, were available. The recent results of lattice calculations will reduce this uncertainty substantially. Sizeable uncertainties also arise due to the limited size of the simulated samples and the fit model.
These are also expected to be reduced in future measurements.

The result of this measurement is

$$R(J/\psi) = 0.71 \pm 0.17\,\text{(stat)} \pm 0.18\,\text{(syst)}, \qquad (55)$$

which lies within 2 standard deviations of the SM prediction in Eq. (34).

D. Belle polarization measurements

1. $\tau$ polarization with $\tau \to \pi\nu$ and $\tau \to \rho\nu$

The Belle experiment measured in (Hirose et al., 2017, 2018) the $\tau$ polarization fraction $P_\tau(D^*)$ introduced in Sec. II.D.2. The analysis strategy is similar to that of the hadronic tag measurements of $B \to D^*\tau\nu$ decays (Huschle et al., 2015; Lees et al., 2012, 2013), but reconstructs the $\tau$ lepton in the hadronic one-prong $\tau \to \pi\nu$ and $\tau \to \rho\nu$ modes. For these final states, the helicity angle $\cos\theta_h$ can be explicitly reconstructed by taking advantage of the fully reconstructed tag-side $B$ meson to boost the visible $\tau$ daughter particles into the rest frame of the $\tau\nu$ lepton system with the 4-momentum

$$q = p_{e^+e^-} - p_{B_{\rm tag}} - p_{D^*}. \qquad (56)$$

The terms on the right hand side are the momenta of the colliding $e^+e^-$ pair, the reconstructed tag-side $B$ meson, and the reconstructed $D^*$ candidate, respectively. In the lepton system frame, the $\tau$ energy and momentum magnitude are fully determined by $q^2$ and the $\tau$ lepton mass $m_\tau$,

$$E_\tau = \frac{q^2 + m_\tau^2}{2\sqrt{q^2}}, \qquad |\vec p_\tau| = \frac{q^2 - m_\tau^2}{2\sqrt{q^2}}. \qquad (57)$$

In this frame, the cosine of the angle between the spatial momenta of the $\tau$ lepton and its daughter meson, $h$, is

$$\cos\theta_{\tau h} = \frac{2E_\tau E_h - m_\tau^2 - m_h^2}{2|\vec p_\tau||\vec p_h|}, \qquad (58)$$

in which $E_h$ and $|\vec p_h|$ are the daughter meson energy and absolute spatial momentum, respectively. By applying a boost into the $\tau$ rest frame, one can then express the cosine of the helicity angle as

$$\cos\theta_h = \frac{1}{|\vec p^{\,\tau}_h|}\left(\gamma|\vec p_h|\cos\theta_{\tau h} - \gamma\beta E_h\right). \qquad (59)$$

Here, $\gamma = E_\tau/m_\tau$, $\beta = |\vec p_\tau|/E_\tau$, and $|\vec p^{\,\tau}_h| = (m_\tau^2 - m_h^2)/(2m_\tau)$ denotes the absolute daughter meson spatial momentum in the $\tau$ rest frame.

To reduce backgrounds, only candidates with $q^2 > [...]$ and with a physical value of $\cos\theta_h \in [-1, 1]$ are retained. An upper limit is also placed on $E_{\rm ECL}$, and only candidates with $E_{\rm ECL} < [...]$ are kept. To reduce the dependence on the $B_{\rm tag}$ reconstruction, whose efficiency likely differs between data and simulation, the measured signal event yields are normalized to $B \to D^*\ell\nu$ events. These can be identified and separated from background processes using $m^2_{\rm miss}$ (cf. Sec. III.C). For both signal and normalization candidates, events with additional charged tracks or $\pi^0$ candidates are rejected.

Figure 20 Signal fit for the measurement of the $\tau$ polarization fraction $P_\tau(D^*)$ by Belle (Hirose et al., 2017). The fits to the neutral and charged $B$ candidates as well as the $\tau \to \pi\nu$ and $\tau \to \rho\nu$ decay modes and the two $\cos\theta_h$ bins are all combined together. Axes: $E_{\rm ECL}$ (GeV) versus events per bin.

The observables $R(D^*)$ and $P_\tau(D^*)$ are extracted from a fit to the $E_{\rm ECL}$ distribution in two bins of $\cos\theta_h$, $[-1, 0]$ and $[0, 1]$, for the two $\tau$ decay samples, $\tau \to \pi\nu$ and $\tau \to \rho\nu$. The free parameters in the fit include the yields for the $B \to D^*\tau\nu$, $B \to D^*\ell\nu$, $B \to D^{**}l\nu$, continuum, and fake $D^*$ contributions, among others. Figure 20 shows the fitted $E_{\rm ECL}$ distribution for all the reconstructed modes combined together. The fitted signal yields are then converted into measurements of $R(D^*)$ and $P_\tau(D^*)$ with

$$R(D^*) = \frac{1}{\mathcal{B}(\tau \to h\nu)} \times \frac{\epsilon_{\rm norm}}{\epsilon_{\rm sig}} \times \frac{N_{\rm sig}}{N_{\rm norm}}, \qquad (60)$$

$$P_\tau(D^*) = \frac{2}{\alpha}\,\frac{N_{\cos\theta_h > 0} - N_{\cos\theta_h < 0}}{N_{\cos\theta_h > 0} + N_{\cos\theta_h < 0}}, \qquad (61)$$

with $\alpha$ being a factor that accounts for the sensitivity to the polarization and the efficiency differences of both channels. The obtained values are

$$R(D^*) = 0.270 \pm 0.035\,\text{(stat)}\,^{+0.028}_{-0.025}\,\text{(syst)}, \qquad (62)$$

$$P_\tau(D^*) = -0.38 \pm 0.51\,\text{(stat)}\,^{+0.21}_{-0.16}\,\text{(syst)}, \qquad (63)$$

with a total correlation including systematic uncertainties of $\rho = 0.33$. These results are in good agreement with the SM expectations, as shown in Fig. 21. A summary of the uncertainties on these measurements can be found in Table XIV. The largest systematic uncertainties stem from the composition of the hadronic $B$ meson background and the limited size of the simulated samples used to determine the fit PDFs.

Figure 21 The values of $R(D^*)$ and $P_\tau(D^*)$ (white star) and the $1\sigma$, $2\sigma$, and $3\sigma$ contours as measured by Belle (Hirose et al., 2017). The SM expectations (Amhis et al., 2019; Tanaka and Watanabe, 2013) are shown by the white triangle. The gray band shows the (then) world average measurement of $R(D^*)$.

2. $D^*$ polarization with inclusive tagging

The Belle experiment reported in (Abdesselam et al., 2019) a first, preliminary, measurement of the longitudinal $D^*$ polarization fraction $F_{L,l}(D^*)$ (see Sec. II.D.2) based on inclusively tagged events (Sec. III.C.1). First, a viable $B \to D^{*-}\tau^+\nu_\tau$ signal candidate with $\tau \to \ell\nu\nu$ or $\tau \to \pi\nu$ and $D^{*-} \to \bar D^0\pi^-$ is reconstructed. The $\bar D^0$ meson is reconstructed in the $\bar D^0 \to K^+\pi^-$, $\bar D^0 \to K^+\pi^-\pi^0$, and $\bar D^0 \to K^+\pi^+\pi^-\pi^-$ modes. Thereafter, no explicit reconstruction is attempted of the other (tag) $B$ meson produced in the $e^+e^-$ collision. Instead, an inclusive reconstruction approach that sums over all unassigned charged particles and neutral energy depositions above a certain energy threshold in the calorimeter is employed.

Table XIV Summary of the relative uncertainties for Belle's hadronic tag measurement of $R(D^*)$ and $P_\tau(D^*)$ (Hirose et al., 2017, 2018).

Result           Contribution                                Uncertainty [%]
                                                             sys.     stat.
$R(D^*)$         $B \to D^{**}\ell\bar\nu_\ell$              [...]
                 $\epsilon_{\rm sig}/\epsilon_{\rm norm}$    [...]
                 Total systematic                            9.9
                 Total statistical                                    12.9
                 Total                                       16.3
$P_\tau(D^*)$    PDF modeling                                33
                 Other bkg.                                  31
                 Total systematic                            48
                 Total statistical                                    134
                 Total                                       143
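The kinematic relations in Eqs. (57)-(59) and the yield-to-observable conversions in Eqs. (60) and (61) can be illustrated with a short numerical sketch. This is only a transcription of those formulas under stated assumptions, not the Belle analysis code: the function names are hypothetical, the meson energy `e_h` is assumed to be given in the $\tau\nu$ rest frame, and the mass constants are PDG values inserted for illustration.

```python
import math

M_TAU = 1.77686  # GeV, tau mass (PDG value, inserted here for illustration)
M_PI = 0.13957   # GeV, charged pion mass (PDG value)

def lepton_frame_tau_kinematics(q2, m_tau=M_TAU):
    """Eq. (57): tau energy and momentum in the tau-nu rest frame."""
    sqrt_q2 = math.sqrt(q2)
    e_tau = (q2 + m_tau**2) / (2.0 * sqrt_q2)
    p_tau = (q2 - m_tau**2) / (2.0 * sqrt_q2)
    return e_tau, p_tau

def cos_theta_hel(q2, e_h, m_h, m_tau=M_TAU):
    """Eqs. (58)-(59): helicity angle of the daughter meson h.

    e_h is the meson energy measured in the tau-nu rest frame.
    """
    e_tau, p_tau = lepton_frame_tau_kinematics(q2, m_tau)
    p_h = math.sqrt(e_h**2 - m_h**2)
    # Eq. (58): angle between tau and meson momenta in the tau-nu frame
    cos_tau_h = (2.0 * e_tau * e_h - m_tau**2 - m_h**2) / (2.0 * p_tau * p_h)
    gamma = e_tau / m_tau
    beta = p_tau / e_tau
    # Meson momentum in the tau rest frame (two-body decay tau -> h nu)
    p_h_rest = (m_tau**2 - m_h**2) / (2.0 * m_tau)
    # Eq. (59): boost of the longitudinal momentum into the tau rest frame
    return (gamma * p_h * cos_tau_h - gamma * beta * e_h) / p_h_rest

def r_dstar(n_sig, n_norm, eff_sig, eff_norm, br_tau):
    """Eq. (60): R(D*) from fitted yields, efficiencies and B(tau -> h nu)."""
    return (1.0 / br_tau) * (eff_norm / eff_sig) * (n_sig / n_norm)

def tau_polarization(n_forward, n_backward, alpha):
    """Eq. (61): P_tau(D*) from the forward-backward yield asymmetry."""
    return (2.0 / alpha) * (n_forward - n_backward) / (n_forward + n_backward)
```

A simple closure check is to start from a chosen helicity angle in the $\tau$ rest frame, boost the meson energy into the $\tau\nu$ frame, and confirm that `cos_theta_hel` returns the starting value.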

Compared to hadronic or semileptonic tagging, this approach has the benefit of a higher reconstruction efficiency, as it does not rely on identifying decay cascades correctly, but results in a poorer $B$ momentum resolution.

The tag side is required to be compatible with a well-reconstructed $B$ meson by requiring

$$M_{\rm tag} = \sqrt{E_{\rm beam}^2 - |\vec p_{\rm tag}|^2} > [...], \qquad (64)$$

and $-[...] < E_{\rm tag} - E_{\rm beam} < 0.05$ GeV, where $E_{\rm beam} = \sqrt{s}/2$ is the energy of the $e^+e^-$ beams in the CM frame.

The sizeable background contributions are suppressed with the signal-side normalized variable

$$X_{\rm miss} = \frac{E_{\rm miss} - |\vec p_{D^*} + \vec p_{d_\tau}|}{\sqrt{E_{\rm beam}^2 - m_B^2}}, \qquad (65)$$

where $E_{\rm miss} = E_{\rm beam} - (E_{D^*} + E_{d_\tau})$ and $d_\tau$ refers to the visible $\tau$ daughter. Events with one neutrino have values of $X_{\rm miss}$ in the range $[-1, 1]$, whereas events with several neutrinos tend to larger values. The analysis therefore requires $X_{\rm miss}$ to be larger than 1.5 or 1 for the $\tau \to \ell\nu\nu$ and $\tau \to \pi\nu$ decay modes, respectively.

The helicity angle $\theta_v$ is defined as the angle between the reconstructed $\bar D^0$ and the direction opposite to the $B$ meson in the $D^{*-}$ frame (see definition in Fig. 1; the Belle analysis uses the notation $\theta_{\rm hel}$). Because of the low $D^*$ reconstruction efficiency for $\cos\theta_v > 0$, the analysis focuses on the $-1 \le \cos\theta_v \le 0$ region. The signal yields are extracted in bins of $\cos\theta_v$ from fits to the $M_{\rm tag}$ distribution; see Fig. 22 for an example. Most backgrounds do not peak in this variable, with the exception of semileptonic decays into light leptons. The yields for these peaking contributions are determined in the side bands of kinematic variables. The $D^*$ polarization fraction is determined by a fit to the signal yields as a function of $\cos\theta_v$. Given the size of the $\cos\theta_v$ bins, resolution effects are assumed to be negligible. Figure 23 shows the measured helicity angle distribution, corrected for acceptance effects. The resulting fitted value for the longitudinal $D^*$ polarization fraction is

$$F_{L,\tau}(D^*) = 0.60 \pm 0.08\,\text{(stat)} \pm 0.04\,\text{(syst)}, \qquad (66)$$

with its uncertainty dominated by the limited size of the data sample. The largest systematic uncertainty in this measurement stems from the signal and non-resonant background shapes used in the $M_{\rm tag}$ fits, followed by the uncertainty on the modeling of $B \to D^{**}\tau\nu$ decays.

Figure 22 Signal fit to the lowest $\cos\theta_v$ bin, $[-1, -[...]]$, in the $\bar D^0 \to K^+\pi^-\pi^0$ channel for the measurement of the longitudinal $D^*$ polarization fraction by Belle (Abdesselam et al., 2019). The red curve corresponds to the signal contribution, and the blue and green curves display the non-resonant and resonant background contributions, respectively. Axes: $M_{\rm tag}$ [GeV] versus events per bin.

This result agrees with the SM prediction of $F_{L,\tau}(D^*)_{\rm SM} = [...]$ at the $[...]\sigma$ level. An important control measurement is the $D^*$ polarization of the light-lepton states, $F_{L,\ell}(D^*) = [...] \pm 0.02$, which is in agreement with the prediction of $F_{L,\ell}(D^*)_{\rm BLPR\,SM} = [...]$.

V. COMMON SYSTEMATIC UNCERTAINTIES AND FUTURE PROSPECTS

The different measurements of $R(D^{(*)})$ so far are fairly independent of each other because their uncertainties are dominated by the limited size of the data and the simulation samples. However, over the next decade and a half, Belle II and LHCb will collect data samples 50 to 200 times larger than those used for the present measurements of $R(D^{(*)})$ (Table III), so the relative impact of other systematic uncertainties will increase. Some of these uncertainties are due to aspects of the experimental analysis that are shared among all measurements and can therefore lead to common systematic uncertainties. As a result, the combination of the measurements will entail a more complex treatment of these uncertainties. Table XV and the following subsections describe the main sources of systematic uncertainty in the measurement of $R(D^{(*)})$, and the level of commonality among the various approaches.

Figure 23 Measured $\cos\theta_v$ distribution in $B \to D^{*-}\tau^+\nu_\tau$ decays for the determination of the longitudinal $D^*$ polarization fraction by Belle, adapted from (Abdesselam et al., 2019). The red curve shows the best fit of the longitudinal polarization fraction and the yellow band corresponds to the SM expectation (Huang et al., 2018).

We also discuss in the following subsections the future prospects to reduce the total uncertainty in $R(D^{(*)})$, as well as on LFUV ratios in many other decay modes, down to a few percent or less. In particular, reducing the systematic uncertainties commensurately with the statistical uncertainties will require meeting key challenges in computation, the modeling of $b$-hadron semileptonic decays, and background estimation in the years to come.

A. Monte Carlo simulation samples

Table XV shows that one of the principal sources of uncertainty in the $R(D^{(*)})$ measurements arises from the limited size of the simulation samples (see the footnote below). This limitation results in large uncertainties through two different, but parallel, considerations. First, $B \to D^{(*)}l\nu$ decays have some of the largest $B$ branching fractions, necessitating very large simulation samples to acceptably model the data. Such uncertainties, however, are statistical in nature and thus independent among different experimental analyses.

Footnote: It is worth noting that, while some uncertainties are multiplicative, i.e. they scale with the resulting central value (e.g., uncertainties on the signal efficiency), the majority of the uncertainty is additive (e.g., uncertainties associated with the background subtraction or signal shapes). As a result, changes in the central values would alter the value of the uncertainty when expressed in percentage. However, given that the overall uncertainty has become smaller than 20% and the central values are starting to converge (see Fig. 25), the presentation of uncertainties in percentages should give a broadly accurate representation of the uncertainties and allow for comparisons across different measurements.

Second, semitauonic decays involve final states with multiple neutrinos, which escape detection. As a result, the reconstructed kinematic distributions employed to separate signal from background events are broad and difficult to describe analytically. Instead, experiments rely upon Monte Carlo simulation to derive the templates that are used in the signal extraction fit. Because of the broad nature of these distributions, multiple dimensions are necessary to disentangle the various contributions, which results in the simulated events being widely distributed among the numerous bins in the templates.

Of course, Monte Carlo-based uncertainties can be reduced simply by producing more simulated events. However, given the size of future data samples, it will be a challenge in both time and cost to continue producing simulated events in sufficient numbers such that these uncertainties remain controlled. Thus, different solutions will need to be considered. At present the most promising approaches are:

(i) Hardware: The High Energy Physics (HEP) community has historically relied upon the exponential increase in computing throughput for relatively stable investments. As this exponential growth slows, either greater funding will have to be found or new avenues will need to be explored to keep up. Monte Carlo simulations are highly parallelizable, which makes them a favorable target for graphics processing unit (GPU) computation. Efforts to make increasing use of GPUs are underway, and expertise and appropriate tools will have to be further developed by the HEP community to ensure the widespread adoption of GPUs and reap their benefits.

(ii) Fast simulation (FastSim): The most resource intensive step in the generation of simulated events is the simulation of the detector response. Several procedures have been developed and are in use already that accelerate this step by simulating only parts of the detectors, or parameterizing their response. New machine learning techniques such as generative adversarial networks may be able to further optimize this aspect of event simulation. See e.g. (Erdmann et al., 2019; Vallecorsa, 2018) for proof-of-concept studies of this.

(iii) Aggressive generator-level selections: These can help reduce the number of events that need to be fully simulated. Fiducial selections are already widely applied, but as data becomes abundant and the computing resources are stretched thin, analyses may have to start focusing on reduced regions of phase space with an even better signal-to-noise ratio. The generator-level selections would then have to be adjusted as closely as possible to these reduced areas to maximize the physics output of the simulation. For Belle II an attractive option to increase the size of simulated samples in analyses that use hadronic tagging would be to only generate the low branching fraction modes actually targeted by the tagging algorithms. See e.g. (Kahn, 2019) for a proof-of-concept implementation using generative adversarial networks.

It is important to note that each of these approaches alone will not be sufficient to cover all future needs. For instance, the FastSim implementations currently being employed at LHCb allow for simulated events to be produced with about ten times fewer resources than with full simulation. However, this order of magnitude improvement only covers the increased needs from Run 1 (3.1 fb$^{-1}$) to Run 2 (6 fb$^{-1}$, twice the $b\bar b$ cross section, and higher efficiency than in Run 1). Meeting the needs for the 50 ab$^{-1}$ that will be collected by Belle II, or the 300 fb$^{-1}$ by LHCb, will probably involve the combined use of the approaches listed above and perhaps others.

Table XV Summary of the uncertainties on the $R(D^{(*)})$ measurements. The "Other bkg." column includes primarily contributions from $DD$ and combinatorial backgrounds. The "Other sources" column is dominated by particle identification and external branching fraction uncertainties. The five numerical columns are systematic uncertainties in percent; the last column is the total uncertainty in percent (Syst. / Stat. / Total).

Result     Experiment   $\tau$ decay          Tag     MC stats  $D^{(*)}l\nu$  $D^{**}l\nu$  Other bkg.  Other sources  Total uncert.
$R(D)$     BABAR (a)    $\ell\nu\nu$          Had.    5.7       2.5            5.8           3.9         0.9            [...]
           Belle (b)    $\ell\nu\nu$          Semil.  4.4       0.7            0.8           1.7         3.4            [...]
           Belle (c)    $\ell\nu\nu$          Had.    4.4       3.3            4.4           0.7         0.5            [...]
$R(D^*)$   BABAR (a)    $\ell\nu\nu$          Had.    2.8       1.0            3.7           2.3         0.9            [...]
           Belle (b)    $\ell\nu\nu$          Semil.  2.3       0.3            1.4           0.5         4.7            [...]
           Belle (c)    $\ell\nu\nu$          Had.    3.6       1.3            3.4           0.7         0.5            [...]
           Belle (d)    $\pi\nu$, $\rho\nu$   Had.    3.5       2.3            2.4           8.1         2.9            [...]
           LHCb (e)     $\pi\pi\pi(\pi^0)\nu$ -       4.9       4.0            2.7           5.4         4.8            [...]
           LHCb (f)     $\mu\nu\nu$           -       6.3       2.2            2.1           5.1         2.0            [...]

(a) (Lees et al., 2012, 2013); (b) (Caria et al., 2020); (c) (Huschle et al., 2015); (d) (Hirose et al., 2018); (e) (Aaij et al., 2015c); (f) (Aaij et al., 2018b).

B. Modeling of $B \to D^{(*)}l\nu$

As discussed at length in Sec. II, the predominant theory uncertainties in the modeling of $b \to c\tau\nu$ decays arise in the description of their hadronic matrix elements. Precision parametrizations of these matrix elements are currently achieved by either data-driven model-independent approaches, such as fits to HQET-based parametrizations (Sec. II.C.2), or by lattice QCD results (Sec. II.C.4), or a combination of both. This applies to predictions both for the ground states as well as the excited states (Sec. II.E) that often dominate background contributions.
In the case of $B \to D^{(*)}\ell\nu$, these approaches have led to form factor determinations whose uncertainties only contribute at the 1–2% level in the measurements of $R(D^{(*)})$.

Especially for semitauonic analyses using the electronic or muonic $\tau$ decay channels, a reliable description of $B \to D^{(*)}\ell\nu$ semileptonic decays is a critical input, in order to control lepton cross-feed backgrounds. The hadronic $\tau$ decay analyses also rely on these light semileptonic inputs, but to a lesser extent. Finally, there is some additional uncertainty in the modeling of the detector resolution for the kinematic variables that these analyses depend upon, which can be shared across results from the same experiment.

C. $B \to D^{**}\ell\nu$ and $B \to D^{**}\tau\nu$ backgrounds

1. Systematic uncertainties evaluation and control

Excited $D^{**}$ states decay to $D^*$, $D^0$, or $D^\pm$ mesons plus additional photons or pions, which can escape detection. As a result, both $B \to D^{**}\ell\nu$ and $B \to D^{**}\tau\nu$ decays can easily lead to extraneous candidates in $R(D^{(*)})$ analyses, though the former contributes only to measurements that employ the leptonic decays of the $\tau$ lepton. In hadronic analyses, the corresponding background is formed by $B \to D^{**}D_s^{(*,**)}$ decays. While all analyses exploit dedicated $D^{**}$ control samples where some of the parameters describing these contributions are measured, a number of assumptions are shared among the various measurements, namely the form factor parameterization of the $B \to D^{**}l\nu$ decays (Sec. II.E) and the $D^{**}$ decay branching fractions.

First data-driven fits of the $B \to D^{**}$ form factors have been performed (Bernlochner and Ligeti, 2017; Bernlochner et al., 2018a), but the resulting parameters, especially for the broad states, are not yet well constrained. The chosen approach is, however, improvable. As for the $B \to D^{(*)}$ modes, data-driven predictions for $B \to D^{**}$ (Eq. (29)) are thus likely to improve in precision until they reach the naive order of $1/m_c$ contributions, i.e. a few percent, beyond which the number of nuisance parameters becomes large. Combination with future LQCD results (see e.g. (Bailas et al., 2019)), however, may permit even more precise predictions. Additionally, the $R(D^{**})$ ratios have not yet been measured, so the various experiments have relied on theoretical predictions, assigning a relatively large uncertainty. The size of this uncertainty is however arbitrary and could lead to a common underestimate of the systematic uncertainty from the $D^{**}$ feed-down (cf. Sec. V.C.2). With the latest theoretical predictions (Eq. (29)), this uncertainty should be reduced in the future.

Dedicated experimental efforts are also presently ongoing to further address these issues.
In particular:

(i) Improved measurements are anticipated for the $B \to D^{**}\ell\nu$ relative branching fractions and kinematic distributions such as the four-momentum transfer squared or further angular relations. This is especially important for the broad $D_1'$ and $D_0^*$ states, which are still poorly known compared to the narrow $D_1$ and $D_2^*$ states. Such measurements can in principle already be carried out with currently available data sets.

(ii) Measurements involving a hadronized $W \to D_s^+$, i.e. $B \to (D^{**} \to D^{(*)}\pi)D_s^+$ (Aaij et al., 2020a; LHCb Collaboration, 2020). This approach offers much better sensitivity to decays involving the wide $D^{**}$ states because the $D^{(*)}\pi$ spectrum can be cleanly measured via the sideband subtraction on the narrow $B$ mass peak. Additionally, the presence of a $D_s^+$ meson in the final state offers two unique features: (a) in contrast to decays where the virtual $W$ produces a single pion, the $q^2$ range for production of a $D_s^+$ meson is in the range of interest for semitauonic decays; and (b) the relative rates of the various $D^{**}$ states can be measured when associated with both spin-0 ($D_s$) and spin-1 ($D_s^*$) states.

(iii) The direct measurement of $B \to D^{**}\tau\nu$ decays for the narrow states $D^{**} = D_1$ or $D_2^*$. When combined with the estimated branching fractions for the narrow $D^{**}$ versus the total $D^{**}$ rate, and expectations from isospin symmetry (the feed-down is dominated by $D^{**\pm}$ states while much better experimental precision will be achieved for $D^{**0}$), these $B \to D^{**}\tau\nu$ results might be used to control the $D^{**}$ feed-down rate into the $R(D^{(*)})$ signal regions.

Significant progress can therefore be expected in the control of this important common systematic uncertainty in the near term, such that the systematic uncertainty due to $B \to D^{**}l\nu$ decays is likely to be reduced to the percent level or less.
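The sideband subtraction mentioned in item (ii) can be sketched generically as follows. This is a minimal illustration of the technique, not the LHCb procedure: it assumes the background under the $B$ mass peak is approximately linear in the $B$ candidate mass, so that events in the sidebands, scaled by the ratio of window widths, estimate the background in the signal window. The function name, the window boundaries, and the binning are all hypothetical.

```python
def sideband_subtract(b_mass, observable, signal_window, sidebands, n_bins, obs_range):
    """Background-subtracted histogram of `observable` (e.g. the D(*)pi mass).

    b_mass, observable : per-candidate values (same length)
    signal_window      : (low, high) B-mass window around the peak
    sidebands          : list of (low, high) B-mass windows away from the peak
    n_bins, obs_range  : binning of the observable
    Assumes a background that is roughly linear in the B mass, so the summed
    sidebands can be rescaled by the ratio of window widths.
    """
    sig_lo, sig_hi = signal_window
    obs_lo, obs_hi = obs_range
    bin_width = (obs_hi - obs_lo) / n_bins
    # Scale factor: signal-window width over total sideband width
    scale = (sig_hi - sig_lo) / sum(hi - lo for lo, hi in sidebands)

    yields = [0.0] * n_bins
    for m, x in zip(b_mass, observable):
        if not (obs_lo <= x < obs_hi):
            continue  # outside the histogram range
        i = min(int((x - obs_lo) / bin_width), n_bins - 1)
        if sig_lo <= m < sig_hi:
            yields[i] += 1.0      # candidate in the B-mass signal window
        elif any(lo <= m < hi for lo, hi in sidebands):
            yields[i] -= scale    # subtract the scaled sideband estimate
    return yields
```

With symmetric sidebands of the same total width as the signal window, the scale factor is one and each sideband candidate removes one unit from the corresponding bin of the observable.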
Table XVI Estimates for D∗∗ strong decay branching fractions to exclusive two body decays, and the sum of non-D∗-resonant three body decays, ΣDππ. Based on the approach of (Bernlochner and Ligeti, 2017) and measurements from (Aaij et al., 2011; Zyla et al., 2020).

Parent    Final State
          D∗π+    D∗π0    Dπ+     Dπ0     ΣDππ
D∗_2      0.26    0.13    0.40    0.20    –
D_1       0.42    0.21    –       –       0.[…]
D′_1      0.67    0.33    –       –       –
D∗_0      –       –       0.67    0.33    –

D∗∗ branching fraction assumptions in R(D(∗)) analyses

While the estimation of the normalization of the contributions from background B → D∗∗ℓν decays is largely data-driven, a number of assumptions in the various branching fractions involved can have a significant impact on the measurement of R(D(∗)). These are:

(i) B(D∗∗ → D(∗)π(π)): These branching fractions are primary inputs to all the B → D∗∗ℓν templates employed in the signal extraction fits. Using the approach of (Bernlochner and Ligeti, 2017), B(D∗∗ → D(∗)π) can be estimated by combining data for the ratios B(D∗∗ → D∗π)/B(D∗∗ → Dπ) (Zyla et al., 2020), isospin relations, and measurements of ratios of non-D∗-resonant three body D_1 and D∗_2 decays to Dπ+π− versus two body decays to D∗+π− (Aaij et al., 2011). The latter are used to estimate the total non-D∗-resonant branching fraction to all possible Dππ final states with an isospin correction factor ≃ 2. The resulting estimates for exclusive two body decays, and the sum of non-D∗-resonant three body decays, are shown in Tab. XVI. The experimental analyses, however, have used various other sets of numbers, a choice that is worth revisiting.

(ii) B(B → D∗∗ℓν): As mentioned above, the hadronic-τ measurements are not sensitive to this contribution. The leptonic-τ analyses have some sensitivity to these branching fractions, but it is small because the total contribution from B → D∗∗ℓν decays for the four D∗∗ states is floated in the various fits. Since the four contributions are combined together in the same fit template, the relative B → D∗∗ℓν branching fractions, typically taken from (Zyla et al., 2020), impact the measured R(D(∗)) values at the 0.3–0.8% level (Lees et al., 2013).

(iii) B(B → D∗∗τν): All R(D(∗)) measurements are rather sensitive to this contribution because the kinematics of the final state particles in these decays are similar to those in signal decays. Some leptonic-τ measurements tie this contribution to the fitted B → D∗∗ℓν yields via R(D∗∗) or merge it with other background contributions. The BABAR analysis (Lees et al., 2013) assumes R(D∗∗) = 0.18 for all D∗∗ states. Investigation of the Belle analyses (Caria et al., 2020; Huschle et al., 2015) suggests they assumed an average of R(D∗∗) = 0.15, while the LHCb result (Aaij et al., 2015c) uses R(D∗∗) = 0.[…]. The hadronic-τ R(D∗) measurement from LHCb (Aaij et al., 2018b) ties the B → D∗∗τν yield to be 11% of the fitted B → D∗τν yield, and further decreases the value of R(D∗) by 3% to take into account an additional contribution from B_s → D′_s τν decays.

Notably, all these assumed values for R(D∗∗) are significantly above the predicted central values (Eq. (29)), by about 50%. The impact on the measured values can be estimated from the R(D∗∗) systematic uncertainty estimated in (Lees et al., 2013). A 50% downwards variation of the assumed R(D∗∗) = 0.18 value results in R(D(∗)) increasing by 1.[…]. This would in turn increase the tension of the R(D(∗)) world average with the SM predictions by more than 0.[…]σ. For future measurements, we therefore advocate that experiments revisit their assumptions regarding the D∗∗ feed-down in light of available data-driven predictions.

D. Modeling other signal modes

Some insight into the precision of future form factor predictions, and their role in LFUV analyses, can be obtained from considering the case of B_c → J/ψτν. As can be seen in Tab. XIII, a dominant systematic uncertainty (17%) in the 2018 LHCb analysis (Aaij et al., 2018a) arose from the poorly known description of the B_c → J/ψ form factors. At the time, the prediction for R(J/ψ) was known only at the 10% level, or worse. However, very recent LQCD results for the B_c → J/ψ form factors (34) now permit percent level predictions, such that one might expect the corresponding systematic uncertainty to similarly drop by an order of magnitude in a future analysis.

With regard to Λ_b → Λ_c(∗) decays, while the ground state form factors are known to high precision already, a combination of anticipated LQCD results and future data may similarly permit the excited state form factors to be constrained at or beyond the 1/m_c level. Finally, future LQCD studies may be expected to improve predictions for B_s → D_s(∗,∗∗) form factors to a level comparable to that for B → D(∗,∗∗), well beyond the ∼20% uncertainties from flavor symmetry arguments.

E. Other background contributions

Double charm decays of the form B → D(∗,∗∗) D_s(∗,∗∗) and B → D(∗,∗∗) D(∗,∗∗) K(∗) can lead to final state topologies very similar to those of semitauonic processes, whenever the decay of one of the charm mesons mimics that of a τ lepton. Examples are D_s(∗,∗∗) → Xτν, Xπ+π−π+ or D(∗,∗∗) → Xℓν, with X referring to unreconstructed particles. Such processes are very significant background modes for R(D(∗)) measurements at LHCb, and to a somewhat lesser extent, for B-factory measurements. While several of these analyses estimate the overall double-charm contribution using data control samples, all measurements rely on averages of previously measured branching fractions of B and D decays from the Particle Data Group compilation (Zyla et al., 2020). These averages are used as an input to produce the right mixture of decay modes for background templates. Additionally, the extrapolations into the signal regions often rely on simulations whose models for the decay dynamics might not reflect the full resonance structure of such transitions. This set of assumptions can be common to several experiments.

Although a wealth of branching fraction determinations regarding these and other relevant decays have been accumulated by BESIII, BaBar, Belle, and LHCb, there are significant areas where measurements that are in principle feasible have not been carried out or are not precise enough to provide useful constraints. Instances of these are double charm decays with excited kaons in the final state or hadronic and double charm processes involving D∗∗ states. These are especially important because they cover the high q² range that has the highest signal purity in R(D(∗)) measurements. In the near future, Belle II and LHCb will provide new results of branching fractions for such decays that will alleviate the reliance on common assumptions for the various double-charm decay modes.

Additionally, more precise information about the semileptonic and π+π−π+ decays of charm mesons will be needed, which can be provided by BESIII in the near future.

F. Other systematic uncertainties

The remaining uncertainties in Table XV are dominated by particle identification and external branching fraction uncertainties. The latter are especially relevant for measurements that utilize the hadronic decays of the τ lepton. The final state for the signal decays in these measurements does not correspond to that of the B → D(∗)ℓν decays needed for the R(D(∗)) denominator and, as a result, intermediate normalization modes are employed. For instance, the current precision on the normalization decays for the τ → π−π+π−ν analysis from LHCb (Aaij et al., 2018b), B → D∗π+π−π+ and B → D∗μν as shown in Eq. (53), is limited to 3–4%, so new measurements of these branching fractions are necessary to reduce the overall uncertainty below that level. In fact, what is required is the ratio of these two quantities, which can be measured more precisely than each branching ratio separately, a measurement that Belle II may be able to perform relatively easily.
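The benefit of measuring the ratio directly can be illustrated with standard first-order error propagation. The sketch below is purely illustrative and is not taken from any analysis: the function name `ratio_rel_unc`, the 3.5% inputs (picked from within the 3–4% range quoted above), and the correlation value are all assumptions.

```python
import math


def ratio_rel_unc(rel_a: float, rel_b: float, rho: float = 0.0) -> float:
    """Relative uncertainty of a/b from first-order error propagation.

    rel_a, rel_b: relative uncertainties of numerator and denominator.
    rho: assumed correlation coefficient between the two uncertainties.
    """
    var = rel_a**2 + rel_b**2 - 2.0 * rho * rel_a * rel_b
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(var, 0.0))


# Hypothetical inputs: 3.5% on each normalization branching fraction.
independent = ratio_rel_unc(0.035, 0.035, rho=0.0)
# Hypothetical case in which most systematic effects are shared and cancel.
correlated = ratio_rel_unc(0.035, 0.035, rho=0.8)

print(f"uncorrelated inputs: {independent:.2%}")
print(f"rho = 0.8:           {correlated:.2%}")
```

With independent inputs the two uncertainties add in quadrature, so the ratio is known less well than either branching fraction. In a dedicated ratio measurement, common systematic effects cancel (modelled here by a positive correlation), and the ratio can be known better than either input.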

VI. COMBINATION AND INTERPRETATION OF THE RESULTS

The semitauonic measurements described in Sec. IV exhibit various levels of disagreement with the SM predictions. In this section, we further examine these results and explore these tensions. To briefly summarize again, at the time of the publication of this review, the following recent measurements were available (see also Table V):

1. In B → D(∗)τν decays
(a) Six measurements of R(D∗) and three of R(D). For convenience we summarize these results again in Table XVII.
(b) One measurement of the τ polarization fraction, Pτ(D∗) = −[…] ± […] +0.[…] −[…].
(c) One measurement of the D∗ longitudinal polarization fraction, F_L,τ(D∗) = 0.[…] ± […] ± […].
(d) […] q² distributions shown in Fig. 11.
2. One measurement of a b → cτν transition using B_c decays, R(J/ψ) = 0.[…] ± […] ± […].
3. One measurement of a b → uτν transition, R(π) = 1.[…] ± […].

In Sec. VI.A we dissect the R(D(∗)) results in terms of the light-lepton normalization modes, the isospin-conjugated modes, and their measured values as a function of time. Thereafter we revisit in Sec. VI.B the combination of the measured R(D(∗)) values. In particular, we discuss the role of non-trivial correlation effects on such averages and point out that with more precise measurements on the horizon these effects will need to be revisited. In Sec. VI.C we discuss the saturation of the measured inclusive rate by exclusive contributions as implied by the current world averages of R(D∗) and R(D) together with the expected B → D∗∗τν rates. Finally, Secs. VI.D and VI.E discuss the challenges in developing self-consistent new physics interpretations of the observed tensions with the SM and possible connections to the present-day FCNC anomalies, respectively.

A. Dissection of R(D(∗)) results and SM tensions

The current status of LFUV measurements versus SM predictions, and the significance of their respective tensions or agreements, is summarized in Tab. XVIII, including the current HFLAV combination of the R(D(∗)) data. For the SM predictions the arithmetic averages discussed in Section II are quoted. The individual tensions of all LFUV measurements with the SM expectations range from 0.[…] to […]σ. The combined value of R(D) and R(D∗) is in tension with the SM expectation by 3.[…]σ because of their anti-correlation. Also note that the value of Pτ(D∗) is slightly correlated with both averages.

Table XVII Summary of R(D(∗)) measurements and world averages. The hadronic-τ LHCb result (Aaij et al., 2018b) has been updated taking into account the latest HFLAV average of B(B → D∗+ℓν) = 5.[…] ± […] ± […]. The values for "Avg. (ρ̂D∗∗)" are calculated by profiling the unknown B → D∗∗ℓν correlation and obtaining ρ̂D∗∗ = −0.88 as described in Sec. VI.B.

Experiment       τ decay     Tag      R(D)     R(D∗)                ρ_tot
BABAR (a)        μνν         Had.     0.[…]    […]                  −[…]
Belle (b)        μνν         Semil.   0.[…]    […]                  −[…]
Belle (c)        μνν         Had.     0.[…]    […]                  −[…]
Belle (d)        πν, ρν      Had.     –        0.[…](+28)(−[…])     –
LHCb (e)         πππ(π)ν     –        –        0.[…]                –
LHCb (f)         μνν         –        –        0.[…]                –
Avg. (ρ̂D∗∗)                           0.[…]    […]                  −[…]
HFLAV Avg. (g)                        […]      […]                  −[…]

(a) (Lees et al., 2012, 2013) (b) (Caria et al., 2020) (c) (Huschle et al., 2015) (d) (Hirose et al., 2018) (e) (Aaij et al., 2018b) (f) (Aaij et al., 2015c) (g) (Amhis et al., 2019)

Table XVIII Current status of LFUV measurements (see Sec. IV) versus SM predictions in Sec. II, and their respective agreements or tensions. For Pτ(D∗) and F_L,τ(D∗) we show a naïve arithmetic average of the SM predictions (Tab. II) as done for R(D(∗)). For R(D(∗)) we show the world average from the HFLAV combination (Amhis et al., 2019); below the line we show for comparison the results of the R(D(∗)) world average obtained in this work (see Sec. VI.B).

Obs.          Current World Av./Data          Current SM Prediction    Significance    Combined
R(D)          0.[…] ± 0.030                   0.[…] ± 0.003            1.[…]σ          […]σ
R(D∗)         0.[…] ± 0.014                   0.[…] ± 0.005            2.[…]σ
Pτ(D∗)        −[…] ± […] +0.[…] −[…]          −[…] ± 0.011             0.[…]σ
F_L,τ(D∗)     0.[…] ± […] ± 0.04              0.[…] ± 0.006            1.[…]σ
R(J/ψ)        0.[…] ± […] ± 0.18              0.[…] ± […]              […]σ
R(π)          1.[…] ± 0.51                    0.[…] ± 0.016            0.[…]σ
----------------------------------------------------------------------------------------
R(D)          […] ± […]                       […] ± 0.003              1.[…]σ          […]σ
R(D∗)         […] ± […]                       […] ± 0.005              2.[…]σ

A subset of the existing measurements provide values of R(D(∗)) normalized to either electron or muon final states. These results present an important check because the values reported for the semitauonic ratios are typically an average for the electron and muon normalizations, assuming

{\cal R}(D^{(*)}) = {\cal R}(D^{(*)})_e = {\cal R}(D^{(*)})_\mu , (67)

with

{\cal R}(D^{(*)})_e \equiv \frac{{\cal B}(B \to D^{(*)} \tau^- \nu_\tau)}{{\cal B}(B \to D^{(*)} e^- \nu_e)} , (68)

{\cal R}(D^{(*)})_\mu \equiv \frac{{\cal B}(B \to D^{(*)} \tau^- \nu_\tau)}{{\cal B}(B \to D^{(*)} \mu^- \nu_\mu)} . (69)

LHCb only measures R(D(∗))_μ, but the B-factories have access to the electron normalization as well. Figure 24 compares R(D(∗))_e and R(D(∗))_μ, and no systematic deviation between the two ratios is observed. It should be noted that these results were released as stability checks that compare the compatibility of the electron and muon channels, not as optimized measurements of R(D(∗))_e/μ. For instance, (Franco Sevilla, 2012) does not include the full systematic uncertainties and correlation for the electron and muon R(D(∗)), so the values from the full R(D(∗)) results are used in Fig. 24, increasing the correlation to account for the larger statistical uncertainty of the R(D(∗))_e and R(D(∗))_μ results. Additionally, the double ratio

{\cal R}(D^{(*)})_{\rm light} = \frac{{\cal R}(D^{(*)})_\mu}{{\cal R}(D^{(*)})_e} = \frac{{\cal B}(B \to D^{*} e^- \nu_e)}{{\cal B}(B \to D^{*} \mu^- \nu_\mu)} , (70)

that would be obtained from dividing these results, would have unnecessarily large uncertainties because the common B(B → D(∗)τν) factor is obtained with τ → eνν decays in the case of R(D(∗))_e and τ → μνν decays for R(D(∗))_μ. A high precision measurement of R(D(∗))_light was released recently by the Belle collaboration (Waheed et al., 2019),

{\cal R}(D^{(*)})_{\rm light} = 1.[…] ± […] ± 0.03 (71)

and is compatible with unity.

Table XIX shows the results of the isospin-unconstrained fits of the BABAR R(D(∗)) analysis, exhibiting good compatibility between charged and neutral D and D∗ modes. Such measurements might be particularly interesting in the context of obtaining data-driven insight into the size of semiclassical radiative corrections, expected to enter at the subpercent level.

Table XIX Results of the isospin-unconstrained fits for the BABAR analysis (Lees et al., 2012, 2013). The first uncertainty is statistical and the second systematic.

Result       BABAR
R(D0)        0.[…] ± […] ± […]
R(D+)        0.[…] ± […] ± […]
R(D∗0)       0.[…] ± […] ± […]
R(D∗+)       0.[…] ± […] ± […]

Figure 24 Measurements of R(D(∗)), R(D(∗))_e, and R(D(∗))_μ from BABAR (Franco Sevilla, 2012) and Belle (Caria, 2019).

Another interesting comparison is to examine the measurements of R(D(∗)) as a function of time: more precise knowledge of normalization and background processes often leads to shifts. Figure 25 displays the measured value as a function of paper submission time. Notably the most precise measurements were produced recently and show overall a better agreement with the SM expectation.

Figure 25 Measurements of R(D(∗)) as a function of paper submission time. Green refers to BABAR, dark blue to Belle, light blue to LHCb, and violet to the SM predictions. Circular markers refer to hadronic tagging, triangles to semileptonic tagging, rhombuses to inclusive tagging, and squares to untagged measurements. Filled markers refer to measurements using muonic decays of the τ lepton, while hollow markers refer to hadronic decays.

B. Revisiting R(D(∗)) world averages

To further investigate the tension of the measured values of R(D(∗)) with the SM, we examine and update their averages. We note that the systematic uncertainties of all measurements have significant correlations (see Sec. V) that need to be taken into account properly.

The most important ones stem from the modeling of the B → D∗∗ℓν processes, which comprise a significant background source in all measurements to date. The manner in which the uncertainties of these background contributions are estimated varies considerably. As discussed in Sec. V.C.1, the normalization or shape uncertainties from the hadronic form factors are, in some measurements, validated or constrained by control regions. Thus, a simple correlation model will not be able to properly quantify such correlations.

One particularly important point here is the treatment of the correlations of these systematics between R(D∗) and R(D) measurements. In individual measurements that measure both quantities simultaneously, this treatment is straightforward. However, it becomes unclear how to relate systematic uncertainties between e.g. R(D) and R(D∗) in two separate measurements. To provide a concrete example, consider the BABAR measurement of R(D) (in the context of the combined R(D(∗)) determination of (Lees et al., 2012, 2013)) and the Belle measurement of R(D∗) (in the combined R(D(∗)) analysis of (Huschle et al., 2015)). In the individual measurements, the systematic uncertainty associated with B → D∗∗ℓν̄ℓ is 45% and −15% correlated between R(D) and R(D∗), respectively. From this information alone it is impossible to derive the correct correlation structure between R(D) and R(D∗) across measurements.

We further investigate the dependence of the world average on the B → D∗∗ℓν̄ℓ correlation structure across R(D) and R(D∗) measurements by parametrizing them with a single factor ρD∗∗. In Fig. 26 (left) we show the world average assuming such correlation effects are negligible (labeled as ρD∗∗ = 0) and we reproduce a world average very similar to HFLAV (Amhis et al., 2019). The numerical values, normalized to the arithmetic average of the SM predictions (cf. Tab. I in Sec. II.D.1), are

R(D)/R(D)_SM = 1.[…] ± […], (72)
R(D∗)/R(D∗)_SM = 1.[…] ± […], (73)

with an overall correlation of ρ = −0.33. In addition to the B → D∗∗ℓν̄ℓ uncertainties, the uncertainties in the leptonic τ branching fractions and the B → D(∗)ℓν FFs are fully correlated across measurements. The compatibility with the SM expectation is within 3.2 standard deviations (cf. (Amhis et al., 2019) of 3.[…]σ). Figure 26 (left) also shows the impact of setting this unknown correlation to either ρD∗∗ = 1 or ρD∗∗ = −1, resulting in compatibilities with the SM predictions of 2.[…] and […]σ.

Figure 26 Left: R(D(∗)) world averages with different assumptions for the unknown correlation ρD∗∗: The average with ρD∗∗ = 0 (light blue) is based on similar assumptions as (Amhis et al., 2019) and shows a compatibility with the SM expectation of 3.2 standard deviations taking into account the small uncertainties of the theoretical predictions; ρD∗∗ = ±1 […]; ρ̂D∗∗ = −0.88 (heather gray) with a compatibility with the SM of 3.6 standard deviations. Right: Our world average of R(D) and R(D∗) (black curves), compared to the various measurements of R(D(∗)). The unknown correlation ρD∗∗ is treated as a free, but constrained, parameter of the average (see main text for more details).

A way to treat the unknown ρD∗∗ in this type of problem is outlined in (Cowan, 2019). Instead of neglecting the value, we can incorporate it as a free parameter of the problem. One way to constrain it to the interval [−1, 1] is to assign it a double Fermi Dirac distribution with a large shape parameter, e.g. w = 50. Carrying out our average with such a setup results in

R(D)/R(D)_SM = 1.[…] ± […], (74)
R(D∗)/R(D∗)_SM = 1.[…] ± […], (75)

with ρ̂D∗∗ = −0.88 and an overall correlation of ρ = −0.40. This results in an increased tension of about 3.6σ with respect to the SM.

Although neither of these world averages is based on completely correct assumptions, they illustrate the need for future R(D(∗)) measurements to provide more detailed breakdowns of their uncertainties. It is intriguing that introducing an additional correlation structure of a systematic uncertainty can shift the agreement with the SM expectation over a range of 0.[…]σ. Table XVII lists this world average, labeled "Avg. (ρ̂D∗∗)", and the HFLAV average (Amhis et al., 2019); see also Tab. XVIII. We show this world average for R(D(∗)) compared to the various measurements in Fig. 26 (right).

C. Inclusive versus exclusive saturation

The SM prediction for the semitauonic inclusive branching ratio is

B(B → X_c τν) = 2.[…] × 10^−[…], (76)

obtained by combining the SM prediction in Eq. (38) with the data for the flavor-averaged light lepton branching ratio B(B → X_c ℓν) (Zyla et al., 2020). The saturation of the inclusive SM prediction by the sum of SM predictions for the exclusive decay modes can be explored by comparing the inclusive branching ratio to that for the sum of D(∗) and D∗∗. For simplicity, in the following we treat the uncertainties for each mode as independent. Using the HFLAV-averaged SM prediction for R(D(∗)) (Table I) together with the average branching ratio for B(B0 → D(∗)ℓν) and B(B− → D(∗)ℓν), one finds

B(B → Dτν) = 0.[…] × 10^−[…], (77a)
B(B → D∗τν) = 1.[…] × 10^−[…], (77b)

and similarly one may use the combined D∗∗ SM prediction in Eq. (30) with world averages for B(B− → D∗∗ℓν) (Bernlochner and Ligeti, 2017), yielding

Σ_{X_c ∈ D∗∗} B(B → X_c τν) = 0.[…] × 10^−[…]. (78)

Adding these contributions, one obtains the SM prediction Σ_{X_c ∈ D(∗,∗∗)} B(B → X_c τν) = 2.[…] × 10^−[…], which is compatible with, and does not saturate, the inclusive SM prediction in Eq. (76), as shown in Fig. 27.

One can characterize the degree of LFUV in the semitauonic system by comparing the inclusive SM prediction with the sum of measured branching ratios for B(B → D(∗)τν). In this case the SM prediction in Eq. (76) arises from theory inputs, and features theory uncertainties, that are independent of the inputs used for predictions of R(D(∗)) (see Sec. II.G). Figure 27 compares the inclusive SM prediction with the sum of the B → D(∗)τν branching fractions arising from the R(D(∗)) world averages, as well as with the measured inclusive b → Xτν branching fraction from LEP (Zyla et al., 2020), and the result for B → Xτν from the PhD thesis (Hasenbusch, 2018) using Belle data. One sees that the R(D(∗)) world averages already imply near-saturation of the inclusive SM prediction, while the unpublished result from the Belle data is more than 3σ in tension with it.

Footnote (double Fermi Dirac distribution): f(x, w) = 1/(2(1 + exp(w(x − […] −w(x − […]

Figure 27 Saturation of the inclusive SM prediction (red band) for B(B → X_c τν) by the sum of measured exclusive branching fractions implied by the R(D) and R(D∗) world averages (blue square). By comparison, the SM prediction for the sum of B → D(∗,∗∗)τν exclusive branching fractions (blue band) is compatible with, and does not saturate, the inclusive prediction. Also shown are: (i) the measured inclusive branching fraction for b → Xτν from LEP (Zyla et al., 2020) (open red square), normalized against the total number of tagged bb̄ events; assuming hadronization effects cancel, it can be interpreted as B(B → Xτν); and (ii) the unpublished inclusive measurement of (Hasenbusch, 2018) using Belle data (red filled square), which shows a large excess.

D. New Physics interpretations

1. Parametrization of SM tensions

The measured lepton universality ratios R(D(∗)) naively express tensions with respect to SM predictions in terms of the overall decay rates or branching ratios. As such, many phenomenological interpretations of these results simply require that any New Physics (NP) accounts for the measured ratios (or other observables such as polarization fractions) within quoted uncertainties. However, this naive approach may lead to biases in NP interpretations.

Figure 28 Top: Typical variation of experimental acceptances for the 2HDM, the leptoquark models R_2 and S_1, and a pure tensor current, normalized with respect to the SM acceptance ε_SM, for B → Dτν (blue) and B → (D∗ → Dπ)τν (red), with τ → eνν. The dotted, solid and dashed lines show the resulting acceptances for q² resolutions (see text) of 0.8, 1.2, and […] GeV², respectively. Bottom: Variation in R(D(∗))/R(D(∗))_SM for the same models.

The reason for this is that in practice, as discussed in Sec. IV, the R(D(∗)) ratios are recovered from fits in multiple reconstructed observables. In these fits, the signal B → D(∗)τν decay distributions (as well as backgrounds) are assumed to have SM shapes, i.e. their reconstructed observables are assumed to have an SM template, while their normalization is allowed to float independently. In the SM, the ratio R(D)/R(D∗) is itself tightly predicted up to small form factor uncertainties. Thus, the current experimental approach can be thought of as introducing a NP fit template that is parametrized by variation in the double ratio R(D)/R(D∗) as well as, say, the overall size of R(D∗).

Variation of R(D∗), while keeping R(D)/R(D∗) fixed to its SM prediction, is consistent with NP contributions from the c_VL Wilson coefficient. This Wilson coefficient by definition still generates SM-like distributions, so that incorporating c_VL contributions is self-consistent with the fit template assumptions from which the measured R(D(∗)) values were recovered.

However, explaining the variation in R(D)/R(D∗) from the SM prediction requires further NP contributions, which generically also alter the B → D(∗)τν signal (and some background) decay distributions and acceptances. (It is possible that there exist NP contributions which only modify the neutrino distributions. Because the experiments marginalize over missing energy, this particular NP could permit R(D)/R(D∗) to simultaneously float from the SM prediction while preserving the SM template for reconstructed observables.) These NP contributions are thus generically inconsistent with the assumed SM template in the current measurement and fit, and may affect the recovered values of R(D(∗)) themselves. As a result, while the current world average for R(D)–R(D∗) unambiguously indicates a tension with the SM, it does not a priori allow for a self-consistent NP interpretation or explanation. A self-consistent BSM measurement of any recovered observable instead requires e.g. dedicated fit templates for each BSM point of interest, which we discuss further below.

A similar tension with the SM can be established when additional observables such as asymmetries, longitudinal fractions, or polarization fractions are compared to SM predictions (see Sec. II.D.2), and there is much literature studying their in-principle NP discrimination power. However, the same caveat with regard to NP interpretations applies: NP contributions may alter the recovered values of these parameters.
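The template-mismatch argument above can be made concrete with a toy fit. The following sketch is a hypothetical illustration only: the five-bin template shapes, the yields, and the helper names `asimov` and `fit_two_yields` are all invented for this example, and an unweighted least-squares fit stands in for the binned likelihood fits used in real analyses.

```python
def asimov(n_sig, n_bkg, sig, bkg):
    """Fluctuation-free data: yield-weighted sum of two normalized templates."""
    return [n_sig * s + n_bkg * b for s, b in zip(sig, bkg)]


def fit_two_yields(data, sig, bkg):
    """Unweighted least-squares fit of data = n_sig*sig + n_bkg*bkg.

    Solves the 2x2 normal equations and returns (n_sig, n_bkg).
    """
    ss = sum(s * s for s in sig)
    bb = sum(b * b for b in bkg)
    sb = sum(s * b for s, b in zip(sig, bkg))
    sd = sum(s * d for s, d in zip(sig, data))
    bd = sum(b * d for b, d in zip(bkg, data))
    det = ss * bb - sb * sb
    return (bb * sd - sb * bd) / det, (ss * bd - sb * sd) / det


# Hypothetical normalized shapes in five bins of a reconstructed observable.
sm_signal = [0.10, 0.20, 0.30, 0.25, 0.15]
np_signal = [0.14, 0.23, 0.29, 0.22, 0.12]   # same yield, distorted shape
background = [0.40, 0.30, 0.15, 0.10, 0.05]

true_sig, true_bkg = 1000.0, 3000.0
data_sm = asimov(true_sig, true_bkg, sm_signal, background)
data_np = asimov(true_sig, true_bkg, np_signal, background)

fit_sm = fit_two_yields(data_sm, sm_signal, background)        # consistent
fit_mismatch = fit_two_yields(data_np, sm_signal, background)  # SM template, NP data
fit_matched = fit_two_yields(data_np, np_signal, background)   # dedicated NP template

print("SM data, SM template:", fit_sm)
print("NP data, SM template:", fit_mismatch)
print("NP data, NP template:", fit_matched)
```

When the data and the template share the same shape, the fit returns the true signal yield. Fitting NP-shaped data with the SM template shifts the recovered signal yield away from its true value even though the true yield is unchanged, which is the kind of bias a dedicated template for each NP point is meant to avoid.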
Figure 29 Top: Color map of the percent variation per bin in the reconstructed m²_miss normalized distribution for B → Dτν, comparing the SM to a range of couplings for the 2HDM, the leptoquark models R_2 and S_1, and a pure tensor current. Variations for B → D∗τν are similar but somewhat smaller, ranging up to the 1–2% level. Bottom: Example normalized m²_miss distributions for the SM (solid) versus NP (dashed) for B → Dτν (blue) and B → D∗τν (red). The chosen NP coupling for each model is shown by a dashed line in the corresponding top row figures.

2. Sensitivity and biases in recovered observables

To gain a sense of the size of these effects, we consider an approximate mock-up of an e+e− experimental environment and examine the variation in acceptances, ε, for B → Dτν and B → (D∗ → Dπ)τν, with τ → eνν in the presence of NP. In this mock-up, the beam energies are fixed to 7 and 4 GeV, and we require visible final state particles to fall within an angular acceptance of 20°–150°. We impose a minimum electron energy threshold of E_e > 300 MeV, and an approximate turn-on efficiency is included to account for the slow pion reconstruction efficiencies in D∗ → Dπ decays. We further include a Gaussian smearing added to the truth level q² with a width of 1.2 GeV², in order to account for detector resolution and tag-B reconstruction, and require the reconstructed q² > […].

For this mock-up, we show in Fig. 28 the ratio of the NP experimental acceptance compared to the SM, ε/ε_SM, for several different simplified models (cf. (Lees et al., 2013), which studied this effect for the Type-II 2HDM). To characterize the sensitivity to the q² cut and smearing, we also show acceptances for better and poorer q² resolutions, with widths of 0.8 and […] GeV² for the Gaussian smearing, respectively. To provide further insight into the NP variability of the differential distributions, in Fig. 29 we show the percent variation per bin in the reconstructed m²_miss normalized distribution for B → Dτν for the same set of simplified models, over the identical range of NP couplings, as well as example B → D(∗)τν distributions in the reconstructed m²_miss for particular NP coupling values.

One sees typically a few percent variation in the acceptances as well as in the differential m²_miss distribution, with up to 5% or so variations in some cases. Although this might seem small in comparison to the typical 15–20% size of currently measured LFUV in R(D(∗)), such variations are already comparable to the typical size of systematic uncertainties in current analyses, such as those shown in Table X. It is then not surprising that mismatches between SM and NP signal templates can introduce significant biases into analyses. This was observed in the BaBar analysis (Lees et al., 2013). A similar, but more detailed, mock-up analysis in an e+e− collider environment suggests biases at greater than the 4σ level may be expected to typically arise with 5 ab⁻¹ of data (Bernlochner et al., 2020a). This effect may also be important in the extraction of the CKM parameter |V_cb|, which is sensitive to the assumed form factor parametrization used to generate the fit templates.

Future semileptonic analyses may address these biases through a variety of approaches. We discuss these below in Sec. VII.B.

E. Connection to FCNCs

Measurements of the b → sℓℓ ratios R_K(∗) (Eq. (43)) in various ranges of the dilepton invariant mass have produced an indication of lepton flavor universality violation. For instance, the current world-average ratios in the range q² = m²(ℓℓ) ∈ [1.[…], […].0] GeV² are (Amhis et al., 2019)

R_K+ = 0.[…] +0.[…] −0.[…], R_K∗ = 0.[…] +0.[…] −0.[…], (79)

but are expected to be unity to the sub-percent level. The deviations are driven dominantly by LHCb measurements (Aaij et al., 2017b, 2019c). Angular analyses of B → K∗μμ decays exhibit components that are in similar tension with theoretical predictions, but subject to potentially large theory uncertainties. However, various other less precise measurements of R_K(∗) from Belle and BABAR are consistent with unity (Amhis et al., 2019); see also the recent Λ_b → pKℓℓ analysis by LHCb (Aaij et al., 2020b). As discussed in Sec. II.I, because the neutrino belongs to an electroweak doublet, non-trivial (model-dependent) connections may arise between b → cℓν and b → sℓℓ or b → sνν operators. Studies of possible connections between the R(D(∗)) and R_K(∗) anomalies thus explore common origins of NP in b → cτν versus b → sℓℓ, such as various leptoquark mediators and flavor models, that are not also excluded by other precision measurements (see Sec. II.I). See e.g. (Bhattacharya et al., 2015; Buttazzo et al., 2017; Calibbi et al., 2015; Kumar et al., 2019) for several representatives of an extensive literature.

In light of these results, it is also interesting to consider how much LFUV can be tolerated in the electron versus muon couplings from b → cℓν measurements alone. As above, the Belle direct measurement (Eq. (71)) constrains LFUV to no more than percent level deviations between the electron and muon semileptonic modes.

An additional constraint arises from exclusive measurements of |V_cb| and associated q² distributions from B → D(∗)ℓν decays. Though not a focus of this review, these measurements are presently quite sensitive to the B → D(∗) form factor parametrization: precision fits leave little room for the presence of additional form factors beyond those of the V − A interactions, because introducing such form factors would significantly distort the well-measured q² distributions for these decays. Moreover, shape fits to the electron and muon modes separately are in good agreement (see e.g. (Aubert et al., 2009; Glattauer et al., 2016; Waheed et al., 2019)). These results suggest that in the b → ceν and b → cμν systems, one can plausibly introduce NP only via V − A NP currents, and one can plausibly produce electron-muon LFUV at most at the percent level. Based on this qualitative discussion we eagerly anticipate further quantitative studies of bounds on LFUV in B → D(∗)ℓν.

VII. PROSPECTS AND OUTLOOK

As detailed in Sec. VI, the world averages for $R(D)$ and $R(D^*)$ currently exceed their SM predictions by about 14% each. While the theory uncertainties on the $R(D^{(*)})$ SM predictions are already 1–2% (see Tab. I), the uncertainties on the corresponding measurements are 5–10 times larger. If key challenges in computation, the modeling of $b$-hadron semileptonic decays, and background estimation are met in the years to come, as discussed in Sec. V, the large amount of data that LHCb and Belle II will collect over the next two decades will bring the experimental uncertainties down to the 1% level. At the present level of discrepancy with the SM, this degree of precision would nominally be sufficient to either establish an observation of LFUV or resolve the present anomalies.

However, highly significant but isolated results will arguably not be sufficient to fully establish the presence of NP in this manner, given the vast number of experimental and theoretical effects that can influence the interpretation of these indirect searches for BSM physics. Spurred on by the $R(D^{(*)})$ anomalies, a wide program of LFUV measurements and calculations, encompassing several experimental and theoretical communities across particle physics, will likely be the key to disentangling potential BSM signals from sources of uncertainty that may not be fully understood.

To this end, in this last section we discuss various aspects of this program, including: efforts underway to measure other important ratios such as $R(J/\psi)$, $R(\pi)$, $R(D^{(*)}_{(s)})$ and $R(\Lambda_c)$ (Sec. VII.A); analyses that exploit the fully differential information measured in semitauonic $b$-hadron decays to complement and enhance the sensitivity to NP (Sec. VII.B); and, should these indirect searches end up establishing the presence of NP, the role of proposed future colliders that may be able to either directly observe NP mediators or further characterize established anomalies with related measurements (Sec. VII.C).

A. Measurement of the ratios $R(H_{c,u})$

As described throughout this review, the ratios $R(H_{c,u})$ defined in Eq. (21) are powerful probes of LFUV and NP, in part because of the significant cancellation of theoretical and experimental uncertainties in the ratios. The SM predictions for $R(D^{(*)}_{(s)})$, $R(J/\psi)$, and $R(\Lambda_c)$ now have uncertainties in the 1–3% range (see Sec. II), and improvements in lattice QCD together with new experimental measurements are expected to bring these down further. Over the next two decades, LHCb and Belle II will collect enough data to reduce the statistical uncertainty on the $R(H_{c,u})$ measurements to a few percent or less. However, the systematic uncertainties on the best known ratios, $R(D^{(*)})$, are currently significantly higher than that, as shown in Tab. XV. Thus, quantifying the power of $R(H_{c,u})$ as a probe of NP after LHCb and Belle II complete their data taking rests primarily on estimating the extent to which the associated experimental systematic uncertainties can be reduced.

As detailed in Sec. V, if already ongoing theoretical and experimental efforts are sustained in the following years, the majority of the systematic uncertainty on $R(H_{c,u})$ is expected to decrease commensurately with the increasing size of the data samples being collected. For instance, the uncertainty from the background contributions will decrease as the data control samples grow, and the size of the simulated data samples will continue increasing proportionately if the power of GPUs and fast simulation algorithms is appropriately harnessed. Of course, these improvements are likely to have their own limitations, and a certain level of irreducible systematic uncertainty will be reached. Based on the considerations described in Sec. V, one may estimate that floors of $\sim\ldots$ on $R(D^{(*)})$ are achievable, while a floor of $\sim\ldots$ [...] $R(H_{c,u})$ ratios, in which the form factor parametrization cannot be measured as precisely. To illustrate the variability of these estimations, we present extrapolations for the anticipated $R(H_{c,u})$ precision that LHCb and Belle II are likely to reach under two scenarios: (i) a pessimistic scenario, with irreducible systematic uncertainties of 3% for $R(D^{(*)})$ and 5% for the other $R(H_{c,u})$ ratios; and (ii) an optimistic scenario, with uncertainty floors of 0.5% for $R(D^{(*)})$ and 2% for the other $R(H_{c,u})$ ratios. Further assumptions included in these extrapolations are detailed below.
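The role of an irreducible floor in such extrapolations can be illustrated with a minimal sketch. It assumes, as a simplification that is not necessarily the procedure used for the projections in this review, that the reducible part of the uncertainty scales as the inverse square root of the data sample and adds in quadrature with the floor. The function name, the 12% starting value, and the sample-growth factors are illustrative assumptions; the floors of 0.5% and 5% are the optimistic $R(D^{(*)})$ and pessimistic "other ratios" values quoted above.

```python
import math


def projected_uncertainty(reducible_unc: float, sample_growth: float, floor: float) -> float:
    """Total relative uncertainty (in %) after the data sample grows by `sample_growth`.

    Assumption of this sketch: the reducible part scales as 1/sqrt(N) and is
    added in quadrature with an irreducible systematic floor.
    """
    if sample_growth <= 0:
        raise ValueError("sample_growth must be positive")
    reducible = reducible_unc / math.sqrt(sample_growth)
    return math.hypot(reducible, floor)


# Hypothetical starting point: 12% reducible uncertainty with today's sample.
for label, floor in (("optimistic R(D(*)) floor", 0.5), ("pessimistic other-ratio floor", 5.0)):
    row = [round(projected_uncertainty(12.0, growth, floor), 2) for growth in (1, 10, 100, 1000)]
    print(f"{label}: {row}")
```

In this toy model the curves flatten once the reducible term falls below the floor, which is the qualitative behavior expected of the projections in both scenarios.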

1. Prospects for $R(H_{c,u})$ at LHCb

As described in Sec. III.A, the high center-of-mass energy at the LHC gives LHCb access to large samples of many $b$-hadron species. So far, LHCb has published results on $R(D^*)$ and $R(J/\psi)$ (see Sec. IV), and measurements of $R(D)$, $R(D^{**})$, $R(D_s)$, $R(D_s^*)$, $R(\Lambda_c)$, $R(\Lambda_c^*)$ as well as the non-semitauonic ratios $R(D^{(*)})_{\text{light}}$ are underway. We can project the sensitivity to some of these ratios based on the $b$-hadron samples expected in the next two decades (Tab. III), the reduction of the systematic uncertainty described above, and the following broad assumptions:

(i) $R(D^*)$: The current Run 1 results for $R(D^{*+})$ have a total uncertainty of 12%, but this value should be reduced by about a factor of $\sqrt{2}$ when $R(D^{*0})$ is also included in the measurement. This can be done by inclusively reconstructing $B^- \to D^{*0}\tau^-\nu_\tau$ decays via their feed-down to $D^0\mu^-$ samples in combined $R(D)$–$R(D^*)$ measurements. Starting in Run 2, a dedicated trigger achieved 50% higher efficiency and the $b\bar{b}$ cross section increased by a factor of around two. Another factor of two will be gained when the hardware trigger is replaced by a software-only trigger starting in the next data taking period (Run 3).

(ii) $R(D)$: The same assumptions apply as for the measurement of $R(D^*)$ in terms of triggers and the combination of $D^0$ and $D^+$, but data samples are expected to be about 50% smaller due to the difference in branching fractions and $R(D)$.

(iii) $R(D^{**})$: The projections are specifically for $R(D\ldots)$, which provides the most accessible final state. The projections are based on the expected uncertainty of about 15% for a combined analysis of Run 1 and 2 data, and include a factor of two efficiency increase starting in Run 3 thanks to the software-only trigger.

(iv) $R(D_s^{(*)})$: Data samples are expected to be about 16 times smaller than for $R(D^*)$, because of both the smaller $B_s$ production fraction and the requirement to reconstruct an additional track in the $D_s^+ \to K^+K^-\pi^+$ decay (resulting in about a factor of two lower efficiency).

(v) $R(\Lambda_c)$: Data samples are expected to be six times smaller than for $R(D^*)$, according to the smaller $\Lambda_b$ production fraction, as well as the requirement to reconstruct an additional track in the $\Lambda_c^+ \to pK^-\pi^+$ decay.

(vi) $R(\Lambda_c^*)$: The projections are specifically for $R(\Lambda_c^*(2625))$ and follow the same assumptions as for $R(\Lambda_c)$, but with 33 times smaller data samples due to the smaller $\Lambda_b \to \Lambda_c^*$ branching fraction and the lower efficiency of the $\Lambda_c^* \to \Lambda_c\pi\pi$ reconstruction. This is estimated in a preliminary LHCb study of $\Lambda_b \to \Lambda_c^{(*)}\pi\pi\pi$ events, assuming that the ratio of $\Lambda_b \to \Lambda_c^{(*)}\pi\pi\pi$ branching fractions is the same as that for $\Lambda_b \to \Lambda_c^{(*)}\tau\nu$. The projections for $R(\Lambda_c^*(2595))$ would be similar but with data samples a factor of two smaller than those for $R(\Lambda_c^*(2625))$.

(vii) $R(J/\psi)$: We scale the 2018 result based on the expected data samples.

(Footnote: These projections are for the measurements that employ the muonic decays of the $\tau$ lepton. The projections for the hadronic measurements would be similar except that the irreducible systematic uncertainty would be asymptotically higher because of the external branching fractions used to normalize the result.)

[Figure 30: two panels labeled "LHCb unofficial", Pessimistic (left) and Optimistic (right). Horizontal axis: data sample up to year, spanning Run 2 to Run 6; vertical axis: total uncertainty [%]. Curves: $R(D)$, $R(D^*)$, $R(D^{**})$, $R(D_s^{(*)})$, $R(\Lambda_c)$, $R(\Lambda_c^*)$, $R(J/\psi)$.]

Figure 30. Projections for the expected precision on the measurement of selected $R(H_c)$ ratios at LHCb as a function of the year in which the corresponding data sample becomes available. Left: pessimistic scenario for an irreducible systematic uncertainty of 3% on $R(D^{(*)})$ and 5% on the other ratios. Right: optimistic scenario for an irreducible systematic uncertainty of 0.5% on $R(D^{(*)})$ and 2% on the other ratios. These extrapolations are based on the current muonic-$\tau$ measurements of $R(D^{(*)})$ and $R(J/\psi)$, as well as the forthcoming hadronic-$\tau$ measurement of $R(D\ldots)$ for the $R(D^{**})$ curve. The $R(\Lambda_c^*)$ entry in the legend refers to $R(\Lambda_c^*(2625))$. The shaded regions correspond to the long shutdowns during which there is no data taking at the LHC and have been updated to include the latest estimates (Béjar Alonso et al., 2020).

Figure 30 shows the results of these projections. The years on the horizontal axis refer to the dates at which data samples become available; these samples would eventually yield the plotted total uncertainties once the analyses are completed. For instance, the 8.5% uncertainty on $R(D^*)$ shown at the beginning of 2015 corresponds to the eventual precision achievable for the combined measurement of $R(D^{*+})$ and $R(D^{*0})$ with the Run 1 data sample, but the analysis is not expected to be completed until 2021. These projections illustrate the enormous benefit that the data samples collected after the ongoing LHCb Upgrade I will have on the measurement of $R(H_c)$. The proposed LHCb Upgrade II, which would take place in 2031, would allow LHCb to further improve the precision on these ratios down to the 0.… [...] $B \to p\bar{p}\tau\nu$ decays is underway and LHCb also has plans to measure $\Lambda_b \to p\tau\nu$. As described in Sec. II.F, these $b \to u\tau\nu$ transitions are especially interesting because their potential NP couplings could in principle be quite different from those potentially involved in $b \to c\tau\nu$ transitions.
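The sample-size assumptions listed above for LHCb translate into rough statistical scalings, sketched below. The sketch assumes purely statistical $1/\sqrt{N}$ behavior and a hypothetical reference statistical uncertainty of 1% on $R(D^*)$; only the reduction factors (2, 16, 6, 33) and the 12% Run 1 uncertainty are taken from the text, and the names are illustrative.

```python
import math

# Approximate data-sample reduction factors relative to R(D*), as quoted in the
# assumptions above ("about 50% smaller" is taken as a factor of 2).
SAMPLE_REDUCTION = {
    "R(D*)": 1.0,
    "R(D)": 2.0,
    "R(Ds(*))": 16.0,
    "R(Lambda_c)": 6.0,
    "R(Lambda_c*(2625))": 33.0,
}


def stat_uncertainty(reference_unc: float, reduction: float) -> float:
    """Scale a statistical uncertainty to a sample that is `reduction` times smaller."""
    return reference_unc * math.sqrt(reduction)


# Combining two equally precise, independent samples (D*+ and D*0) reduces the
# uncertainty by sqrt(2): 12% becomes about 8.5%.
combined_run1 = 12.0 / math.sqrt(2.0)
print(f"combined Run 1 R(D*): {combined_run1:.1f}%")

reference = 1.0  # hypothetical 1% statistical uncertainty on R(D*)
for ratio, reduction in SAMPLE_REDUCTION.items():
    print(f"{ratio:>20s}: {stat_uncertainty(reference, reduction):.2f}%")
```

The $\sqrt{2}$ combination reproduces the 8.5% figure quoted for the combined Run 1 measurement; the remaining numbers only indicate relative statistical reach and ignore the systematic floors and trigger improvements discussed above.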

2. Prospects for $R(H_{c,u})$ at Belle II

Belle II will profit from the much cleaner environment of $B$ meson pair production in electron-positron annihilations, so that even with its smaller data samples with respect to LHCb, highly competitive results will emerge. One of the major challenges will be to retain this clean environment at high luminosities and reduce the impact of beam and other backgrounds as much as possible. In addition, several orthogonal data sets can be obtained by leveraging different analysis or tagging approaches (see Sec. III.C.1). The most important results will be:

(i) $R(D^{(*)})$ with exclusive tagging: In principle four statistically independent measurements can be carried out this way, namely with either hadronic or semileptonic tagging and with the focus on either leptonic or hadronic $\tau$-lepton decays. The results with the best control of the systematic uncertainty will be obtained from the combination of hadronic tagging and leptonic or hadronic $\tau$ decays. For these, the $B$ rest frame will be accessible and, in the case of hadronic single-prong $\tau$ decays, the $\tau$ polarization will also be accessible. These results will suffer, however, from the low overall efficiency of hadronic tagging caused by the small branching fractions of such processes.

Semileptonically tagged events will retain much higher numbers of semitauonic decays, but these will in principle suffer from higher systematic uncertainties. Nonetheless, all reconstructed particles in such signatures can still be assigned to either the signal or tag side, which will allow for reliable measurements. It is worth noting that additional energy depositions from beam background processes will lead to more challenging conditions than those of the present-day results. Further, only measurements with leptonic $\tau$ decays have been realized to date, so it will be an exciting challenge for Belle II to establish measurements with hadronic $\tau$ decays using this technique.

(ii) $R(D^{(*)})$ with inclusive or semi-inclusive tagging: Compared to hadronic or semileptonic tagging, inclusive tagging offers much higher reconstruction efficiency at the cost of higher backgrounds and lower precision in the reconstruction of $B$-frame kinematic variables. Nonetheless, such measurements will offer additional orthogonal data sets that can be analyzed. A particularly interesting option might involve the use of semi-inclusive tagging via a charmed seed meson ($D$, $D^*$, $J/\psi$, $D_s$, or $D_s^*$). Such an approach could offer more experimental control than purely inclusive tagging, while still retaining a high reconstruction efficiency. It is unclear at present how precise such measurements will be, as no detailed studies have been carried out, and we therefore do not include these in our projections.

(iii) $R(\pi/\rho/\omega)$: Belle II will have a unique opportunity to further investigate semitauonic processes involving $b \to u$ transitions. The existing search (detailed in Sec. IV.A.2) focused on charged pion final states. Interesting additional channels with higher branching fractions are decays to $\rho$ and $\omega$ mesons, although the large width of the $\rho$ is a challenge. Nonetheless, Belle II will improve the existing limits, and with a substantial data set of 10–15 ab$^{-1}$ the discovery of these decays is feasible, assuming their branching fraction is of the size of the SM expectation.

(iv) $R(D_s^{(*)})$: Belle II anticipates collecting a clean sample of $e^+e^- \to \Upsilon(5S) \to B_s^{(*)}\bar{B}_s^{(*)}$ events. The experimental methodology applied to the study of semitauonic $B$ meson decays can also be applied to these data sets. For instance, future measurements of $R(D_s^{(*)})$ can be done based on hadronic or semileptonic tagging in a similar fashion to the $R(D^{(*)})$ measurements. It is unclear, however, whether a precision can be reached that would rival LHCb, because of the much smaller number of produced $B_s$ mesons.

(v) $R(X_{(c)})$ with hadronic tagging: Belle II will further be able to produce measurements of fully inclusive or semi-inclusive semitauonic final states. These will allow measurements of $R(X_{(c)})$. We use the preliminary measurement of (Hasenbusch, 2018) to estimate the sensitivity for $R(X)$.

[Figure 31: two panels labeled "Belle II unofficial", Pessimistic (left) and Optimistic (right). Horizontal axis: data sample up to year; vertical axis: total uncertainty [%]. Curves: $R(D^*)$ (had FEI, lep $\tau$), $R(D)$ (had FEI, lep $\tau$), $R(D^*)$ (SL FEI, lep $\tau$), $R(D)$ (SL FEI, lep $\tau$), $R(D^*)$ (had FEI, had $\tau$), $R(X)$ (had FEI, lep $\tau$), $R(\pi)$ (had FEI).]

Figure 31. Projections for the expected precision on the measurements of $R(D^{(*)})$, $R(X)$, and $R(\pi)$ at Belle II as a function of the year in which the corresponding data sample becomes available. An irreducible systematic uncertainty of (left) 3% for the pessimistic scenario and (right) 0.5% for the optimistic one is assumed. The optimistic scenario also assumes a 50% increase in the reconstruction efficiency of the exclusive tagging algorithms. The shaded regions indicate years in which significant down-time is expected due to upgrades of the detector and/or the accelerator.

Figure 31 displays the expected sensitivity as a function of time. The left panel displays our pessimistic scenario based on the statistical and systematic uncertainties of existing measurements and an irreducible systematic uncertainty of 3% as described above. The right panel shows the same progression for the optimistic scenario, which includes an irreducible systematic uncertainty of 0.5% and an increase in the efficiency of the exclusive tagging algorithms of 50%. Such an improvement is not completely unexpected since novel ideas, such as the use of deep learning concepts and attention maps, have already shown promising efficiency gains in simulated events (Tsaklidis, 2020). However, it remains to be seen if such efficiency gains are also retained in the analysis of actual collision events, and if the identified events are clean enough to provide an actual gain in sensitivity. In both scenarios the uncertainties are expected to decrease with luminosity until the systematic uncertainty floor is reached.

The grey bands indicate years in which significant down-time is expected due to upgrades of the detector and/or the accelerator. In 2022, the Belle II pixel detector will be replaced with its final version, and more radiation-hard photomultipliers for the time-of-propagation detector will be integrated as well. In 2026, the Belle II interaction region will be upgraded to allow for the increase of the instantaneous luminosity to its design value: the superconducting magnets that perform the final focusing will be placed further away from the beam crossing point to reduce the chance of quenches.

Measurements of $R(D^*)$ will be somewhat more precise than $R(D)$ measurements because of their cleaner signature and lack of feed-down contributions, but in both cases a precision of 4–5% and about 3% will be reached by 2026 in the pessimistic and optimistic scenarios, respectively. Inclusive $R(D^{(*)})$ measurements and $R(D^*)$ with hadronic $\tau$ final states will reach 3.5% precision in the pessimistic scenario and below 2% in the optimistic case. All measurements, except for the ones explicitly probing $b \to u$ transitions, will reach precisions close to their irreducible systematic uncertainties by 2031.

B. Exploiting full differential information

1. Angular analyses and recovered observables

A 2–3% systematic floor for LFUV ratio measurements might be reached quickly given the high statistical power provided by the LHCb and Belle II experiments together. This, combined with the fact that the ratios $R(H_{c,u})$ are recovered observables from template fits to differential distributions, suggests that attention might increasingly turn towards other measurable properties. These include angular correlations, longitudinal and polarization fractions of the $D^*$ and $\tau$ (see Sec. II.D.2), and asymmetries.

Many such observables using angular correlations have been put forward by a wide range of phenomenological studies, in particular as a means to distinguish SM from NP interactions in $b \to c\tau\nu$ transitions. On the experimental side, the most accessible of these is the $D^*$ longitudinal fraction, $F_{L,\tau}(D^*)$, which can be easily reconstructed. As discussed in Sec. IV.D.2, Belle has already provided a preliminary measurement for this variable based on $B \to D^*\tau\nu$ decays. This result is compatible with the SM expectations within $2\sigma$. LHCb is expected to soon publish a similar analysis with slightly improved sensitivity.

The $\tau$ polarization (Sec. IV.D.1) has also been measured for the first time by Belle, using the $\tau \to \pi\nu$ single-prong decay channel, though with limited precision. Preliminary studies in LHCb have demonstrated that the measurement of the $\tau$ polarization is possible using the $\tau \to \pi^-\pi^+\pi^-\nu$ decay mode, reusing techniques developed in the LEP era that involve the optimized variable $\omega$ (Davier et al., 1993). This analysis is much more complex than the single-prong mode, in which the pion momentum in the $\tau$ rest frame acts as an in-principle perfect polarizer, because the analyzing power of the $\pi\pi\pi$ final state is comparatively small (see Eq. (28)): the analyzing power of the dominant $a_1$ resonance in $\tau \to \pi^-\pi^+\pi^-\nu$ features a numerical cancellation on-shell,

$$\alpha_{a_1} = \frac{1 - 2m_{a_1}^2/m_\tau^2}{1 + 2m_{a_1}^2/m_\tau^2} \simeq 0.02.$$

The expected LHCb sensitivity to $P_\tau(D^{(*)})$ in the three-prong mode is not yet known.

A recent study (Hill et al., 2019) has shown that LHCb may be able to reliably recover the angular coefficients describing the $B \to (D^* \to D\pi)(\tau \to h\nu)\nu$ decay, assuming a sample of around $10^{\ldots}$ signal events. A dataset of this size is expected to be available at the end of Run 3 of the LHC; first attempts along these lines may be performed using the full Run 2 dataset.
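The size of the on-shell cancellation in the analyzing power can be checked numerically. The sketch below evaluates $\alpha = (1 - 2m^2/m_\tau^2)/(1 + 2m^2/m_\tau^2)$ for a spin-1 meson of mass $m$, using approximate masses for the $\tau$, the $\rho(770)$ and the $a_1(1260)$; these mass values and all names are assumptions of the sketch rather than inputs taken from this review, and the $a_1$ is treated as a narrow on-shell state despite its large width.

```python
# Approximate masses in GeV (assumed inputs for this illustration).
M_TAU = 1.77686
M_RHO = 0.77526
M_A1 = 1.230  # broad resonance; nominal mass only


def analyzing_power_spin1(m_meson: float, m_tau: float = M_TAU) -> float:
    """Analyzing power for tau -> (spin-1 meson) nu with the meson on shell."""
    x = 2.0 * (m_meson / m_tau) ** 2
    return (1.0 - x) / (1.0 + x)


alpha_pi = 1.0  # single-prong tau -> pi nu: maximal analyzing power
alpha_rho = analyzing_power_spin1(M_RHO)
alpha_a1 = analyzing_power_spin1(M_A1)

print(f"alpha_pi  = {alpha_pi:.3f}")
print(f"alpha_rho = {alpha_rho:.3f}")
print(f"alpha_a1  = {alpha_a1:.3f}")
```

Because $2m_{a_1}^2$ is nearly equal to $m_\tau^2$, the numerator almost vanishes, which is why the three-prong mode requires the more elaborate optimized-variable treatment instead of a simple momentum analysis.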

2. Future strategies

However, as discussed in Sec. VI.D.2, mismatches between SM and NP signal templates can introduce significant biases into analyses that consider recovered observables, such that one cannot consistently determine the compatibility of the data with any particular NP model. Future semileptonic analyses may address these biases through a variety of approaches. One possibility is to attempt to carefully control the size of these biases when experiments quote their results. A different, more robust approach is for experiments to adapt their analyses such that, instead of reporting recovered observables, they perform fits directly in the multidimensional space of the NP couplings (the Wilson coefficients) themselves. This approach has the additional advantage of making it more straightforward to combine results from different experiments.

The latter approach is sometimes referred to as 'forward-folding'. A key obstacle is that generating sufficient simulated data for the SM analysis alone is challenging (see Sec. V.A); generating enough to study a space of NP models is naively computationally prohibitive. This difficulty can be resolved, however, with matrix element reweighting, which allows large MC samples to be converted from the SM to any desired NP template, or to any description of the hadronic matrix elements, without regenerating the underlying MC data. In recent years, new software tools, such as the Hammer library (Bernlochner et al., 2020b), have been developed by experimental-theory collaborations to permit fast and efficient MC reweighting of this type.

As an example, one can consider the mock-up reweighting analysis of (Bernlochner et al., 2020a), which uses the differential information in the missing invariant mass $m^2_{\text{miss}}$ and lepton momentum $|p_\ell|$, including an approximation of the effects of various backgrounds and reconstruction effects. In Fig. 32 we show the potential recovered CLs from this analysis for the (complex) NP Wilson coefficients of the $R_2$ simplified model, defined by $c_{SL} \simeq 8c_T$, compared to the 'truth' value $c_{SL}\,(= 8c_T) = 0.\ldots(\ldots i)$. This mock-up forecasts that with 5 ab$^{-1}$ of future data, one would be able to not only exclude the SM, but also recover the 'true' NP Wilson coefficient up to a mild two-fold degeneracy in its imaginary part. Because the forward-folding approach can use all differential information by construction, it may supersede approaches based on measuring recovered observables.

[Figure 32: contour plot in the plane of $\mathrm{Re}[c_{SL}]$ (horizontal axis) and $\mathrm{Im}[c_{SL}]$ (vertical axis).]

Figure 32. The 68%, 95% and 99% CL allowed regions for the $R_2$ simplified model coupling $c_{SL} = 8c_T$, fitting to an Asimov data set with $c_{SL} = 8c_T = 0.\ldots(\ldots i)$. The best fit recovered points are shown by gray dots.

C. Outlook for future colliders

If NP were to be discovered through indirect LFUV searches, future colliders could be instrumental in further characterizing the nature of the new interactions. In some scenarios, NP mediators can escape the discovery reach of the High Luminosity (HL)-LHC while still giving rise to the observation of LFUV in semitauonic $b$-hadron decays. Future hadron machines such as the FCC-hh collider (Mangano et al., 2018), presently under study at CERN, would extend the reach for direct observation of NP mediators into the multi-TeV range, covering most of these scenarios. An indirect NP observation could also be possible at FCC-hh by, e.g., detecting deviations from the inclusive $\tau\tau$ production rate predicted in the SM (Mangano et al., 2018).

High luminosity $e^+e^-$ colliders may also play a crucial role because the characteristics of $b$-hadron production on the $Z$ pole combine several of the advantages enjoyed by $B$-factory experiments with those of hadron colliders. In particular, the advantages of the former include: a very favorable ratio of $B$ production divided by total cross section (22%); a low multiplicity environment (perfect separation of the two $B$ mesons); and good knowledge of the $B$ center-of-mass frame by exploiting jet direction measurements and the peaked fragmentation function. The advantages of the latter include large production of all $b$-hadron species, and the large boost of the hadrons themselves, which allows one to more easily separate their decay products from primary fragments, and to fully reconstruct secondary and tertiary vertices.

The 'TeraZ' class of proposed $e^+e^-$ colliders, either the FCC-ee (Abada, 2019) or CEPC (Dong et al., 2018a,b), could provide enough $B$ mesons produced in this very favorable $Z$-pole environment to measure very complex decays such as $B^+ \to K^+\tau^+\tau^-$, which are very difficult to probe otherwise (Kamenik et al., 2017). A precise measurement of this branching ratio and its angular distributions would provide a critical test of LFUV in the neutral current decays involving the $\tau$ lepton. This might in turn provide evidence of a link between the LFUV hints from $R(H_{c,u})$, involving charged current decays to $\tau$ leptons, and those of $R_{K^{(*)}}$, involving neutral current decays to the first two lepton families only (see Sec. II.I). In a similar vein, rare $B_c$ decays such as $B_c \to \tau\nu$ could also be studied at a TeraZ factory (Zheng et al., 2020). A precision of 1% on this branching fraction could be reached, providing strong constraints on many NP models.

D. Parting thoughts

In this review we have provided an in-depth look into the theoretical and experimental foundations for semitauonic LFUV measurements. This comprised a detailed overview of the theoretical state-of-the-art and an extensive survey of the experimental environments and measurement methodologies at the $B$ factories and LHCb. We further reexamined the current combinations and NP interpretations of the data as well as their limitations, and the future prospects to control systematic uncertainties, all of which will be crucial for not only establishing a tension with the SM, should one exist, but also understanding the nature of the New Physics responsible for it.

Driven by the intriguing and persistent anomalies in $R(D^{(*)})$, the host of planned and ongoing measurements of lepton flavor universality violation in semitauonic $b$-hadron decays will provide new data-driven insights, if not resolutions, for these current LFUV puzzles. A golden era in flavor physics is just ahead of us.

ACKNOWLEDGMENTS

We would like to thank Hassan Jawahery and Zoltan Ligeti for their comments on the manuscript. We thank Ana Ovcharova for her help with the formatting of several plots. We thank Patrick Owen for sharing his work on the LHCb projections for $R(H_c)$ and subsequent discussions. We thank CERN for its hospitality during the initial preparation of this work. FB is supported by DFG Emmy-Noether Grant No. BE 6075/1-1 and BMBF Grant No. 05H19PDKB1. MFS is supported by the National Science Foundation under contract PHY-2012793. DJR is supported in part by the Office of High Energy Physics of the U.S. Department of Energy under contract DE-AC02-05CH11231.

REFERENCES

Aad, Georges, et al. (ATLAS) (2020), "Test of the universality of $\tau$ and $\mu$ lepton couplings in $W$-boson decays from $t\bar{t}$ events with the ATLAS detector," arXiv:2007.14040 [hep-ex].
Aaij, Roel, et al. (LHCb) (2011), "Measurements of the Branching fractions for $B_{(s)} \to D_{(s)}\pi\pi\pi$ and $\Lambda_b \to \Lambda_c^+\pi\pi\pi$," Phys. Rev. D, 092001, [Erratum: Phys. Rev. D85, 039904 (2012)], arXiv:1109.6831 [hep-ex].
Aaij, Roel, et al. (LHCb) (2015a), "LHCb detector performance," International Journal of Modern Physics A (07), 1530022.
Aaij, Roel, et al. (LHCb) (2015b), "Measurement of $B_c^+$ production in proton-proton collisions at $\sqrt{s} = 8$ TeV," Physical Review Letters (13), 10.1103/physrevlett.114.132001.
Aaij, Roel, et al. (LHCb) (2015c), "Measurement of the ratio of branching fractions $\mathcal{B}(\bar{B}^0 \to D^{*+}\tau^-\bar{\nu}_\tau)/\mathcal{B}(\bar{B}^0 \to D^{*+}\mu^-\bar{\nu}_\mu)$," Phys. Rev. Lett., 111803, [Addendum: Phys. Rev. Lett. 115, no. 15, 159901 (2015)], arXiv:1506.08614 [hep-ex].
Aaij, Roel, et al. (LHCb) (2017a), "Measurement of the $b$-quark production cross section in 7 and 13 TeV $pp$ collisions," Physical Review Letters (5), 10.1103/physrevlett.118.052002.
Aaij, Roel, et al. (LHCb) (2017b), "Test of lepton universality with $B^0 \to K^{*0}\ell^+\ell^-$ decays," JHEP, 055, arXiv:1705.05802 [hep-ex].
Aaij, Roel, et al. (LHCb) (2018a), "Measurement of the ratio of branching fractions $\mathcal{B}(B_c^+ \to J/\psi\,\tau^+\nu_\tau)/\mathcal{B}(B_c^+ \to J/\psi\,\mu^+\nu_\mu)$," Phys. Rev. Lett. (12), 121801, arXiv:1711.05623 [hep-ex].
Aaij, Roel, et al. (LHCb) (2018b), "Test of Lepton Flavor Universality by the measurement of the $B^0 \to D^{*-}\tau^+\nu_\tau$ branching fraction using three-prong $\tau$ decays," Phys. Rev. D (7), 072013, arXiv:1711.02505 [hep-ex].
Aaij, Roel, et al. (LHCb) (2019a), "Measurement of $b$ hadron fractions in 13 TeV $pp$ collisions," Phys. Rev. D, 031102.
Aaij, Roel, et al. (LHCb) (2019b), "Measurement of the relative $B^- \to D^0/D^{*0}/D^{**0}\mu^-\bar{\nu}_\mu$ branching fractions using $B^-$ mesons from $B_{s2}^{*0}$ decays," Phys. Rev. D (9), 092009, arXiv:1807.10722 [hep-ex].
Aaij, Roel, et al. (LHCb) (2019c), "Search for lepton-universality violation in $B^+ \to K^+\ell^+\ell^-$ decays," Phys. Rev. Lett. (19), 191801, arXiv:1903.09252 [hep-ex].
Aaij, Roel, et al. (LHCb) (2020a), "Determination of quantum numbers for several excited charmed mesons observed in $B^- \to D^{*+}\pi^-\pi^-$ decays," Phys. Rev. D, 032005.
Aaij, Roel, et al. (LHCb) (2020b), "Test of lepton universality with $\Lambda_b \to pK^-\ell^+\ell^-$ decays," JHEP, 040, arXiv:1912.08139 [hep-ex].
Abada, A, et al. (2019), "FCC-ee: The Lepton Collider," Eur. Phys. J. ST (2), 261–623.
Abashian, A, et al. (2002), "The Belle detector," Nucl. Instrum. Meth. A479, 117–232.
Abbiendi, G, et al. (OPAL) (2001), "Measurement of the branching ratio for the process $b \to \tau^-\bar{\nu}_\tau X$," Phys. Lett. B, 1–10, arXiv:hep-ex/0108031.
Abdesselam, A, et al. (Belle) (2017), "Precise determination of the CKM matrix element $|V_{cb}|$ with $\bar{B}^0 \to D^{*+}\ell^-\bar{\nu}_\ell$ decays with hadronic tagging at Belle," arXiv:1702.01521 [hep-ex].
Abdesselam, A, et al. (Belle) (2019), "Measurement of the $D^{*-}$ polarization in the decay $B^0 \to D^{*-}\tau^+\nu_\tau$," in [...], arXiv:1903.03102 [hep-ex].
Abe, T, et al. (Belle-II Collaboration) (2010), "Belle II technical design report," arXiv:1011.0352 [physics.ins-det].
Abreu, P, et al. (DELPHI) (2000), "Upper limit for the decay $B^- \to \tau^-\bar{\nu}_\tau$ and measurement of the $b \to \tau\bar{\nu}_\tau X$ branching ratio," Phys. Lett. B, 43–58.
Abudinén, F, et al. (Belle-II) (2020), "A calibration of the Belle II hadronic tag-side reconstruction algorithm with $B \to X\ell\nu$ decays," arXiv:2008.06096 [hep-ex].
Acciarri, M, et al. (L3) (1994), "Measurement of the inclusive $B \to \tau\nu X$ branching ratio," Phys. Lett. B, 201–208.
Acciarri, M, et al. (L3) (1996), "Measurement of the branching ratios $b \to e\nu X$, $\mu\nu X$, $\tau\nu X$ and $\nu X$," Z. Phys. C, 379–390.
Adachi, I, et al. (Belle Collaboration) (2009), "Measurement of $B \to D^{(*)}\tau\nu$ using full reconstruction tags," arXiv:0910.4301 [hep-ex].
Akai, Kazunori, Kazuro Furukawa, and Haruyo Koiso (SuperKEKB) (2018), "SuperKEKB collider," Nucl. Instrum. Meth. A, 188–199, arXiv:1809.01958 [physics.acc-ph].
Akeroyd, AG, and Chuan-Hung Chen (2017), "Constraint on the branching ratio of $B_c \to \tau\bar{\nu}$ from LEP1 and consequences for $R(D^{(*)})$ anomaly," Phys. Rev. D (7), 075011, arXiv:1708.04072 [hep-ph].
Albrecht, Johannes, Matthew John Charles, Laurent Dufour, Matthew David Needham, Chris Parkes, Giovanni Passaleva, Andreas Schopper, Eric Thomas, Vincenzo Vagnoni, Mark Richard James Williams, and Guy Wilkinson (2019), Luminosity scenarios for LHCb Upgrade II, Tech. Rep. LHCb-PUB-2019-001. CERN-LHCb-PUB-2019-001 (CERN, Geneva).
Alok, Ashutosh Kumar, Dinesh Kumar, Suman Kumbhakar, and S Uma Sankar (2017), "$D^*$ polarization as a probe to discriminate new physics in $\bar{B} \to D^*\tau\bar{\nu}$," Phys. Rev. D (11), 115038, arXiv:1606.03164 [hep-ph].
Alonso, Rodrigo, Benjamín Grinstein, and Jorge Martin Camalich (2017), "Lifetime of $B_c^-$ Constrains Explanations for Anomalies in $B \to D^{(*)}\tau\nu$," Phys. Rev. Lett. (8), 081802, arXiv:1611.06676 [hep-ph].
Altmannshofer, W, et al. (Belle-II) (2019), "The Belle II Physics Book," PTEP (12), 123C01, [Erratum: PTEP 2020, 029201 (2020)], arXiv:1808.10567 [hep-ex].
Alves, Jr, A A, et al. (LHCb) (2008), "The LHCb detector at the LHC," JINST, S08005.
Amhis, Yasmine Sara, et al. (HFLAV) (2019), "Averages of $b$-hadron, $c$-hadron, and $\tau$-lepton properties as of 2018," updated results and plots available at https://hflav.web.cern.ch/, arXiv:1909.12524 [hep-ex].
Aoki, S, et al. (Flavour Lattice Averaging Group) (2020), "FLAG Review 2019," Eur. Phys. J. C (2), 113, arXiv:1902.08191 [hep-lat].
Aubert, B, et al. (BaBar) (2008), "Observation of the semileptonic decays $B \to D^*\tau^-\bar{\nu}_\tau$ and evidence for $B \to D\tau^-\bar{\nu}_\tau$," Phys. Rev. Lett., 021801, arXiv:0709.1698 [hep-ex].
Aubert, B, et al. (BaBar) (2013), "The BABAR detector: upgrades, operation and performance," Nucl. Instrum. Meth. A729, 615–701, arXiv:1305.3560 [physics.ins-det].
Aubert, Bernard, et al. (BaBar) (2009), "Measurements of the Semileptonic Decays $\bar{B} \to D\ell\bar{\nu}_\ell$ and $\bar{B} \to D^*\ell\bar{\nu}_\ell$ Using a Global Fit to $DX\ell\bar{\nu}_\ell$ Final States," Phys. Rev. D, 012002, arXiv:0809.0828 [hep-ex].
Augusto Alves Jr, Antonio, et al. (LHCb) (2008), "The LHCb detector at the LHC," Journal of Instrumentation (08), S08005–S08005.
Bailas, G, S. Hashimoto, T. Kaneko, and J. Koponen (JLQCD) (2019), "Study of intermediate states in the inclusive semi-leptonic $B \to X_c\ell\nu$ decay structure functions," PoS LATTICE2019, 148, arXiv:2001.11678 [hep-lat].
Bailey, Jon A, et al. (Fermilab Lattice, MILC) (2015a), "$B \to \pi\ell\ell$ form factors for new-physics searches from lattice QCD," Phys. Rev. Lett. (15), 152002, arXiv:1507.01618 [hep-ph].
Bailey, Jon A, et al. (Fermilab Lattice, MILC) (2015b), "$|V_{ub}|$ from $B \to \pi\ell\nu$ decays and (2+1)-flavor lattice QCD," Phys. Rev. D (1), 014024, arXiv:1503.07839 [hep-lat].
Barate, R, et al. (ALEPH) (2001), "Measurements of $\mathcal{B}(b \to \tau\bar{\nu}_\tau X)$ and $\mathcal{B}(b \to \tau\bar{\nu}_\tau D^{*+}X)$ and upper limits on $\mathcal{B}(b \to \tau\bar{\nu}_\tau)$ and $\mathcal{B}(b \to s\nu\bar{\nu})$," Eur. Phys. J. C, 213–227, arXiv:hep-ex/0010022.
Bardhan, Debjyoti, and Diptimoy Ghosh (2019), "$B$-meson charged current anomalies: The post-Moriond 2019 status," Phys. Rev. D (1), 011701, arXiv:1904.10432 [hep-ph].
Béjar Alonso, I, O. Brüning, P. Fessia, M. Lamont, L. Rossi, L. Tavian, and M. Zerlauth (2020), High-Luminosity Large Hadron Collider (HL-LHC): Technical Design Report, CERN Yellow Reports: Monographs (CERN, Geneva).
Bernlochner, Florian U (2015), "$B \to \pi\tau\nu_\tau$ decay in the context of type II 2HDM," Phys. Rev. D (11), 115019, arXiv:1509.06938 [hep-ph].
Bernlochner, Florian U, Stephan Duell, Zoltan Ligeti, Michele Papucci, and Dean J.
Robinson (2020a), “Das ist derHAMMER: Consistent new physics interpretations ofsemileptonic decays,” arXiv:2002.00020 [hep-ph].Bernlochner, Florian U, and Zoltan Ligeti (2017), “Semilep-tonic B ( s ) decays to excited charmed mesons with e, µ, τ and searching for new physics with R ( D ∗∗ ),” Phys. Rev. D (1), 014022, arXiv:1606.09300 [hep-ph].Bernlochner, Florian U, Zoltan Ligeti, Michele Papucci,and Dean J. Robinson (2017), “Combined analysis ofsemileptonic B decays to D and D ∗ : R ( D ( ∗ ) ), | V cb | , andnew physics,” Phys. Rev. D (11), 115008, [Erratum:Phys.Rev.D 97, 059902 (2018)], arXiv:1703.05330 [hep-ph].Bernlochner, Florian U, Zoltan Ligeti, and Dean J. Robinson(2018a), “Model independent analysis of semileptonic B decays to D ∗∗ for arbitrary new physics,” Phys. Rev. D (7), 075011, arXiv:1711.03110 [hep-ph].Bernlochner, Florian U, Zoltan Ligeti, Dean J. Robinson, andWilliam L. Sutcliﬀe (2018b), “New predictions for Λ b → Λ c semileptonic decays and tests of heavy quark symmetry,”Phys. Rev. Lett. (20), 202001, arXiv:1808.09464 [hep-ph].Bernlochner, Florian Urs, Stephan Duell, Zoltan Ligeti,Michele Papucci, and Dean R Robinson (2020b), “HAM-MER - Helicity Amplitude Module for Matrix ElementReweighting,” .Bevan, A J, B. Golob, Th. Mannel, S. Prell, B. D. Yabsley,H. Aihara, F. Anulli, N. Arnaud, T. Aushev, M. Beneke,and et al. (2014), “The physics of the b factories,” The Eu-ropean Physical Journal C (11), 10.1140/epjc/s10052-014-3026-9.Bharucha, Aoife, David M. Straub, and Roman Zwicky (2016), “ B → V (cid:96) + (cid:96) − in the Standard Model from light-cone sum rules,” JHEP , 098, arXiv:1503.05534 [hep-ph].Bhattacharya, Bhubanjyoti, Alakabha Datta, David London,and Shanmuka Shivashankara (2015), “Simultaneous Ex-planation of the R K and R ( D ( ∗ ) ) Puzzles,” Phys. Lett. B , 370–374, arXiv:1412.7164 [hep-ph].Bigi, Dante, and Paolo Gambino (2016), “Revisiting B → D(cid:96)ν ,” Phys. Rev. 
D (9), 094008, arXiv:1606.08030 [hep-ph].Bigi, Dante, Paolo Gambino, and Stefan Schacht (2017),“ R ( D ∗ ), | V cb | , and the Heavy Quark Symmetry relationsbetween form factors,” JHEP , 061, arXiv:1707.09509[hep-ph].Biswas, Sandip, and Kirill Melnikov (2010), “Second orderQCD corrections to inclusive semileptonic B → X c (cid:96) ¯ ν (cid:96) de-cays with massless and massive lepton,” JHEP , 089,arXiv:0911.4142 [hep-ph].Boeckh, Tobias (2020), B-tagging with Deep Neural Networks ,Master’s thesis (Karlsruhe Institute of Technology (KIT)).B¨oer, Philipp, Marzia Bordone, Elena Graverini, PatrickOwen, Marcello Rotondo, and Danny Van Dyk (2018),“Testing lepton ﬂavour universality in semileptonic Λ b → Λ ∗ c decays,” JHEP , 155, arXiv:1801.08367 [hep-ph].Bordone, Marzia, Nico Gubernari, Martin Jung, and Dannyvan Dyk (2020), “Heavy-Quark Expansion for ¯ B s → D ( ∗ ) s Form Factors and Unitarity Bounds beyond the SU (3) F Limit,” Eur. Phys. J. C (4), 347, arXiv:1912.09335 [hep-ph].Bourrely, Claude, Irinel Caprini, and Laurent Lellouch(2009), “Model-independent description of B → π(cid:96) ¯ ν (cid:96) decays and a determination of | V ub | ,” Phys. Rev. D , 013008, [Erratum: Phys.Rev.D 82, 099902 (2010)],arXiv:0807.2722 [hep-ph].Boyd, CGlenn, Benjamin Grinstein, and Richard F. Lebed(1996), “Model independent determinations of ¯ B → D ∗ (cid:96) ¯ ν (cid:96) form-factors,” Nucl. Phys. B , 493–511, arXiv:hep-ph/9508211.Boyd, CGlenn, Benjamin Grinstein, and Richard F. Lebed(1997), “Precision corrections to dispersive bounds onform-factors,” Phys. Rev. D , 6895–6911, arXiv:hep-ph/9705252.Bozek, A, et al. (Belle) (2010), “Observation of B + → ¯ D ∗ τ + ν τ and Evidence for B + → ¯ D τ + ν τ at Belle,”Phys. Rev. 
D , 072005, arXiv:1005.2302 [hep-ex].Buttazzo, Dario, Admir Greljo, Gino Isidori, and David Mar-zocca (2017), “B-physics anomalies: a guide to combinedexplanations,” JHEP , 044, arXiv:1706.07808 [hep-ph].Calibbi, Lorenzo, Andreas Crivellin, and Toshihiko Ota(2015), “Eﬀective Field Theory Approach to b → s(cid:96)(cid:96) ( (cid:48) ) , B → K ( ∗ ) νν and B → D ( ∗ ) τ ν with Third Generation Cou-plings,” Phys. Rev. Lett. , 181801, arXiv:1506.02661[hep-ph].Caprini, Irinel, Laurent Lellouch, and Matthias Neubert(1998), “Dispersive bounds on the shape of ¯ B → D ( ∗ ) (cid:96) ¯ ν form-factors,” Nucl. Phys. B530 , 153–181, arXiv:hep-ph/9712417 [hep-ph].Caria, G, et al. (Belle) (2020), “Measurement of R ( D ) and R ( D ∗ ) with a semileptonic tagging method,” Phys. Rev.Lett. (16), 161803, arXiv:1910.05864 [hep-ex].Caria, Giacomo (2019), “Measurement of R ( D ) and R ( D ∗ )with a semileptonic tag at the Belle experiment,” PhD The-sis, University of Melbourne .Cerri, A, et al. (2019), “Report from Working Group 4: eferences Opportunities in Flavour Physics at the HL-LHC andHE-LHC,” CERN Yellow Rep. Monogr. , 867–1158,arXiv:1812.07638 [hep-ph].Ciezarek, Gregory, Manuel Franco Sevilla, Brian Hamilton,Robert Kowalewski, Thomas Kuhr, Vera L¨uth, and Yu-taro Sato (2017), “A Challenge to Lepton Universality inB Meson Decays,” Nature , 227–233, arXiv:1703.01766[hep-ex].Cohen, Thomas D, Henry Lamm, and Richard F. Lebed(2019), “Precision Model-Independent Bounds from GlobalAnalysis of b → c(cid:96)ν Form Factors,” Phys. Rev. D (9),094503, arXiv:1909.10691 [hep-ph].Colquhoun, B, C.T.H. Davies, R.J. Dowdall, J. Kettle,J. Koponen, G.P. Lepage, and A.T. Lytle (HPQCD)(2015), “B-meson decay constants: a more complete pic-ture from full lattice QCD,” Phys. Rev. 
D (11), 114509,arXiv:1503.05762 [hep-lat].Colquhoun, Brian, Christine Davies, Jonna Koponen, AndrewLytle, and Craig McNeile (HPQCD) (2016), “ B c decaysfrom highly improved staggered quarks and NRQCD,” PoS LATTICE2016 , 281, arXiv:1611.01987 [hep-lat].Cowan, Glen (2019), “Statistical models with uncertain errorparameters,” Eur. Phys. J. C (2), 133, arXiv:1809.05778[physics.data-an].Davier, M, L. Duﬂot, F. Le Diberder, and A. Rouge (1993),“The Optimal method for the measurement of tau polar-ization,” Phys. Lett. B , 411–417.Detmold, William, Christoph Lehner, and Stefan Meinel(2015), “Λ b → p(cid:96) − ¯ ν (cid:96) and Λ b → Λ c (cid:96) − ¯ ν (cid:96) form factors fromlattice QCD with relativistic heavy quarks,” Phys. Rev. D (3), 034503, arXiv:1503.01421 [hep-lat].Dong, Mingyi, et al. (The CEPC Study Group) (2018a),“CEPC Conceptual Design Report: Volume 1 - Acceler-ator,” arXiv:1809.00285 [physics.acc-ph].Dong, Mingyi, et al. (The CEPC Study Group) (2018b),“CEPC Conceptual Design Report: Volume 2 - Physics& Detector,” arXiv:1811.10545 [hep-ex].Ecker, G, J. Gasser, H. Leutwyler, A. Pich, and E. de Rafael(1989a), “Chiral Lagrangians for Massive Spin 1 Fields,”Phys. Lett. B , 425–432.Ecker, G, J. Gasser, A. Pich, and E. de Rafael (1989b), “TheRole of Resonances in Chiral Perturbation Theory,” Nucl.Phys. B , 311–342.Eichten, Estia, and Brian Russell Hill (1990), “An EﬀectiveField Theory for the Calculation of Matrix Elements In-volving Heavy Quarks,” Phys. Lett. B234 , 511–516.Erdmann, Martin, Jonas Glombitza, and Thorben Quast(2019), “Precise simulation of electromagnetic calorimetershowers using a Wasserstein Generative Adversarial Net-work,” Comput. Softw. Big Sci. (1), 4, arXiv:1807.01954[physics.ins-det].Fajfer, S, J. F. Kamenik, and I. Nisandzic (2012), “On the B → D ∗ τ ¯ ν τ sensitivity to new physics,” Phys. Rev. D ,094025, arXiv:1203.2654 [hep-ph].Feindt, M, et al. 
(2011), “A hierarchical neuroBayes-based al-gorithm for full reconstruction of B mesons at B factories,”Nucl. Instrum. Meth. A654 , 432–440, arXiv:1102.3876[hep-ex].Franco Sevilla, Manuel (2012), “Evidence for an excess of B → D ( ∗ ) τ ν decays,” PhD Thesis, Stanford University .Freytsis, M, Z. Ligeti, and J. T. Ruderman (2015), “Fla-vor models for ¯ B → D ( ∗ ) τ ¯ ν ,” Phys. Rev. D , 054018,arXiv:1506.08896 [hep-ph]. Georgi, Howard (1990), “An Eﬀective Field Theory for HeavyQuarks at Low-energies,” Phys. Lett. B240 , 447–450.Glattauer, R, et al. (Belle) (2016), “Measurement of the decay B → D(cid:96)ν (cid:96) in fully reconstructed events and determinationof the Cabibbo-Kobayashi-Maskawa matrix element | V cb | ,”Phys. Rev. D (3), 032006, arXiv:1510.03657 [hep-ex].Gonz´alez-Alonso, Mart´ın, Jorge Martin Camalich, and KinMimouni (2017), “Renormalization-group evolution of newphysics contributions to (semi)leptonic meson decays,”Phys. Lett. B , 777–785, arXiv:1706.00410 [hep-ph].Greljo, Admir, Jorge Martin Camalich, and Jos´e DavidRuiz- ´Alvarez (2019), “Mono- τ Signatures at the LHC Con-strain Explanations of B -decay Anomalies,” Phys. Rev.Lett. (13), 131803, arXiv:1811.07920 [hep-ph].Greljo, Admir, and David Marzocca (2017), “High- p T dilep-ton tails and ﬂavor physics,” Eur. Phys. J. C (8), 548,arXiv:1704.09015 [hep-ph].Hamer, P, et al. (Belle) (2016), “Search for B → π − τ + ν τ with hadronic tagging at Belle,” Phys. Rev. D93 (3),032007, arXiv:1509.06521 [hep-ex].Harrison, Judd, Christine T.H. Davies, and Andrew Lytle(2020a), “ B c → J/ψ

Form Factors for the full q rangefrom Lattice QCD,” arXiv:2007.06957 [hep-lat].Harrison, Judd, Christine T.H. Davies, and Andrew Ly-tle (LATTICE-HPQCD) (2020b), “ R ( J/ψ ) and B − c → J/ψ(cid:96) − ¯ ν (cid:96) Lepton Flavor Universality Violating Observablesfrom Lattice QCD,” arXiv:2007.06956 [hep-lat].Hasenbusch, Jan (2018), “Analysis of inclusive semileptonicb meson decays with τ lepton ﬁnal states at the belle ex-periment,” PhD Thesis, University of Bonn .Herb, SW, et al. (1977), “Observation of a Dimuon Resonanceat 9.5-GeV in 400-GeV Proton-Nucleus Collisions,” Phys.Rev. Lett. , 252–255.Hill, Donal, Malcolm John, Wenqi Ke, and Anton Poluektov(2019), “Model-independent method for measuring the an-gular coeﬃcients of B → D ∗ τ ν decays,” Journal of HighEnergy Physics (11), 10.1007/jhep11(2019)133.Hirose, S, et al. (Belle) (2017), “Measurement of the τ leptonpolarization and R ( D ∗ ) in the decay ¯ B → D ∗ τ − ¯ ν τ ,” Phys.Rev. Lett. (21), 211801, arXiv:1612.00529 [hep-ex].Hirose, S, et al. (Belle) (2018), “Measurement of the τ leptonpolarization and R ( D ∗ ) in the decay ¯ B → D ∗ τ − ¯ ν τ withone-prong hadronic τ decays at Belle,” Phys. Rev. D (1),012004, arXiv:1709.00129 [hep-ex].Huang, Zhuo-Ran, Ying Li, Cai-Dian Lu, M. Ali Paracha,and Chao Wang (2018), “Footprints of New Physics in b → cτ ν Transitions,” Phys. Rev. D (9), 095018,arXiv:1808.03565 [hep-ph].Huschle, M, et al. (Belle) (2015), “Measurement of thebranching ratio of ¯ B → D ( ∗ ) τ − ¯ ν τ relative to ¯ B → D ( ∗ ) (cid:96) − ¯ ν (cid:96) decays with hadronic tagging at Belle,” Phys. Rev. D ,072014, arXiv:1507.03233 [hep-ex].Huschle, Matthias (2015), “Measurement of the branching ra-tio of B → D ( ∗ ) τ ν τ relative to B → D ( ∗ ) (cid:96)ν (cid:96) decays withhadronic tagging at Belle,” PhD Thesis, Karlsruhe Insti-tute of Technology (KIT) .Isgur, Nathan, Daryl Scora, Benjamin Grinstein, andMark B. Wise (1989), “Semileptonic B and D Decays inthe Quark Model,” Phys. Rev. 
D39 , 799–818.Isgur, Nathan, and Mark B. Wise (1989), “Weak Decays ofHeavy Mesons in the Static Quark Approximation,” Phys.Lett.

B232 , 113–117.Isgur, Nathan, and Mark B. Wise (1990), “Weak Transition eferences Form-factors Between Heavy Mesons,” Phys. Lett.

B237 ,527–530.Jaiswal, Sneha, Soumitra Nandi, and Sunando Kumar Patra(2017), “Extraction of | V cb | from B → D ( ∗ ) (cid:96)ν (cid:96) and theStandard Model predictions of R ( D ( ∗ ) ),” JHEP , 060,arXiv:1707.09977 [hep-ph].Jaiswal, Sneha, Soumitra Nandi, and Sunando Kumar Patra(2020), “Updates on SM predictions of | V cb | and R ( D ∗ ) in B → D ∗ (cid:96)ν (cid:96) decays,” arXiv:2002.05726 [hep-ph].Jung, Martin, and David M. Straub (2019), “Constrain-ing new physics in b → c(cid:96)ν transitions,” JHEP , 009,arXiv:1801.01112 [hep-ph].Kahn, James (2019), “Hadronic tag sensitivity study of b → k ( ∗ ) ν ¯ ν and selective background monte carlo simulationat belle ii,” PhD Thesis, Ludwig-Maximilians-Universit¨atM¨unchen .Kamenik, J F, S. Monteil, A. Semkiv, and L. Vale Silva(2017), “Lepton polarization asymmetries in rare semi-tauonic b → s exclusive decays at fcc-ee,” The EuropeanPhysical Journal C (10), 10.1140/epjc/s10052-017-5272-0.Keck, T, et al. (2019), “The Full Event Interpretation,” Com-put. Softw. Big Sci. (1), 6, arXiv:1807.08680 [hep-ex].Keck, Thomas (2017), Machine learning algorithms for theBelle II experiment and their validation on Belle data ,Ph.D. thesis (Karlsruhe Institute of Technology (KIT)).Kumar, Jacky, David London, and Ryoutaro Watanabe(2019), “Combined Explanations of the b → sµ + µ − and b → cτ − ¯ ν Anomalies: a General Model Analysis,” Phys.Rev. D (1), 015007, arXiv:1806.07403 [hep-ph].Lees, J P, et al. (BaBar) (2012), “Evidence for an excess of¯ B → D ( ∗ ) τ − ¯ ν τ decays,” Phys. Rev. Lett. , 101802,arXiv:1205.5442 [hep-ex].Lees, J P, et al. (BaBar) (2013), “Measurement of an excess of¯ B → D ( ∗ ) τ − ¯ ν τ decays and Implications for charged Higgsbosons,” Phys. Rev. D , 072012, arXiv:1303.0571 [hep-ex].Leibovich, Adam K, Zoltan Ligeti, Iain W. Stewart, andMark B. Wise (1997), “Predictions for B → D (2420) (cid:96) ¯ ν and B → D ∗ (2460) (cid:96) ¯ ν at order Λ QCD /m c,b ,” Phys. Rev.Lett. 
, 3995–3998, arXiv:hep-ph/9703213.Leibovich, Adam K, Zoltan Ligeti, Iain W. Stewart, andMark B. Wise (1998), “Semileptonic B decays to excitedcharmed mesons,” Phys. Rev. D , 308–330, arXiv:hep-ph/9705467.Leibovich, Adam K, and Iain W. Stewart (1998), “Semilep-tonic Lambda(b) decay to excited Lambda(c) baryons atorder Lambda(QCD) / m(Q),” Phys. Rev. D , 5620–5631, arXiv:hep-ph/9711257.LHCb Collaboration, (2020), “Amplitude analysis of B + → D ∗− π + D + s ,” In preparation.Li, Xin-Qiang, Ya-Dong Yang, and Xin Zhang (2016), “Re-visiting the one leptoquark solution to the R(D ( ∗ ) ) anoma-lies and its phenomenological implications,” JHEP , 054,arXiv:1605.09308 [hep-ph].Ligeti, Zoltan, Yosef Nir, and Matthias Neubert (1994), “TheSubleading Isgur-Wise form-factor ξ ( v · v (cid:48) ) and its implica-tions for the decays ¯ B → D ∗ (cid:96) ¯ ν (cid:96) ,” Phys. Rev. D49 , 1302–1309, arXiv:hep-ph/9305304 [hep-ph].Ligeti, Zoltan, and Frank J. Tackmann (2014), “Precise pre-dictions for B → X c τ ¯ ν decay distributions,” Phys. Rev. D (3), 034021, arXiv:1406.7013 [hep-ph].Mangano, Michelangelo, et al. (2018), “FCC Physics Oppor- tunities: Future Circular Collider Conceptual Design Re-port Volume 1. Future Circular Collider,” (CERN-ACC-2018-0056. 6), 10.1140/epjc/s10052-019-6904-3.Matyja, A, et al. (Belle) (2007), “Observation of B0 —¿ D*-tau+ nu(tau) decay at Belle,” Phys. Rev. Lett. , 191807,arXiv:0706.4429 [hep-ex].McLean, E, C.T.H. Davies, J. Koponen, and A.T. Lytle(2020), “ B s → D s (cid:96)ν Form Factors for the full q rangefrom Lattice QCD with non-perturbatively normalized cur-rents,” Phys. Rev. D (7), 074513, arXiv:1906.00701[hep-lat].McLean, E, C.T.H. Davies, A.T. Lytle, and J. Koponen(2019), “Lattice QCD form factor for B s → D ∗ s lν at zero re-coil with non-perturbative current renormalisation,” Phys.Rev. D (11), 114512, arXiv:1904.02046 [hep-lat].Neubert, Matthias (1994), “Heavy quark symmetry,” Phys.Rept. 
, 259–396, arXiv:hep-ph/9306320.Neubert, Matthias, Zoltan Ligeti, and Yosef Nir (1993a),“QCD sum rule analysis of the subleading Isgur-Wise form-factor χ ( v · v (cid:48) ),” Phys. Lett. B301 , 101–107, arXiv:hep-ph/9209271 [hep-ph].Neubert, Matthias, Zoltan Ligeti, and Yosef Nir (1993b),“The Subleading Isgur-Wise form-factor χ ( v · v (cid:48) ) to orderalpha-s in QCD sum rules,” Phys. Rev. D47 , 5060–5066,arXiv:hep-ph/9212266 [hep-ph].Nugent, I M, T. Przedzi´nski, P. Roig, O. Shekhovtsova, andZ. Wa¸s (2013), “Resonance chiral lagrangian currents andexperimental data for τ − → π − π − π + ν τ ,” Phys. Rev. D , 093012.Penalva, N, E. Hern´andez, and J. Nieves (2020), “ ¯ B c → η c ,¯ B c → J/ψ and ¯ B → D ( ∗ ) semileptonic decays includingnew physics,” arXiv:2007.12590 [hep-ph].Pervin, Muslema, Winston Roberts, and Simon Capstick(2005), “Semileptonic decays of heavy lambda baryons ina quark model,” Phys. Rev. C , 035201, arXiv:nucl-th/0503030.Prim, Markus Tobias, Florian Urs Bernlochner, and Dean J.Robinson (2020), “Precision predictions for B → ρτ ν τ and B → ωτ ν τ in the SM and beyond,” PoS EPS-HEP2019 ,250, arXiv:2001.06170 [hep-ph].Sakaki, Y, A. Tanaka, M. Tayduganov, and R. Watan-abe (2013), “Testing leptoquark models in ¯ B → D ( ∗ ) τ ¯ ν ,”Phys. Rev. D , 094012, arXiv:1309.0301 [hep-ph].Sato, Y, et al. (Belle) (2016), “Measurement of the branchingratio of ¯ B → D ∗ + τ − ¯ ν τ relative to ¯ B → D ∗ + (cid:96) − ¯ ν (cid:96) de-cays with semileptonic tagging,” Phys. Rev. D , 072007,arXiv:1607.07923 [hep-ex].Scora, Daryl, and Nathan Isgur (1995), “Semileptonic mesondecays in the quark model: An update,” Phys. Rev. D52 ,2783–2812, arXiv:hep-ph/9503486 [hep-ph].Shekhovtsova, O, T. Przedzinski, P. Roig, and Z. Was (2012),“Resonance chiral Lagrangian currents and τ decay MonteCarlo,” Phys. Rev. D , 113008, arXiv:1203.3955 [hep-ph].Sirlin, A (1982), “Large m(W), m(Z) Behavior of the O(alpha)Corrections to Semileptonic Processes Mediated by W,”Nucl. Phys. 
B , 83–92.Sj¨ostrand, Torbj¨orn, Stefan Ask, Jesper R. Christiansen,Richard Corke, Nishita Desai, Philip Ilten, StephenMrenna, Stefan Prestel, Christine O. Rasmussen, and Pe-ter Z. Skands (2015), “An introduction to PYTHIA 8.2,”Computer Physics Communications , 159–177.Tanaka, Minoru, and Ryoutaro Watanabe (2013), “Newphysics in the weak interaction of B → D ( ∗ ) τ ν ,” Phys. eferences Rev. D , 034028.Tsaklidis, Ilias (2020), “Demonstrating learned particle de-cay reconstruction using graph neural networks at belle ii,”MSc Thesis, Strasbourg, Universit`e de Strasbourg .Vallecorsa, S (2018), “Generative models for fast simulation,”Journal of Physics: Conference Series , 022005.Waheed, E, et al. (Belle) (2019), “Measurement of the CKMmatrix element | V cb | from B → D ∗− (cid:96) + ν (cid:96) at Belle,” Phys. Rev. D (5), 052007, arXiv:1809.03290 [hep-ex].Zheng, Taifan, Ji Xu, Lu Cao, Dan Yu, Wei Wang, SoerenPrell, Yeuk-Kwan E. Cheung, and Manqi Ruan (2020),“Analysis of B c → τ ν τ at CEPC,” arXiv:2007.08234 [hep-ex].Zyla, PA, et al. (Particle Data Group) (2020), “Review ofParticle Physics,” PTEP2020