Search for B+ -> l+ nu gamma decays with hadronic tagging using the full Belle data sample
A. Heller, P. Goldenzweig, M. Heck, T. Kuhr, A. Zupanc, A. Abdesselam, I. Adachi, K. Adamczyk, H. Aihara, K. Arinstein, D. M. Asner, V. Aulchenko, T. Aushev, R. Ayad, V. Babu, I. Badhrees, A. M. Bakich, E. Barberio, V. Bhardwaj, B. Bhuyan, A. Bondar, G. Bonvicini, A. Bozek, M. Bračko, T. E. Browder, D. Červenkov, A. Chen, B. G. Cheon, K. Cho, V. Chobanova, Y. Choi, D. Cinabro, M. Danilov, Z. Doležal, Z. Drásal, A. Drutskoy, D. Dutta, S. Eidelman, H. Farhat, J. E. Fast, M. Feindt, T. Ferber, B. G. Fulsom, V. Gaur, N. Gabyshev, A. Garmash, D. Getzkow, R. Gillard, R. Glattauer, Y. M. Goh, B. Golob, J. Haba, K. Hayasaka, X. H. He, T. Horiguchi, W.-S. Hou, T. Iijima, K. Inami, A. Ishikawa, R. Itoh, Y. Iwasaki, I. Jaegle, D. Joffe, E. Kato, P. Katrenko, T. Kawasaki, D. Y. Kim, H. J. Kim, J. B. Kim, J. H. Kim, K. T. Kim, S. H. Kim, K. Kinoshita, B. R. Ko, P. Kodyš, S. Korpar, P. Križan, P. Krokovny, T. Kumita, A. Kuzmin, Y.-J. Kwon, J. S. Lange, I. S. Lee, P. Lewis, J. Libby, P. Lukin, D. Matvienko, K. Miyabayashi, H. Miyata, R. Mizuk, G. B. Mohanty, A. Moll, H. K. Moon, R. Mussa, E. Nakano, M. Nakao, T. Nanut, Z. Natkaniec, M. Nayak, N. K. Nisar, et al. (64 additional authors not shown)
Belle Preprint 2015-6, KEK Preprint 2015-3
A. Heller, P. Goldenzweig, M. Heck, T. Kuhr, A. Zupanc, A. Abdesselam, I. Adachi, K. Adamczyk, H. Aihara, K. Arinstein, D. M. Asner, V. Aulchenko, T. Aushev, R. Ayad, V. Babu, I. Badhrees, A. M. Bakich, E. Barberio, V. Bhardwaj, B. Bhuyan, J. Biswal, A. Bondar, G. Bonvicini, A. Bozek, M. Bračko, T. E. Browder, D. Červenkov, A. Chen, B. G. Cheon, K. Cho, V. Chobanova, Y. Choi, D. Cinabro, M. Danilov, Z. Doležal, Z. Drásal, A. Drutskoy, D. Dutta, S. Eidelman, H. Farhat, J. E. Fast, M. Feindt, T. Ferber, B. G. Fulsom, V. Gaur, N. Gabyshev, A. Garmash, D. Getzkow, R. Gillard, R. Glattauer, Y. M. Goh, B. Golob, J. Grygier, J. Haba, K. Hayasaka, X. H. He, M. Heider, T. Horiguchi, W.-S. Hou, M. Huschle, T. Iijima, K. Inami, A. Ishikawa, R. Itoh, Y. Iwasaki, I. Jaegle, D. Joffe, E. Kato, P. Katrenko, T. Kawasaki, D. Y. Kim, H. J. Kim, J. B. Kim, J. H. Kim, K. T. Kim, S. H. Kim, K. Kinoshita, B. R. Ko, P. Kodyš, S. Korpar, P. Križan, P. Krokovny, T. Kumita, A. Kuzmin, Y.-J. Kwon, J. S. Lange, I. S. Lee, P. Lewis, J. Libby, P. Lukin, D. Matvienko, K. Miyabayashi, H. Miyata, R. Mizuk, G. B. Mohanty, A. Moll, H. K. Moon, R. Mussa, E. Nakano, M. Nakao, T. Nanut, Z. Natkaniec, M. Nayak, N. K. Nisar, S. Nishida, S. Ogawa, S. Okuno, S. L. Olsen, C. Oswald, G. Pakhlova, B. Pal, H. Park, T. K. Pedlar, L. Pesántez, R. Pestotnik, M. Petrič, L. E. Piilonen, C. Pulvermacher, E. Ribežl, M. Ritter, A. Rostomyan, S. Ryu, Y. Sakai, S. Sandilya, L. Santelj, T. Sanuki, Y. Sato, V. Savinov, O. Schneider, G. Schnell, C. Schwanda, K. Senyo, M. E. Sevior, M. Shapkin, V. Shebalin, C. P. Shen, T.-A. Shibata, J.-G. Shiu, B. Shwartz, A. Sibidanov, F. Simon, Y.-S. Sohn, A. Sokolov, M. Starič, M. Steder, J. Stypula, U. Tamponi, Y. Teramoto, K. Trabelsi, M. Uchida, T. Uglov, Y. Unno, S. Uno,
P. Urquijo, C. Van Hulse, P. Vanhoefer, G. Varner, A. Vinokurova, A. Vossen, M. N. Wagner, M.-Z. Wang, X. L. Wang, Y. Watanabe, K. M. Williams, E. Won, Y. Yamashita, S. Yashchenko, Z. P. Zhang, and V. Zhilich
(The Belle Collaboration)

University of the Basque Country UPV/EHU, 48080 Bilbao
Beihang University, Beijing 100191
University of Bonn, 53115 Bonn
Budker Institute of Nuclear Physics SB RAS and Novosibirsk State University, Novosibirsk 630090
Faculty of Mathematics and Physics, Charles University, 121 16 Prague
University of Cincinnati, Cincinnati, Ohio 45221
Deutsches Elektronen–Synchrotron, 22607 Hamburg
Justus-Liebig-Universität Gießen, 35392 Gießen
SOKENDAI (The Graduate University for Advanced Studies), Hayama 240-0193
Hanyang University, Seoul 133-791
University of Hawaii, Honolulu, Hawaii 96822
High Energy Accelerator Research Organization (KEK), Tsukuba 305-0801
IKERBASQUE, Basque Foundation for Science, 48013 Bilbao
Indian Institute of Technology Guwahati, Assam 781039
Indian Institute of Technology Madras, Chennai 600036
Indiana University, Bloomington, Indiana 47408
Institute of High Energy Physics, Vienna 1050
Institute for High Energy Physics, Protvino 142281
INFN - Sezione di Torino, 10125 Torino
Institute for Theoretical and Experimental Physics, Moscow 117218
J. Stefan Institute, 1000 Ljubljana
Kanagawa University, Yokohama 221-8686
Institut für Experimentelle Kernphysik, Karlsruher Institut für Technologie, 76131 Karlsruhe
Kennesaw State University, Kennesaw GA 30144
King Abdulaziz City for Science and Technology, Riyadh 11442
Korea Institute of Science and Technology Information, Daejeon 305-806
Korea University, Seoul 136-713
Kyungpook National University, Daegu 702-701
École Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015
Faculty of Mathematics and Physics, University of Ljubljana, 1000 Ljubljana
Luther College, Decorah, Iowa 52101
University of Maribor, 2000 Maribor
Max-Planck-Institut für Physik, 80805 München
School of Physics, University of Melbourne, Victoria 3010
Moscow Physical Engineering Institute, Moscow 115409
Moscow Institute of Physics and Technology, Moscow Region 141700
Graduate School of Science, Nagoya University, Nagoya 464-8602
Kobayashi-Maskawa Institute, Nagoya University, Nagoya 464-8602
Nara Women's University, Nara 630-8506
National Central University, Chung-li 32054
Department of Physics, National Taiwan University, Taipei 10617
H. Niewodniczanski Institute of Nuclear Physics, Krakow 31-342
Nippon Dental University, Niigata 951-8580
Niigata University, Niigata 950-2181
Osaka City University, Osaka 558-8585
Pacific Northwest National Laboratory, Richland, Washington 99352
Peking University, Beijing 100871
University of Pittsburgh, Pittsburgh, Pennsylvania 15260
University of Science and Technology of China, Hefei 230026
Seoul National University, Seoul 151-742
Soongsil University, Seoul 156-743
University of South Carolina, Columbia, South Carolina 29208
Sungkyunkwan University, Suwon 440-746
School of Physics, University of Sydney, NSW 2006
Department of Physics, Faculty of Science, University of Tabuk, Tabuk 71451
Tata Institute of Fundamental Research, Mumbai 400005
Excellence Cluster Universe, Technische Universität München, 85748 Garching
Toho University, Funabashi 274-8510
Tohoku University, Sendai 980-8578
Department of Physics, University of Tokyo, Tokyo 113-0033
Tokyo Institute of Technology, Tokyo 152-8550
Tokyo Metropolitan University, Tokyo 192-0397
University of Torino, 10124 Torino
CNP, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061
Wayne State University, Detroit, Michigan 48202
Yamagata University, Yamagata 990-8560
Yonsei University, Seoul 120-749

Abstract
We search for the decay $B^+ \to \ell^+ \nu_\ell \gamma$ with $\ell^+ = e^+$ or $\mu^+$ using the full Belle data set of $772 \times 10^6$ $B\bar{B}$ pairs, collected at the $\Upsilon(4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider. We reconstruct one $B$ meson in a hadronic decay mode and search for the $B^+ \to \ell^+ \nu_\ell \gamma$ decay in the remainder of the event. We observe no significant signal within the phase space of $E^{sig}_\gamma > 1$ GeV and set upper limits of $\mathcal{B}(B^+ \to e^+ \nu_e \gamma) < [\ldots] \times 10^{-6}$, $\mathcal{B}(B^+ \to \mu^+ \nu_\mu \gamma) < [\ldots] \times 10^{-6}$, and $\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma) < [\ldots] \times 10^{-6}$ at 90% credibility level.

PACS numbers: 13.20.He, 14.40.Nd
I. INTRODUCTION
The semileptonic decay $B^+ \to \ell^+ \nu_\ell \gamma$ [2] proceeds via a $\bar{b}u$ annihilation into a $W^+$ boson that decays into a lepton-neutrino pair. This is accompanied by photon emission from one of the participating charged particles, with emission from the up quark being the dominant contribution. The decay can be computed in Heavy Quark Effective Theory [3], which is valid for energetic photon emission well above the QCD scale, $E_\gamma \gg \Lambda_{QCD}$. The resulting decay amplitude depends on the first inverse moment $\lambda_B^{-1} = \int_0^\infty d\omega\, \Phi_{B^+}(\omega)/\omega$, where $\Phi_{B^+}(\omega)$ is the $B$ meson light-cone distribution amplitude in the high-energy limit. This parameter is an important input to the QCD factorization scheme used in non-leptonic $B$ decay amplitudes [4]; a tighter limit on $\lambda_B$ or, a fortiori, a measurement of it would improve the predictions for all of these processes. To produce consistent results for color-suppressed modes in non-leptonic $B$ decays, values of roughly $\lambda_B \approx 200$ MeV are needed. The parameter cannot be calculated reliably by theory and thus has to be measured experimentally. The decay discussed in this Letter is advantageous since no additional unknown parameters are needed for its calculation at leading order.

The branching fraction of the decay $B^+ \to \ell^+ \nu_\ell \gamma$ is expected to be larger than that of the purely leptonic $B^+ \to \ell^+ \nu_\ell$ decay, as the photon removes the helicity suppression of the process and thus enhances the weak decay amplitude. This effect is diminished by the additional electromagnetic coupling introduced by the photon emission. The $B^+ \to \ell^+ \nu_\ell \gamma$ decay has been calculated up to first-order corrections in $1/m_b$ and radiative corrections at next-to-leading logarithmic order [3]. The differential decay rate is given by

$$\frac{d\Gamma}{dE_\gamma} = \frac{\alpha_{em} G_F^2 |V_{ub}|^2}{48\pi^2}\, m_B^4\, (1 - x_\gamma)\, x_\gamma^3 \left[ F_A^2 + F_V^2 \right], \qquad (1)$$

with $x_\gamma = 2E_\gamma/m_B$. Here, $m_B$ is the $B$ meson mass, $G_F$ the Fermi coupling constant, $V_{ub}$ the CKM matrix element, and $F_A$ and $F_V$ the axial and vector form factors, respectively. The form factors are given by

$$F_V(E_\gamma) = \frac{Q_u m_B f_B}{2E_\gamma \lambda_B(\mu)} R(E_\gamma, \mu) + \left[ \xi(E_\gamma) + \frac{Q_u m_B f_B}{(2E_\gamma)^2} + \frac{Q_b m_B f_B}{2E_\gamma m_b} \right],$$

$$F_A(E_\gamma) = \frac{Q_u m_B f_B}{2E_\gamma \lambda_B(\mu)} R(E_\gamma, \mu) + \left[ \xi(E_\gamma) - \frac{Q_u m_B f_B}{(2E_\gamma)^2} - \frac{Q_b m_B f_B}{2E_\gamma m_b} + \frac{Q_\ell f_B}{E_\gamma} \right],$$

where $Q_{\ell,u,b}$ are the charges of the lepton, up quark, and bottom quark, respectively, $f_B$ is the decay constant of the $B$ meson, and $R(E_\gamma, \mu)$ is the radiative correction calculated at the energy scale $\mu$. The first term in the form factors, containing $\lambda_B$, represents the leading-order contribution of the QCD heavy-quark expansion describing the photon emission by the light quark. The leading-order term is corrected for higher-order radiative effects, with the $R(E_\gamma, \mu)$ factor containing mass corrections for the up quark. The remaining terms in square brackets are $1/m_b$ power corrections: higher-order contributions for the hard and soft photon emission of the up quark (the $Q_u$ term and the $\xi(E_\gamma)$ term, respectively); the photon emission by the $b$ quark, which is suppressed due to its higher mass ($Q_b$ term); and the photon emission by the lepton, which is only present in the axial form factor ($Q_\ell$ term). The radiative corrections contained in $R(E_\gamma, \mu)$ reduce the leading-order amplitude by about 20-[…]%. The $1/m_b$ power corrections have considerable parametric uncertainties. However, using central values for the parameters, the power-suppressed terms reduce the decay amplitude by about half the amount of the radiative corrections. The soft correction for the light quark, $\xi(E_\gamma)$, constitutes the largest uncertainty in the form factors, and it has been calculated to a higher precision in Ref. [5].

The most stringent limits for the decay process have been reported by the BaBar collaboration [6] at 90% confidence level, with $\mathcal{B}(B^+ \to e^+ \nu_e \gamma) < [\ldots] \times 10^{-6}$, $\mathcal{B}(B^+ \to \mu^+ \nu_\mu \gamma) < [\ldots] \times 10^{-6}$, $\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma) < [\ldots] \times 10^{-6}$, and a partial branching fraction $\Delta\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma) < [\ldots] \times 10^{-6}$ for photons with energies higher than 1 GeV. For the preferred value of $\lambda_B \approx 200$ MeV, a Standard Model branching fraction of $\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma) \approx O(10^{-6})$ is expected [3].

II. DATA SAMPLE AND SIMULATION
This study uses a sample of $(771.[\ldots] \pm [\ldots]) \times 10^6$ $B\bar{B}$ pairs, which corresponds to an integrated luminosity of 711 fb$^{-1}$ collected with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider [7]. The collider operates at the $\Upsilon(4S)$ resonance with a center-of-mass energy of 10.58 GeV, where the resonance decays almost exclusively to $B\bar{B}$ pairs.

The Belle detector is a large-solid-angle magnetic spectrometer that consists of a silicon vertex detector, a 50-layer central drift chamber (CDC), an array of aerogel threshold Cherenkov counters (ACC), a barrel-like arrangement of time-of-flight scintillation counters (TOF), and an electromagnetic calorimeter (ECL) comprising CsI crystals, all located inside a superconducting solenoid coil that provides a 1.5 T magnetic field. An iron flux return located outside the coil is instrumented to detect $K^0_L$ mesons and to identify muons (KLM). A detailed description of the Belle detector can be found in Ref. [8].

The analysis procedure is determined using Monte Carlo (MC) samples that are simulated with the EvtGen software package [9], followed by detector simulation performed with GEANT3 [10]. Beam background recorded by the experiment is added to each simulated MC event. Samples of $2 \times 10^{[\ldots]}$ events are generated for each signal MC channel, where the latest theoretical calculation [3] is implemented as a decay model in EvtGen. Different samples with high integrated luminosity are used to estimate the background. An MC sample of resonant charmed $B\bar{B}$ events with $b \to c$ decays corresponds to ten times the integrated luminosity of the data sample. Non-resonant $e^+e^- \to q\bar{q}$ ($q = u, d, s, c$) continuum processes are included in an MC sample with six times the integrated luminosity of the data sample. A semileptonic $b \to u\ell^+\nu_\ell$ sample with 20 times the statistics of the data contains the important background processes $B^+ \to \ell^+\nu_\ell\pi^0$ and $B^+ \to \ell^+\nu_\ell\eta$. For the latter two processes, high-statistics MC is produced with about 100 times the size of the data sample. A final sample contains rare $b \to s$ transitions and additional processes with 50 times the integrated data luminosity.

III. HADRONIC $B$-TAGGING

Since the neutrino of the signal decay cannot be detected, the full reconstruction technique is used to provide strong constraints on the kinematics of the signal decay. The hadronic full reconstruction at Belle is a hierarchical reconstruction scheme of one of the two $B$ mesons in the event (the tag-side $B_{tag}$ meson) [11].

The charged $B_{tag}$ meson candidate is reconstructed in one of 17 final states: $\bar{D}^{(*)0} X_{had}$ (7 states), $\bar{D}^{(*)0} D_s^{(*)+}$ (4 states), $\bar{D}^0 K^+$, $D^- \pi^+ \pi^+$, $J/\psi K^+ (\pi^0, \pi^+\pi^-)$, and $J/\psi K^0_S \pi^+$, where $X_{had}$ is a set of selected states with one to four pions, of which one can be neutral. The $J/\psi$ particles are reconstructed from $e^+e^-$ or $\mu^+\mu^-$ decays. Two charged tracks are used to reconstruct a $K^0_S$ candidate, whose mass must be within a 30 MeV/$c^2$ window around the nominal $K^0_S$ mass. Neutral pions are reconstructed from pairs of photons, each with an energy of at least 30 MeV, and an invariant mass within 19 MeV/$c^2$ of the nominal pion mass. Photons are identified as energy depositions in the calorimeter above 20 MeV without an associated track. Charged tracks are identified as pions or kaons using a likelihood ratio constructed from CDC, ACC, and TOF information. Charged-track quality is improved by requiring that $|dz| < [\ldots]$ and $dr < [\ldots]$, where $|dz|$ and $dr$ are the distances of closest approach of the track to the interaction point along the beam axis and in the transverse plane, respectively.

The efficiency of the $B_{tag}$ full reconstruction depends on the complexity of the decay of the signal-side $B$ meson. For the simple $B^+ \to \ell^+\nu_\ell\gamma$ process, a relatively high efficiency of 0.6% is found in the signal MC for correctly reconstructed $B_{tag}$ candidates; for $b \to c$ processes, the efficiency is around 0.2%.

The full reconstruction contains a separate neural network (NN) for each particle type and decay mode and is trained with the NeuroBayes software [12]. Important input variables for the NN output of the final $B_{tag}$ meson include: the network outputs of the daughter particles; the reconstructed masses of the daughters; $\Delta E = E_{B_{tag}} - E_{beam}$, which is the difference between the $B_{tag}$ candidate energy and the beam energy in the center-of-mass system (CMS); the mass difference between $M(D^*)$ and $M(D)$; the angles between the daughters in the $B_{tag}$ meson rest frame; the momentum of the daughters in the lab frame; and $\cos\Theta_B$, the cosine of the angle between the beam and the $B_{tag}$ direction. The network output can be interpreted as the probability that the $B_{tag}$ candidate is correctly reconstructed, which means that all particle hypotheses of the decay chain are correct. In the case of multiple $B_{tag}$ candidates, the candidate with the highest network output is selected.

For the network output, differences between data and MC have been observed [13]; $B_{tag}$ decay modes with at least two pions in the final state show the largest deviation. In charmed semileptonic signal-side $B$ decays, the efficiency in MC is overestimated by approximately one third. From that, a correction factor depending on the hadronic tag-side decay channel is obtained, and it is applied to all MC samples used in the analysis.

For the analysis, additional event shape variables are added to the network training. The variables are used to discriminate between spherical $B\bar{B}$ and jet-like $q\bar{q}$ continuum processes. The event shape variables are modified Fox-Wolfram moments [14] and the thrust axis of the $B_{tag}$ meson candidate.

IV. SELECTION

A. Missing mass
With the $B_{tag}$ candidate three-momentum $\vec{p}_{B_{tag}}$, the four-momentum of the signal-side $B_{sig}$ meson in the CMS is given by $p_{B_{sig}} = (E_{beam}/c, -\vec{p}_{B_{tag}})$. This makes use of the two-body decay kinematics of the $\Upsilon(4S)$ and the measured CMS boost of the $B\bar{B}$ system. The $B_{sig}$ four-momentum is used to compute the squared missing mass, which is the strongest discriminator between signal and background. The variable is defined as

$$m^2_{miss} = (p_{B_{sig}} - p_\ell - p_\gamma)^2 / c^2,$$

where the four-momenta of the daughter lepton and photon are subtracted from that of the $B_{sig}$ candidate. For correctly reconstructed signal events, the variable corresponds to the squared neutrino mass and therefore peaks around zero. The resolution of this signal peak is improved by using $E_{beam}$ instead of $E_{B_{tag}}$ in $p_{B_{sig}}$. An additional improvement in resolution is achieved for $B^+ \to e^+\nu_e\gamma$ decays by taking bremsstrahlung into account: the four-momentum of the signal electron candidate is corrected by adding that of a photon, when one is found, with an energy below 1 GeV within a five-degree cone around the direction of the electron momentum. For the signal extraction, the region with $m^2_{miss} \in (-[\ldots], [\ldots])$ GeV$^2/c^4$ around the signal peak is used.

The analysis begins with a selection with high signal efficiency and purity, followed by a signal-yield extraction with a fit to the missing mass in bins of an NN output. The number of network-output bins, as well as the selection of variables used in the training of the network, is optimized for signal significance. With the exception of the lepton identification (ID), the selection is identical for both $B^+ \to e^+\nu_e\gamma$ and $B^+ \to \mu^+\nu_\mu\gamma$.

B. Tag-side selection
For the $B_{tag}$ candidate, the beam-energy-constrained mass $M_{bc} = \sqrt{E_{beam}^2/c^4 - \vec{p}_{B_{tag}}^{\,2}/c^2}$ is required to be greater than 5.27 GeV/$c^2$. A selection of $\Delta E \in (-[\ldots], 0.10)$ GeV is applied; this variable is not used elsewhere since it is strongly correlated with the missing mass. A loose selection on the network output requires the fully reconstructed $B_{tag}$ meson to have a probability above $2 \times 10^{-[\ldots]}$ of being correctly reconstructed.

C. Signal-side selection
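As an illustration of the kinematic quantities defined in the missing-mass and tag-side selection subsections, the following Python sketch evaluates $M_{bc}$, $\Delta E$, and $m^2_{miss}$ from CMS four-vectors. It is not the Belle analysis code: natural units ($c = 1$, energies in GeV) are assumed, and the function names and the tuple layout `(E, px, py, pz)` are hypothetical choices made for this example.

```python
import math

def minkowski_sq(p):
    """Invariant square E^2 - |p|^2 of a four-vector (E, px, py, pz), c = 1."""
    e, px, py, pz = p
    return e * e - (px * px + py * py + pz * pz)

def beam_constrained_mass(e_beam, p_tag):
    """M_bc = sqrt(E_beam^2 - |p_Btag|^2) from the CMS tag-side three-momentum."""
    p2 = sum(c * c for c in p_tag)
    return math.sqrt(max(e_beam * e_beam - p2, 0.0))

def delta_e(e_tag, e_beam):
    """Delta E = E_Btag - E_beam in the CMS."""
    return e_tag - e_beam

def missing_mass_sq(e_beam, p_tag, p_lep, p_gam):
    """m_miss^2 = (p_Bsig - p_l - p_gamma)^2 with p_Bsig = (E_beam, -p_Btag).

    p_tag is a three-vector; p_lep and p_gam are four-vectors (E, px, py, pz).
    """
    p_sig = (e_beam, -p_tag[0], -p_tag[1], -p_tag[2])
    diff = tuple(s - l - g for s, l, g in zip(p_sig, p_lep, p_gam))
    return minkowski_sq(diff)
```

For a toy event in which the only unmeasured particle is a massless neutrino, `missing_mass_sq` returns zero up to rounding, which is the peak position the text describes for correctly reconstructed signal.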
After hadronic tag-side reconstruction, one charged track and one high-energy photon are expected in the detector. No additional charged tracks beyond the signal's lepton daughter are permitted. The signal-side charged-track selection demands the same selection for the impact parameters as the tag side: $dr < [\ldots]$ and $|dz| < [\ldots]$ […] $B_{tag}$. Curling tracks, which can be counted twice, are taken into account on the signal side by counting two tracks as one if the cosine of the angle between them is above 0.999 and their transverse momenta differ by less than 30 MeV/$c$.

Electrons are identified from a likelihood formed with information from multiple detectors: the energy loss in the CDC; the ratio of energy deposition in the ECL to the track momentum; the shower shape in the ECL; the matching of the charged track to the shower position in the ECL; and the photon yield in the ACC [15]. Muons are identified from charged tracks extrapolated to the outer detector; the difference between the expected and measured penetration depth of the track, as well as the transverse deviation of KLM hits from the extrapolated track, are used to distinguish muons from hadrons [16]. Adding the particle ID to the final selection, 95% (99%) of events with a wrong-lepton hypothesis are vetoed with a reduction in signal selection efficiency of about 2% (1.2%) for the muon (electron) channel.

The analysis is performed with two energy thresholds of 1 GeV and 400 MeV for the signal photon candidate in the $B_{sig}$ rest frame, where the most energetic photon in the $B_{sig}$ rest frame is identified as the signal photon candidate. The 1 GeV threshold is a lower bound for which the theoretical model is valid; however, a secondary analysis covering a larger phase space is performed, with a 400 MeV bound chosen to remove the divergent part in the decay model at lower energies. The missing momentum in the event, $|\vec{p}^{\,sig}_\nu|$, has to be above 800 MeV/$c$ in the $B_{sig}$ rest frame, to be consistent with the presence of a high-energy neutrino.

Events in which a signal photon candidate is misreconstructed from bremsstrahlung radiation originating from the signal electron are vetoed by requiring that the cosine of the angle between the lepton and photon candidates in the $B_{sig}$ rest frame ($\cos\Theta_{\gamma\ell}$) lie below 0.6. For the cosine of the angle between the missing momentum and the signal photon candidate in the $B_{sig}$ rest frame ($\cos\Theta_{\gamma\nu}$), a discrepancy is observed between MC and data for values below $-0.9$ in the region $M_{bc} < 5.27$ GeV/$c^2$; therefore $\cos\Theta_{\gamma\nu}$ is required to be larger than $-0.9$. The remaining energy in the ECL is the summed energy of clusters not associated with signal or tag-side particles and is required to be below 900 MeV. Here, clusters are required to have energies above 50, 100, and 150 MeV for the barrel, forward, and backward end-cap calorimeter, respectively. These direction-dependent energy thresholds have been shown to suppress detector background that is not related to physical processes.

To suppress the main background of $B^+ \to \ell^+\nu_\ell\pi^0$ decays, a $\pi^0$ veto is constructed that combines the signal photon candidate with all remaining photons in the ECL above an energy of 100 MeV to compute the invariant mass, where only the candidate closest to the nominal $\pi^0$ mass is kept. A $\pi^0$ mass is only computed if at least one remaining photon above an energy of 100 MeV is left in the ECL. The number of events with a computed $\pi^0$ mass decreases with a rising energy threshold, as does the number of events vetoed by a selection on the resulting mass spectrum. On the other hand, an increasing energy threshold improves the signal and background separation since fewer photons are combined with the signal photon candidate. This reduces the possibility of calculating a $\pi^0$ mass close to the correct one by chance. The 100 MeV threshold is chosen to ensure a high signal efficiency of about 99% while achieving a good background rejection of 45% for $B^+ \to \ell^+\nu_\ell\pi^0$ processes, when a window of 30 MeV/$c^2$ around the nominal $\pi^0$ mass is vetoed.

The overall signal selection efficiency after full reconstruction is 47% (45%) for the muon (electron) channel. The expected event numbers from the background MC samples are: 328 (299) for $b \to c$ decays, 78 (76) for $b \to u\ell^+\nu_\ell$ decays, and 17 (6) events from non-resonant $e^+e^- \to q\bar{q}$ ($q = u, d, s, c$) processes for the muon (electron) channel. The contribution from $b \to s$ processes is found to be negligible.

[Figure 1: panels (a) Electron channel and (b) Muon channel, six network-output bins each; horizontal axis $m^2_{miss}$ (GeV$^2/c^4$), vertical axis entries per bin.]
FIG. 1. (color online) Distributions of $m^2_{miss}$ on data (points with error bars) in bins of the network output. The PDFs are for signal (solid blue), enhanced signal (dashed violet), fixed $B \to X_u\ell^+\nu_\ell$ backgrounds (dash-dotted green), fitted backgrounds (dotted red), and the sum (solid black). The enhanced signal function, which has the same normalization for each bin, corresponds to a branching fraction of $30 \times 10^{-6}$. The most signal-like bin is found in the upper left panel. Proceeding from left to right, the distributions become increasingly background-like, and the most background-like bin is shown in the lower right panel.

[Figure 2: panels (a) Electron channel and (b) Muon channel; horizontal axis $m^2_{miss}$ (GeV$^2/c^4$), vertical axis entries per bin; legend: signal MC, $B \to X_u\ell^+\nu_\ell$ MC, fitted background MC, data sample, enhanced signal MC.]
FIG. 2. (color online) Unbinned $m^2_{miss}$ distribution, where the enhanced signal corresponds to a branching fraction of $30 \times 10^{-6}$.

[Figure 3: panels (a) Electron channel and (b) Muon channel; horizontal axis network output, vertical axis entries per bin; legend: signal MC, $B \to X_u\ell^+\nu_\ell$ MC, fitted background MC, data sample, enhanced signal MC.]
FIG. 3. (color online) Network outputs used for the $m^2_{miss}$ binning, where the bin boundaries are indicated by the dashed lines. The normalizations of the MC distributions are taken from the fit results in $m^2_{miss}$, and the enhanced signal corresponds to a branching fraction of $30 \times 10^{-6}$.

D. Neural network training
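The $\pi^0$ veto of the signal-side selection, whose mass variable is reused with other photon-energy thresholds as a network input, can be sketched in Python as follows. This is a minimal illustration and not the collaboration's implementation: photons are assumed to be massless four-vectors `(E, px, py, pz)` in GeV, the function names are hypothetical, and the 30 MeV/$c^2$ window is interpreted here as a symmetric half-width, which the text does not specify.

```python
import math

PI0_MASS = 0.13498  # GeV/c^2, nominal pi0 mass (rounded)

def inv_mass(p1, p2):
    """Invariant mass of the sum of two four-vectors (E, px, py, pz), c = 1."""
    e = p1[0] + p2[0]
    px, py, pz = (p1[i] + p2[i] for i in (1, 2, 3))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def veto_mass(signal_photon, other_photons, e_min=0.100, nominal=PI0_MASS):
    """Pair the signal photon with every remaining photon above e_min and
    keep only the invariant mass closest to the nominal meson mass.
    Returns None when no remaining photon passes the energy threshold."""
    masses = [inv_mass(signal_photon, g) for g in other_photons if g[0] > e_min]
    if not masses:
        return None
    return min(masses, key=lambda m: abs(m - nominal))

def passes_pi0_veto(signal_photon, other_photons, e_min=0.100, window=0.030):
    """True if the event survives the veto (no pairing inside the mass window).
    The window is treated as a half-width, an assumption of this sketch."""
    m = veto_mass(signal_photon, other_photons, e_min)
    return m is None or abs(m - PI0_MASS) > window
```

Calling `veto_mass` with `e_min` stepped over several thresholds, or with `nominal` set to the $\eta$ mass, would give a family of mass variables of the kind the training uses.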
To further optimize the signal selection, another NN is formed with the NeuroBayes package [12]. This software computes each input variable's significance from the training; this is used to retain only the most significant variables in the network. The variables included in the training are: the extra energy in the ECL, $\cos\Theta_{\gamma\ell}$, and $\cos\Theta_{\gamma\nu}$. To further separate the main background processes of $B^+ \to \ell^+\nu_\ell\pi^0$ and $B^+ \to \ell^+\nu_\ell\eta$, where the $\pi^0$ and $\eta$ decay into two photons and one of the photons is misidentified as the signal photon, meson-veto variables are incorporated into the network. These are computed in the same way as for the selection above but with different energy thresholds on the remaining photons in the ECL.

The thresholds are increased in 10 MeV steps from 20 to 100 MeV. The number of photons combined with the signal photon candidate depends on this energy threshold, and since only the combination closest to the nominal mass is taken into account, different photon combinations end up in the mass spectrum. This leads to different invariant mass spectra with complementary information. The $\eta$ invariant mass is computed in the same way, with energy thresholds between 20 and 300 MeV. Only the six most significant meson masses are retained in the training.

Signal MC samples of both signal channels are trained simultaneously against the $b \to u\ell^+\nu_\ell$ MC and the high-luminosity $B^+ \to \ell^+\nu_\ell\pi^0$ MC sample. For the secondary analysis with $E^{sig}_\gamma > 400$ MeV, the angles $\cos\Theta_{\gamma\ell}$ and $\cos\Theta_{\gamma\nu}$ are excluded from the training to reduce the signal model dependence of the result.

V. SIGNAL EXTRACTION

A. Fit model
The signal yield is determined by an extended unbinned maximum likelihood fit to the $m^2_{miss}$ distribution in six bins of the NN output. The likelihood function is given by

$$\ln L = \sum_{j=1}^{N_{tot}} \ln\left\{ \sum_{i}^{N_c} N_i P_i(m^2_{miss}, n_{out}) \right\} - \sum_{i}^{N_c} N_i,$$

where $N_{tot}$ is the total number of events in the data set, $N_c$ denotes the number of components in the fit, $N_i$ is the number of events for the $i$th component, and $P_i$ represents the probability density function (PDF) for that component as a function of $m^2_{miss}$ and the network output $n_{out}$.

The fit model consists of three components: $B^+ \to \ell^+\nu_\ell\gamma$ signal; measured $b \to u\ell^+\nu_\ell$ decays, referred to hereinafter as the $B \to X_u\ell^+\nu_\ell$ component; and a component denoted as "fitted background" that includes unmeasured $b \to u\ell^+\nu_\ell$ contributions, resonant $b \to c$ decays, and non-resonant $q\bar{q}$ processes. In the fit to data, the expected yield of the $B \to X_u\ell^+\nu_\ell$ component, containing the known decay modes with $X_u = \pi^0$, $\eta$, $\omega$, $\rho^0$, $\pi^+$, $\rho^+$, and $\eta'$, is fixed according to the world average values of the branching fractions [17]. The shapes of the three components are determined from MC in each network output bin separately and fixed in the fit to data, together with the relative normalizations among the bins. The PDF for the $i$th component is given by

$$P_i(m^2_{miss}, n_{out}) = f_i^{n_{out}} P_i^{n_{out}}(m^2_{miss}),$$

where $f_i^{n_{out}}$ denotes the fixed fraction of the $N_i$ events in the bin and $P_i^{n_{out}}$ is the PDF in the NN bin with central value $n_{out}$.

By design, each bin contains the same number of expected signal events; the bin boundaries are shown in Fig. 3. The number of network output bins is chosen to maximize the expected significance of the signal, which is determined in toy MC studies. The numbers of signal and fitted background events are the two free parameters of the fit model. The two signal channels $B^+ \to e^+\nu_e\gamma$ and $B^+ \to \mu^+\nu_\mu\gamma$ are measured in separate fits. A simultaneous fit to both channels is performed to measure the $B^+ \to \ell^+\nu_\ell\gamma$ branching fraction. Lepton universality is assumed for the latter measurement, where the signal branching fractions of the two channels are fixed to the same value. To avoid a fit bias, all yields are unconstrained and negative values are allowed in the fit.

The signal component is parametrized with the sum of a Crystal Ball function [18] and a Gaussian with a common mean. The shape of the fitted background component is given by an exponential with a polynomial in its argument,

$$f(x; x_0, \alpha, \beta) = e^{\alpha(x - x_0) + \beta(x - x_0)^2}.$$

The fixed background component of $B \to X_u\ell^+\nu_\ell$ decays is modeled with a non-parametric PDF using a kernel estimation algorithm [19], where each data point is represented by a Gaussian and their sum yields a probability density function. The width of the Gaussian kernels is a parameter of the algorithm that is chosen to produce a smooth description of the MC. Identical functions are fitted for both signal channels.

B. Significance and limit determination
The significance of the signal is defined as $\sqrt{-2 \ln(\mathcal{L}_b / \mathcal{L}_{(s+b)})}$, where $\mathcal{L}_b$ and $\mathcal{L}_{(s+b)}$ are the maximum likelihood values of the background and signal-plus-background models, respectively. The maximum likelihood values for the null and signal hypotheses are obtained from the likelihood profile, where both likelihood values are taken from the same data distribution. An upper limit at 90% credibility level [1] is determined by integrating the likelihood function up to the 90% quantile, where only the range of positive signal yields is used. The systematic uncertainty is included by convolving the likelihood function with a Gaussian whose width is equal to the systematic error. Systematic errors affecting only the signal yield are included in the determination of the significance. The total systematic error, including errors impacting the overall yield, is used for the measurement of the branching fraction and its upper limit. Since the systematic errors are asymmetric, the downward errors are used for the significance and the upward errors for the upper limit.

The expected fit results, obtained as an average over many toy MC studies, are listed in Table I for the nominal and secondary analyses. The expected signal yield depends on the value of $\lambda_B$. The expected fit significances are determined with a signal branching fraction of 5 × − and the expected upper limits are measured without any signal contribution. For the simultaneous fit, a significance of 2.9σ including systematic errors is expected.

TABLE I. MC expectation for $\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma)$ = 5 × − and measured signal yields on data, where the first error is statistical and the second error systematic. The significances and credibility levels contain systematic errors. The credibility levels are given at 90%, where the expected MC limit is determined without signal. Columns: Mode; MC expectation (Yield, Significance (σ), $\mathcal{B}$ limit); Data measurement (Yield, $\mathcal{B}$, Significance (σ), $\mathcal{B}$ limit). Rows: $B^+ \to e^+ \nu_e \gamma$, $B^+ \to \mu^+ \nu_\mu \gamma$, and $B^+ \to \ell^+ \nu_\ell \gamma$, given for the nominal analysis and for the secondary analysis with $E_\gamma^{\rm sig} > 400$ MeV. (Table entries not recoverable.)

TABLE II. Fitted background yields compared to the MC prediction with statistical errors only. Columns: Mode; MC expectation and Measured yield, given for the nominal analysis and for the secondary analysis with $E_\gamma^{\rm sig} > 400$ MeV. Rows: $B^+ \to e^+ \nu_e \gamma$ and $B^+ \to \mu^+ \nu_\mu \gamma$. (Table entries not recoverable.)
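As a rough illustration of the significance and limit definitions in this subsection, the following Python sketch computes $\sqrt{-2 \ln(\mathcal{L}_b / \mathcal{L}_{(s+b)})}$ from two negative log-likelihood values, smears a likelihood curve with a Gaussian systematic error, and integrates it over positive yields up to the 90% quantile. The function names, the uniform yield grid, and the Gaussian toy likelihood are assumptions made for illustration; they are not taken from the analysis code.

```python
import math


def significance(nll_b, nll_sb):
    """Return sqrt(-2 ln(L_b / L_(s+b))) from the two negative log-likelihoods."""
    q = 2.0 * (nll_b - nll_sb)  # equals -2 ln(L_b / L_(s+b))
    return math.sqrt(max(q, 0.0))  # guard against small negative numerical noise


def smear(yields, like, sigma_sys):
    """Convolve a likelihood curve sampled on a uniform grid with a Gaussian."""
    if sigma_sys <= 0.0:
        return list(like)
    return [
        sum(l * math.exp(-0.5 * ((s - t) / sigma_sys) ** 2)
            for t, l in zip(yields, like))
        for s in yields
    ]


def upper_limit(yields, like, cl=0.90):
    """Yield below which a fraction `cl` of the likelihood integral over s >= 0 lies."""
    pts = [(s, l) for s, l in zip(yields, like) if s >= 0.0]
    cum = [0.0]  # cumulative trapezoid integral over the positive range
    for (s0, l0), (s1, l1) in zip(pts, pts[1:]):
        cum.append(cum[-1] + 0.5 * (l0 + l1) * (s1 - s0))
    target = cl * cum[-1]
    for i in range(1, len(cum)):
        if cum[i] >= target:
            step = cum[i] - cum[i - 1]
            frac = (target - cum[i - 1]) / step if step > 0.0 else 0.0
            return pts[i - 1][0] + frac * (pts[i][0] - pts[i - 1][0])
    return pts[-1][0]


# Toy input (hypothetical): Gaussian likelihood in the signal yield,
# centred at zero with unit width, on a grid from -10 to 10.
yields = [(i - 500) / 50.0 for i in range(1001)]
like = [math.exp(-0.5 * s * s) for s in yields]

z = significance(10.0, 5.5)                              # toy NLL values
ul_stat = upper_limit(yields, like)                      # statistical only
ul_sys = upper_limit(yields, smear(yields, like, 1.0))   # with toy systematic
```

In this toy the smearing widens the likelihood, so the limit with the systematic error included is weaker than the purely statistical one. Restricting the integral to positive yields is what pushes the 90% quantile upward when the fitted yield is close to zero or negative.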
C. Toy MC and sideband data checks
The fit model is checked for a bias in extended toy MC studies, where pull distributions are used to quantify the size of the bias. The pull distributions are computed from the deviation from the true value divided by the fit error and follow a standard normal distribution for unbiased fits. This is used in a linearity test of the signal yield, which checks whether the bias of the fit results depends on the signal branching fraction. The pull distributions are in agreement with standard normal distributions, indicating no bias for branching fractions that result in a significant measurement. A test of the credible interval [1] coverage counts the number of toy measurements for which the true value is contained inside the 90% interval. For a branching fraction of 5 × − , 95% of the true values are contained inside the interval; this number increases to more than 99% below a branching fraction of 3 × − . Since the likelihood is only integrated for positive signal yields to determine the limit, the 90% quantile is moved to higher values. Therefore, the upper limit is a conservative measure. The same results are found for the secondary analysis.

The background MC shapes are compared to data in the M bc < . 27 GeV/c sideband. Additionally, the agreement of the input variables to the NN is checked in a $B \to X_u \ell^+ \nu_\ell$ enhanced region of m ∈ (0 . , . 0) GeV/c and a generic background dominated region of m ∈ (1 . , . 0) GeV/c. All considered distributions agree between data and MC, except for the previously mentioned discrepancy in the $\cos\Theta_{\gamma\nu}$ distribution.

VI. MEASUREMENT
The fit results are listed in Table I and the m distributions for the nominal analysis are shown in Fig. 1 for both signal channels. No significant signal is found in any of the fits. To offer a better overview of the fit results, unbinned distributions of the results are shown in Fig. 2. Good agreement between data and MC for the network output is shown in Fig. 3. The fitted background yields in the data are in agreement with the MC prediction, as shown in Table II. Assuming that only a few signal events are found below the photon energy threshold of 400 MeV, the partial branching fractions of the secondary analysis can be compared to the BaBar measurement [6] for the whole energy range.

TABLE III. Systematic uncertainties on the signal yield grouped by error-types for the nominal analysis with E sig γ > . Columns: Source; $B^+ \to \mu^+ \nu_\mu \gamma$; $B^+ \to e^+ \nu_e \gamma$. Recoverable sources: Fit shapes, Meson veto network, $B \to X_u \ell^+ \nu_\ell$ yield, $B^+ \to \ell^+ \nu_\ell \gamma$ model, Lepton ID. A second block lists the additive and multiplicative errors for $B^+ \to \ell^+ \nu_\ell \gamma$. (Table entries and remaining source labels not recoverable.)

Limits on $\lambda_B$ are computed by integrating the differential decay width from Equation 1,

$$\Delta\mathcal{B} = \frac{\tau_{B_d}}{\hbar} \int dE_\gamma \, \frac{d\Gamma}{dE_\gamma},$$

and solving for $\lambda_B$, where the integral covers the partial phase space above the $E_\gamma^{\rm sig}$ threshold, with an upper bound set by the $B$ meson mass $m_B$. The input parameters for the differential decay width are taken from Ref. [3] and the value for the soft correction $\xi(E_\gamma)$ is taken from Ref. [5]. All parameters are varied by their uncertainties to obtain parameter combinations yielding minimal and maximal values for $\lambda_B$. With the $B^+ \to \ell^+ \nu_\ell \gamma$ limit of the nominal analysis, a central value $\lambda_B > 238$ MeV is obtained at 90% credibility level. The limit changes within a range of $\lambda_B >$ (172, Similar values are obtained for the secondary analysis.
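The step of solving for $\lambda_B$ can be sketched numerically. The example below is only a toy: it replaces the integrated decay width of Equation 1 by a hypothetical stand-in, `partial_bf_toy`, that falls as $1/\lambda_B^2$, and all normalisations are invented placeholders rather than values from Refs. [3, 5]. It shows how an upper limit on the partial branching fraction maps onto a lower limit on $\lambda_B$ by bisection, and how varying the inputs produces a range of limits.

```python
def partial_bf_toy(lambda_b, norm):
    """Hypothetical stand-in for the integrated decay width (not Equation 1)."""
    return norm / lambda_b ** 2


def lambda_b_lower_limit(bf_limit, model, lo=50.0, hi=5000.0, tol=1e-6):
    """Bisect for the lambda_B (in MeV) at which a decreasing model equals bf_limit."""
    if not model(lo) > bf_limit > model(hi):
        raise ValueError("branching-fraction limit is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if model(mid) > bf_limit:
            lo = mid  # prediction exceeds the limit: this lambda_B is excluded
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Invented normalisations standing in for the varied theory inputs.
norms = {"low": 0.8, "central": 1.0, "high": 1.2}

# Toy branching-fraction limit, chosen so the central solution is 300 MeV.
bf_limit = partial_bf_toy(300.0, norms["central"])

limits = {
    name: lambda_b_lower_limit(bf_limit, lambda x, n=norm: partial_bf_toy(x, n))
    for name, norm in norms.items()
}
```

Because the predicted branching fraction decreases with $\lambda_B$ in this toy, every value of $\lambda_B$ below the returned solution is excluded, which is why an upper limit on the branching fraction gives a lower limit on $\lambda_B$.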
VII. SYSTEMATIC UNCERTAINTIES
Systematic errors are estimated in toy MC studies where the default and the varied fit models are applied to the same toy sample and the difference in signal yield is taken as a systematic deviation averaged over many toy measurements. The results are shown in Table III for the nominal analysis.

Footnote: Several values of $\xi(E_\gamma)$ are calculated in Ref. [5] for different true values of $\lambda_B$. We identify the central value of $\xi(E_\gamma)$ with the one obtained for $\lambda_B = 300$ MeV. To obtain the error on $\xi(E_\gamma)$, the whole range of true values for $\lambda_B$ is taken into account.

The largest error is given by the variation of the fit shapes, where the shape parameters are varied within their 1σ fit errors from MC. For the non-analytical shape obtained from the kernel estimator algorithm, the size of the Gaussian kernels is varied to obtain a considerable shape variation.

The systematic error on the meson-veto network is obtained from the control channel $B \to K^* \gamma$. Here, the signal photon candidate is combined with the remaining photon candidates to compute the meson mass spectra and obtain the network output distribution. From this distribution, a double ratio of data and MC is calculated as $(N_i^{\rm MC} / N_{\rm sum}^{\rm MC}) / (N_i^{\rm data} / N_{\rm sum}^{\rm data})$, where $N_i$ is the event count in the $i$th bin and $N_{\rm sum}$ the total number of events. The largest deviation between data and MC is found to be 8% in the most background-like network output bin. An alternate model is obtained by using the double ratio values to reweight the binned m distribution in $B^+ \to \ell^+ \nu_\ell \gamma$. The angles $\cos\Theta_{\gamma\ell}$ and $\cos\Theta_{\gamma\nu}$, as well as the remaining energy in the ECL, cannot be used in the NN trained on the control sample. Therefore, a separate network without these variables is trained on the $B^+ \to \ell^+ \nu_\ell \gamma$ samples, which is then used to obtain the double ratios in the control channel.

The fixed yields of the measured $B \to X_u \ell^+ \nu_\ell$ backgrounds are varied by their world-average errors [17].
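The double ratio used for the meson-veto network systematic can be written out in a few lines. The sketch below is a minimal Python illustration with invented bin contents. The helper names are hypothetical, and the assumption that each event is weighted according to its network-output bin is one reading of the text, not a statement about the actual analysis code.

```python
def double_ratio(n_mc, n_data):
    """Per-bin (N_i^MC / N_sum^MC) / (N_i^data / N_sum^data)."""
    sum_mc = float(sum(n_mc))
    sum_data = float(sum(n_data))
    return [(m / sum_mc) / (d / sum_data) for m, d in zip(n_mc, n_data)]


def largest_deviation(ratios):
    """Largest absolute departure of the double ratio from unity."""
    return max(abs(r - 1.0) for r in ratios)


def reweighted_histogram(values, nn_bins, weights, edges):
    """Histogram `values`, weighting each event by weights[its network bin]."""
    hist = [0.0] * (len(edges) - 1)
    for v, b in zip(values, nn_bins):
        for k in range(len(hist)):
            if edges[k] <= v < edges[k + 1]:
                hist[k] += weights[b]
                break
    return hist


# Invented control-channel counts in two network-output bins.
ratios = double_ratio(n_mc=[108, 892], n_data=[100, 900])
worst = largest_deviation(ratios)

# Invented signal-channel events: fit-variable value and network-output bin.
alt_shape = reweighted_histogram(
    values=[0.1, 0.5, 0.9],
    nn_bins=[0, 1, 1],
    weights=[2.0, 0.5],          # placeholder weights, one per network bin
    edges=[0.0, 0.4, 1.0],
)
```

Whether the ratio itself or its inverse is applied as the weight is not stated in the text, so `reweighted_histogram` takes the weights as an explicit argument and leaves that choice to the caller.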
The systematic uncertainty related to the $B^+ \to \ell^+ \nu_\ell \gamma$ decay signal model is estimated by comparing the latest NLO model [3] with an older LO calculation [20]. Here, the shape difference in the m distribution is found to be small and parametric errors of the theory are also found to have a negligible effect on the branching fraction determination.

The systematic uncertainty related to lepton ID is determined in $\gamma\gamma \to \ell^+ \ell^-$ processes and the error is found to be 2.2% and 5.0% for electrons and muons, respectively. The error for the tag-side efficiency has been determined in Ref. [13] to be 4.2%. The error for the tag-side NN is taken from the sideband m > . /c , where the difference in the data-MC selection efficiency is taken as a systematic error. Systematic deviations for the tracking efficiency are determined with high transverse momentum tracks from partially reconstructed $D^*$ mesons; the deviation is − .

VIII. CONCLUSION
In summary, we report the upper limits of the partial branching fraction with E sig γ > $B^+ \to \ell^+ \nu_\ell \gamma$ decays with the full Belle data set of (771 . ± . ) × $B\bar{B}$ pairs. The signal photon energy requirement ensures a reliable theoretical description of the decay process. The results at 90% credibility level are

$\mathcal{B}(B^+ \to e^+ \nu_e \gamma)$ < . × − ,
$\mathcal{B}(B^+ \to \mu^+ \nu_\mu \gamma)$ < . × − ,
$\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma)$ < . × − .

These results improve the limits measured by BaBar [6]. The limit of the combined channel $B^+ \to \ell^+ \nu_\ell \gamma$ translates into a boundary of $\lambda_B > 238$ MeV at 90% credibility level, where this limit evolves within the range $\lambda_B >$ (172, E sig γ >

ACKNOWLEDGEMENTS
We thank the KEKB group for the excellent operation of the accelerator; the KEK cryogenics group for the efficient operation of the solenoid; and the KEK computer group, the National Institute of Informatics, and the PNNL/EMSL computing group for valuable computing and SINET4 network support. We acknowledge support from the Ministry of Education, Culture, Sports, Science, and Technology (MEXT) of Japan, the Japan Society for the Promotion of Science (JSPS), and the Tau-Lepton Physics Research Center of Nagoya University; the Australian Research Council and the Australian Department of Industry, Innovation, Science and Research; Austrian Science Fund under Grant No. P 22742-N16 and P 26794-N20; the National Natural Science Foundation of China under Contracts No. 10575109, No. 10775142, No. 10875115, No. 11175187, and No. 11475187; the Ministry of Education, Youth and Sports of the Czech Republic under Contract No. LG14034; the Carl Zeiss Foundation, the Deutsche Forschungsgemeinschaft and the VolkswagenStiftung; the Department of Science and Technology of India; the Istituto Nazionale di Fisica Nucleare of Italy; National Research Foundation (NRF) of Korea Grants No. 2011-0029457, No. 2012-0008143, No. 2012R1A1A2008330, No. 2013R1A1A3007772, No. 2014R1A2A2A01005286, No. 2014R1A2A2A01002734, No. 2014R1A1A2006456; the Basic Research Lab program under NRF Grant No. KRF-2011-0020333, No. KRF-2011-0021196, Center for Korean J-PARC Users, No. NRF-2013K1A3A7A06056592; the Brain Korea 21-Plus program and the Global Science Experimental Data Hub Center of the Korea Institute of Science and Technology Information; the Polish Ministry of Science and Higher Education and the National Science Center; the Ministry of Education and Science of the Russian Federation and the Russian Foundation for Basic Research; the Slovenian Research Agency; the Basque Foundation for Science (IKERBASQUE) and the Euskal Herriko Unibertsitatea (UPV/EHU) under program UFI 11/55 (Spain); the Swiss National Science Foundation; the National Science Council and the Ministry of Education of Taiwan; and the U.S. Department of Energy and the National Science Foundation. This work is supported by a Grant-in-Aid from MEXT for Science Research in a Priority Area (“New Development of Flavor Physics”) and from JSPS for Creative Scientific Research (“Evolution of Tau-lepton Physics”).

[1] In common HEP usage, Bayesian intervals or credibility levels have been reported as “confidence intervals” or “confidence levels,” which is a frequentist-statistics term.
[2] Throughout this Letter, the inclusion of the charge-conjugate decay mode is implied.
[3] M. Beneke and J. Rohrwild, Eur. Phys. J. C , 1818 (2011).
[4] M. Beneke et al., Nucl. Phys. B , 313 (2000).
[5] V. M. Braun and A. Khodjamirian, Phys. Lett. B , 1014 (2013).
[6] B. Aubert et al. (BaBar Collaboration), Phys. Rev. D , 111105 (2009).
[7] S. Kurokawa and E. Kikutani, Nucl. Instr. and Meth. A , 1 (2003) and other papers included in this volume; T. Abe et al., Prog. Theor. Exp. Phys. , 03A001 (2013) and following articles up to 03A011.
[8] A. Abashian et al. (Belle Collaboration), Nucl. Instr. and Meth. A , 117 (2002); also see detector section in J. Brodzicka et al., Prog. Theor. Exp. Phys. , 04D001 (2012).
[9] D.J. Lange, Nucl. Instr. and Meth. A , 152 (2001).
[10] R. Brun et al., CERN Report No. DD/EE/84-1, (1984).
[11] M. Feindt et al., Nucl. Instr. and Meth. A , 432 (2011).
[12] M. Feindt et al., Nucl. Instr. and Meth. A , 190 (2006).
[13] A. Sibidanov et al. (Belle Collaboration), Phys. Rev. D , 032005 (2013).
[14] G.C. Fox and S. Wolfram, Phys. Rev. Lett. , 1581 (1978). The modified moments used in this paper are described in S.H. Lee et al. (Belle Collaboration), Phys. Rev. Lett. , 261801 (2003).
[15] K. Hanagaki, H. Kakuno, H. Ikeda, T. Iijima, and T. Tsukamoto, Nucl. Instr. and Meth. A , 490 (2002).
[16] A. Abashian et al., Nucl. Instr. and Meth. A , 69 (2002).
[17] K. A. Olive et al. (Particle Data Group), Chin. Phys. C , 090001 (2014).
[18] T. Skwarnicki, Ph.D. Thesis, Institute for Nuclear Physics, Krakow 1986; DESY Internal Report, DESY F31-86-02 (1986).
[19] K.S. Cranmer, arXiv:hep-ex/0011057 (2000).
[20] G.P. Korchemsky, D. Pirjol, T.M. Yan, Phys. Rev. D , 114510 (2000).