Citizen Scientist Community Engagement with the HiggsHunters project at the Large Hadron Collider
A.J. Barr (a), A. Haas (b), and C.W. Kalderon (a,c)
(a) Department of Physics, University of Oxford, Oxford, UK
(b) Department of Physics, NYU, New York, USA
(c) Department of Physics, University of Lund, Sweden
November 15, 2017
Abstract
The engagement of Citizen Scientists with the
HiggsHunters.org citizen science project is investigated through analysis of behaviour, discussion, and survey data. More than 37,000 Citizen Scientists from 179 countries participated, classifying 1,500,000 features of interest on about 39,000 distinct images. While most Citizen Scientists classified only a handful of images, some classified hundreds or even thousands. Analysis of frequently-used terms on the dedicated discussion forum demonstrated that a high level of scientific engagement was not uncommon. Evidence was found for an emergent and distinct technical vocabulary developing within the Citizen Science community. A survey indicates a high level of engagement and an appetite for further LHC-related citizen science projects.
The Large Hadron Collider is arguably the highest profile scientific project of our time. The discovery of the Higgs boson [1, 2] has been the scientific highlight to date. The accelerator continues to be the subject of much media attention as searches for other new particles continue.

Matching this cutting-edge science with the public’s curiosity to understand it can present a challenge. The particles created at the LHC are themselves invisible. Many, including the Higgs boson, decay a tiny fraction of a second after their creation, and can only be detected and reconstructed using large dedicated detectors assembled over decades by large international collaborations.

Nevertheless, there is a strong drive within science policy to allow the public to be involved in not just reading about science, but actually performing it. Citizen science projects – which directly involve the public in the scientific process – represent an ideal vehicle for meaningful engagement with a large community. Participants in particular citizen science projects have previously been shown to engage in thinking processes similar to those of scientific investigations [3]. Crowdsourced research has itself been shown to be reliable, scalable, and connective [4].

Figure 1: An example ATLAS detector image presented to citizen scientists. This image contains two off-centre vertices, each visible as a vee-like structure, at about 4 o’clock and 7 o’clock, a little distance from the centre of the image. The image was generated from a computer simulation.

When considering what might be viable citizen science projects for the particular case of the LHC reported here, several factors were considered. The subject matter should be appealing enough to attract a sufficient number of citizen scientists.
The tasks assigned to the citizen scientists must be within their capability, or possible to be rapidly understood, to maintain volunteers’ interest. And to motivate continued engagement there should be the possibility of making a very significant contribution to knowledge.

It was noted that citizen scientists have previously been shown to be good classifiers of images [5]. They are also efficient at spotting unusual objects in images including unexpected galaxy features [6]. Through the Galaxy Zoo [7] project alone, citizen scientists have contributed to the results of 48 scientific papers [5]. The present study evaluates, using the data from the HiggsHunters.org project described below, the extent to which analysis by citizen scientists might also be possible at the Large Hadron Collider, and the engagement of those citizen scientists with that subject matter.

Previously within the field of particle physics, the public has been invited to contribute to CERN’s science by donating idle time on their computer to help simulate proton-proton collisions [8, 9]. That project aids the scientific endeavour; however, the volunteers are providers of computing resource rather than active researchers. More direct involvement in the research has previously been restricted to the relatively small fraction of the public that has a high level of coding skill. Such individuals have been able to directly analyse data from CERN experiments via the CERN opendata portal [10]. The Kaggle project [11], in which members of the public were challenged to use machine learning to identify Higgs boson events, was very successful, but also demanded a high level of coding expertise, making it inaccessible to most members of the public.

The HiggsHunters project is, to the best of our knowledge, the first to allow the non-expert general public a direct role in searching for new particles at the LHC.

For the
HiggsHunters.org project, a task was created which lent itself well to the strengths of non-expert citizen scientists – in particular their abilities to classify elements in images, and to spot unusual features.

The task selected was to ask citizen scientists to identify any sets of tracks originating from points away from the centre of the image – known as Off-Centre Vertices (abbreviated OCV). Such tracks can be observed in the image of a simulated collision shown in figure 1. Such features indicate the presence of a relatively long-lived neutral particle, which travelled some centimetres from the interaction point at the centre of the image before decaying, producing a spray of a large number of tracks.

‘Baby’ bosons
The physics theories under test predict the existence of hypothetical new particles φ which are not in the Standard Model of particle physics and which have not yet been observed experimentally. In such theories the usual Higgs boson H, after it is created, would most often decay as predicted by the Standard Model; however, a fraction of the time it would decay into the new particles: H → φ + φ. The new particles φ interact with the Standard Model only very weakly. This weak coupling means they have a slow decay rate, and hence a relatively long lifetime on the particle scale – typically of order nanoseconds. They can therefore travel a macroscopic distance, perhaps tens of centimetres, before themselves decaying.

Collective evidence from the body of citizen scientists about these OCVs could indicate new particles beyond the knowledge of particle physics – dramatically changing our understanding of the subatomic realm. The high impact of a potential discovery meets the important motivating feature of citizen science projects that the volunteers have a real opportunity of discovering something previously unknown to science [12]. It also satisfies the ethical criterion [13] that the time of the citizen scientists is being used productively.

The citizen scientists were also given the task of identifying anything they thought was ‘weird’ in any image.
Serendipity can have an important role in scientific discovery, so it was considered important to flag such particularly unusual features.

The citizen science web interface was constructed within the Zooniverse [14] framework, using images from the ATLAS experiment at the Large Hadron Collider. Both images from real collisions and those from Monte Carlo simulations were displayed, with the Citizen Scientist being unaware (at the time of classification) as to whether the image was based on real or simulated data. The ability of volunteers to identify the off-centre vertices could then be calibrated using the test images which showed simulations of the decay processes of interest.

All images, whether simulation or from real collisions, were processed using the ATLAS reconstruction software [15], with some additions [16].
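As a rough check of the distance scale quoted for the new particles φ, the mean decay length in the laboratory frame can be estimated from the proper lifetime. The sketch below assumes a proper lifetime τ of exactly 1 ns and a boost factor βγ of order unity; neither value is specified in the text, which states only that the lifetime is "of order nanoseconds".

```latex
% Requires the amsmath package.
% Assumptions (not from the source): tau = 1 ns, beta*gamma ~ 1.
\begin{align*}
  L     &= \beta\gamma\, c\tau, \\
  c\tau &\approx \left(3.0\times10^{8}\,\mathrm{m\,s^{-1}}\right)
                 \left(1\times10^{-9}\,\mathrm{s}\right)
         = 0.30\,\mathrm{m} = 30\,\mathrm{cm}.
\end{align*}
```

With βγ of order one, L comes out at tens of centimetres, in line with the macroscopic flight distance described for φ. A faster or longer-lived particle would travel proportionally further.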
As of October 2017, classifications had been performed by 37,613 Citizen Scientists, of whom 25,608 had created Zooniverse accounts. New Citizen Scientists are invited to create a Zooniverse account after their first five classifications, and periodically thereafter. For those classifications made without Zooniverse accounts it is assumed that classifications from different IP addresses are from distinct scientists.

The number of Citizen Scientists is shown in figure 2, as is the cumulative number of classifications. There was a rapid rise in the number of new scientists soon after the project launch, when the project was advertised via email to existing Zooniverse account holders as well as in the press and social media. Subsequent periods during which new Citizen Scientists are attracted are observed, for example in July 2016 when a CERN news story was published about the project [17].

Figure 2: (a) Cumulative number of unique Citizen Scientists as a function of time. (b) Cumulative number of images examined as a function of time. [Plot legend: all participants, logged in, not logged in; as of 2017-10-15.]

The number of classifications per Citizen Scientist (figure 3) follows an approximate power-law behaviour. Most volunteers dip in to classify just a handful of images, though more than a thousand individuals provided one hundred or more classifications. At the upper end of the distribution, more than one hundred volunteers provided more than 1,000 classifications, with the most dedicated enthusiast providing more than 25,000 classifications.

Figure 3: Number of classifications per Citizen Scientist. [Axes: images classified per Citizen Scientist against number of Citizen Scientists. Plot legend: all participants, logged in, not logged in; as of 2017-10-15.]

Several moderators were selected from among the Citizen Scientists active on a dedicated ‘Talk’ discussion forum [18] to help answer questions from other, less experienced volunteers. The moderators helped newer volunteers with identification of objects, and with some of their science questions. Other scientific questions were addressed by the science team, either via the Talk forum or in the project’s blog forum [19].
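The counting described in this section can be sketched in a few lines of Python. This is an illustration only, not the project's analysis code: the record layout (a dictionary with hypothetical `user` and `ip` fields) and all function names are assumptions. Logged-in volunteers are keyed by account name and anonymous classifications by IP address, following the assumption stated in the text. The log-log slope is a crude least-squares estimate of the power-law index, not a rigorous fit.

```python
import math
from collections import Counter


def participant_key(record):
    """Identify who made a classification.

    Hypothetical record layout: {"user": str or None, "ip": str}.
    Account name if logged in, otherwise the IP address.
    """
    user = record.get("user")
    return ("user", user) if user else ("ip", record["ip"])


def classifications_per_participant(records):
    """Map each participant key to the number of classifications made."""
    return Counter(participant_key(r) for r in records)


def count_at_least(per_participant, threshold):
    """Number of participants with at least `threshold` classifications."""
    return sum(1 for n in per_participant.values() if n >= threshold)


def loglog_slope(per_participant):
    """Least-squares slope of log(frequency) against log(classifications).

    For an approximate power law, frequency ~ n**slope, so the slope
    is expected to be negative. This is a rough estimate only.
    """
    freq = Counter(per_participant.values())
    if len(freq) < 2:
        raise ValueError("need at least two distinct classification counts")
    xs = [math.log(n) for n in freq]
    ys = [math.log(c) for c in freq.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

Keying anonymous volunteers by IP address can both over-count (one person on several networks) and under-count (several people behind one address), so the participant totals obtained this way are approximate.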
Science Objectives
An initial determination [20] has previously been made of the performance of citizen scientists relative to computer algorithms that were developed and used by the ATLAS collaboration to identify off-centre vertices [16].

It was found that the performance of the Citizen Scientists competed very well with that of the computer algorithm. The collective ability of the Citizen Scientists was superior to the ATLAS computer algorithm for simulations with low-mass long-lived particles. A detailed comparison of the identification performance of the Citizen Scientists relative to the computer algorithm is described in Ref. [20].

In addition to being able to mark off-centre vertices, the Citizen Scientists are also encouraged to select anything ‘weird’ in the images, and to follow these up on the Talk forum where the wider community discusses them. This raised several instances of known phenomena, such as cosmic ray showers passing through ATLAS, but also some that were unexpected, demonstrating the potential for untrained Citizen Scientists to isolate interesting features in real LHC collision data.
The Zooniverse platform provides a forum for Citizen Scientists to build community, discuss objects and images, and to ask questions. The forum is open to all Citizen Scientists, moderators and project scientists.

An analysis was performed of the content of the 20,257 comments received between November 2014 and May 2017. These comments were received from 1,345 different Citizen Scientists. The distribution of the number of words per comment is found to follow a falling exponential form with a mean of 6.6 words, and with 6% of comments being 20 words or more, which suggests substantial observations and/or questions.

Frequently used words and hashtags are shown in figure 4. The most common hashtags are ‘[…] Citizen Scientist moderator in a Talk post, including:

[…]: Several particle tracks that appear to share a common origin, but do not meet at a vertex.
[…]: Many particles (or lots of energy) located on opposite sides of the detector, with relatively little between.
[…]: Objects which are complicated by many crossing lines, which can make it difficult to find off-centre vertices.

Figure 4: Frequently used words (above) and hashtags (below), and their frequency of use, in 20,257 Talk forum comments. The most common words used in everyday language such as ‘a’, ‘the’, and ‘of’ are omitted. [Axes: word count; hashtag count.]

The hashtag […] and “muon” keep growing rapidly. The number of unique Citizen Scientists using particular words has continued to grow with time (figure 6). Seemingly the non-standard term “bundle” fell out of fashion after the first couple of months, being overtaken by the term “jet”, which is the usual word for this feature within the wider particle physics community.

Figure 5: Cumulative number of comments matching particular words as a function of date. [Terms plotted: ocv, weird, muon, energy, photon, bundle, jet, diametric.]

Figure 6: Cumulative number over time of unique Citizen Scientists using particular words. [Terms plotted: ocv, weird, muon, energy, photon, bundle, jet, diametric.]
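The kind of term counting behind figures 4 to 6 can be illustrated with the following Python sketch. It is not the analysis code used for the paper: the stop-word list, the tokenisation rules, the `(date, user, text)` comment layout and the function names are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list; the text only says that
# the most common everyday words such as 'a', 'the' and 'of' were omitted.
STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "in", "is", "it", "to"})

# Words may contain internal apostrophes or hyphens (o'clock, off-centre).
WORD_RE = re.compile(r"[a-z]+(?:['-][a-z]+)*")
HASHTAG_RE = re.compile(r"#(\w+)")


def count_terms(comments):
    """Return (word_counts, hashtag_counts) for an iterable of comment strings."""
    words, hashtags = Counter(), Counter()
    for text in comments:
        lowered = text.lower()
        hashtags.update(HASHTAG_RE.findall(lowered))
        # Strip hashtags first so that '#ocv' is not also counted as the word 'ocv'.
        plain = HASHTAG_RE.sub(" ", lowered)
        words.update(w for w in WORD_RE.findall(plain) if w not in STOP_WORDS)
    return words, hashtags


def cumulative_unique_users(comments, term):
    """Cumulative number of distinct users who have used `term`.

    `comments` is an iterable of (date, user, text) tuples, with dates that
    sort chronologically (for example ISO 'YYYY-MM-DD' strings). A match is
    counted whether the term appears as a plain word or as a hashtag.
    Returns a list of (date, cumulative_count) pairs, one per new user.
    """
    seen = set()
    series = []
    for date, user, text in sorted(comments, key=lambda c: c[0]):
        if term in WORD_RE.findall(text.lower()) and user not in seen:
            seen.add(user)
            series.append((date, len(seen)))
    return series
```

Running `cumulative_unique_users` for two terms such as "bundle" and "jet" and comparing the resulting series is one way to see a term falling out of use while another takes over.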
To evaluate the impact of the project on the Citizen Scientists themselves, a web-based survey was undertaken, with an invitation to participate being sent to all registered HiggsHunters volunteers. The number of respondents was 322 (including 63 partial responses). This response rate represents about 1% of those who participated as Citizen Scientists in the project. The survey was advertised via the Zooniverse website and in an email to those with Zooniverse accounts, which is likely to have led to some bias towards respondents having a higher degree of engagement than average. This supposition is supported by the observation that about 80% of survey respondents had participated in another Zooniverse project prior to HiggsHunters.

The gender of respondents was 33% female and 65% male (with 2% preferring not to say). A wide range of ages was represented (table 1). This is also reflected in the diversity of occupations, with 19% of respondents being students, 87% in full-time work, and 22% of respondents retired (with the remainder having other employment status). Well-represented occupations included teachers, engineers, consultants, developers and researchers. Respondents tended to be well educated: 74% had at least an undergraduate degree, 39% had at least a master’s degree and 14% held a doctoral degree. It was notable that only about a quarter of those holding a master’s degree or higher held that degree in a physics-related subject, showing that the project had appeal to those trained in other disciplines, particularly in other areas of science, technology, engineering and mathematics.

The best-represented countries were the USA (25%) and the UK (16%), with a total of 35 countries represented amongst all respondents. A bias towards native English speakers (65% of respondents) was perhaps unsurprising given that the
HiggsHunters.org website is only available in the English language.

Of the respondents, 80% had engaged in citizen science before, in another science area, while for 20% it was their first citizen science project. About 62% were native English speakers, but many other native languages were represented. Geographically, 33% were based in the USA, 21% in the UK, and many other countries were also represented.

Table 1: Age distribution of survey respondents.
Age                 Percent   Count
16 to 17              7%        17
18 to 19              3%         8
20 to 24              5%        14
25 to 34             15%        39
35 to 44             15%        37
45 to 54             15%        38
55 to 64             21%        53
65 to 74             12%        30
75 or older           4%        11
Prefer not to say     3%         7

Table 2: Responses to the question “To what extent are you more likely to study physics in the future as a result of participating in HiggsHunters?”.
Change       Percent   Count
A lot         13%        35
Moderately    14%        37
Slightly      20%        54
No change     36%        98
N/A           17%        46

Table 3: Answers of survey respondents to the question “Have you ever discussed Zooniverse...”
Discussed...               Percent   Count
... with your family?       58%        99
... with friends?           64%       111
... with colleagues?        32%        56
... on social media?        15%        26

More than 80% of respondents indicated that their knowledge of particle physics had been improved to some extent as a direct result of participating in HiggsHunters. In terms of future directions, 47% of respondents said they were more likely (to some extent) to go on to study physics as a result of participating in the project (table 2). This can be considered a high fraction, given the broad age range of participants.

In terms of dissemination, many respondents had discussed the project with others, including friends, family and work colleagues (table 3). This indicates the project had a multiplier effect, in that it reached more people than just those citizen scientists directly involved.
This willingness to discuss with others also indicates a high level of feeling of ownership and interest among the citizen scientists themselves.

The expectation that the survey respondents were subject to a selection bias (compared to the general population of HiggsHunters citizen scientists) towards the more highly engaged end of the spectrum is confirmed by their responses to a question asking about the duration of the period during which they performed classification (table 4). The distribution for respondents is broader than would be expected from the general population of Citizen Scientists, which peaks at low numbers of classifications (figure 3). Nevertheless, the wide range of durations among the respondents shows that an interesting section of the Citizen Scientists has been sampled, albeit with some bias. No attempt has been made to extrapolate to the general population, since with the numbers of people surveyed, insufficient information is available about possible confounding factors which could significantly affect that extrapolation.

Table 4: Response to the question “Over what duration did you classify images?”
Duration           Percent   Count
A single session    10%        26
One or two days     14%        35
2-7 days            15%        39
2-4 weeks           17%        44
1-5 months          18%        45
6-12 months         10%        26
Over a year         15%        37

A significant minority (37%) of respondents had browsed the Talk forum, showing that while of interest to many, it was far from ubiquitous. The fact that so many did not refer to the forum suggests that the majority were able to perform the classification exercises without recourse to the additional information on those discussion boards.
The primary reason stated for posting to the boards was to discuss findings with other Citizen Scientists.

Most respondents reported that as a result of the project they were motivated to engage more fully with science (table 5) and the majority also went on to work with other citizen science projects (table 6).

Overall the project was found to have a very positive response from respondents, with most having benefited from their engagement, and an overwhelming majority (more than 97%) were keen to continue participation in a future CERN physics project.

Table 5: Response to the question “As a result of the HiggsHunters project, have you done any of the following?”
Subsequent activity                        Percent   Count
Read or watched more about science          87%       152
Studied science more formally               29%        51
Carried out your own research               20%        35
Attended lectures or similar events         19%        33
Attended science fairs or similar events    15%        26

Table 6: Response to the question “Have you subsequently participated in other citizen science projects?”
Subsequent projects           Percent   Count
None (at time of response)     22%        62
Zooniverse project(s)          74%       209
non-Zooniverse project(s)      13%        38

Further analysis of the citizen science click data will be performed by school children in collaboration with the UK charity the Institute for Research in Schools (IRIS) [21]. At the time of writing 61 schools had signed up for this project through IRIS.
The first mass participation citizen science project for the Large Hadron Collider has been extremely successful. More than 37,000 citizen scientists participated, with a wide range of ages, backgrounds, and geographical spread represented. More than 1.4 million features of interest were identified in images from the ATLAS detector.

A study of behaviour showed that most Citizen Scientists classified just a handful of images, though a minority classified hundreds or thousands. A dedicated discussion forum allowed Citizen Scientists to interact with one another, and with the project scientists. The vocabulary used in the forum ranged from basic visual features to highly abstract and technical terms. The frequencies of some words in particular contexts indicated a distinct technical vocabulary emerging from the Citizen Scientists’ discussions – one which would not immediately be understood by professional scientists in the field.

The societal impact was evaluated from a dedicated survey, with a very positive response. Almost two thirds of respondents were motivated to find out more about science directly from the project, while 97% of respondents would like to see a follow-up project with more CERN data.

The classification data from the citizen scientists have been released for final analysis by school pupils, in collaboration with the Institute for Research in Schools.
Acknowledgements
The
HiggsHunters.org project is a collaboration between the University of Oxford and the University of Birmingham in the United Kingdom, and NYU in the United States. It makes use of the Zooniverse citizen science platform, which hosts over 40 projects from searches for new astrophysical objects in telescope surveys to following the habits of wildlife in the Serengeti. The HiggsHunters project shows collisions recorded by the ATLAS experiment and uses software and display tools developed by the ATLAS collaboration. The authors gratefully acknowledge the generous financial support of the UK Science and Technology Facilities Council, the University of Oxford, and Merton College, Oxford. The project would have been impossible without the dedicated engagement of the many HiggsHunters volunteers and in particular the moderators. We are grateful to Pete Watkins for helpful comments and suggestions.
References

[1] ATLAS Collaboration. Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC. Phys Lett B. 2012;716:1.
[2] CMS Collaboration. Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC. Phys Lett B. 2012;716:30.
[3] Trumbull DJ, Bonney R, Bascom D, Cabral A. Thinking scientifically during participation in a citizen-science project. Science Education. 2000;84(2):265–275. Available from: http://dx.doi.org/10.1002/(SICI)1098-237X(200003)84:2<265::AID-SCE7>3.0.CO;2-5 .
[4] Watson D, Floridi L. Crowdsourced science: sociotechnical epistemology in the e-research paradigm. Synthese. 2016;p. 1–24. Available from: http://dx.doi.org/10.1007/s11229-016-1238-2 .
[5] Zooniverse collaboration; 2016. The Zooniverse project: publications. Available from: .
[6] Lintott CJ, Schawinski K, Keel W, Van Arkel H, Bennert N, Edmondson E, et al. Galaxy Zoo: Hanny’s Voorwerp, a quasar light echo? Monthly Notices of the Royal Astronomical Society. 2009;399(1):129–140.
[7] Zooniverse collaboration; 2007. Galaxy Zoo project. Available from: .
[8] Purcell A; 2004. LHC@home. Available from: http://lhcathome.web.cern.ch/ .
[9] ATLAS collaboration; 2014. ATLAS@HOME. Available from: http://atlasathome.cern.ch/ .
[10] CERN; 2014. CERN open data. Available from: http://opendata.cern.ch .
[11] ATLAS Collaboration; 2014. Higgs Boson Machine Learning Challenge. Available from: .
[12] Cox J, Oh EY, Simmons B, Lintott C, Masters K, Greenhill A, et al. Defining and Measuring Success in Online Citizen Science: A Case Study of Zooniverse Projects. Computing in Science & Engineering. 2015 July;17(4):28–41.
[13] Riesch H, Potter C. Citizen science as seen by scientists: Methodological, epistemological and ethical dimensions. Public Understanding of Science. 2014;23(1):107–120. PMID: 23982281. Available from: http://dx.doi.org/10.1177/0963662513497324 .
[14] Simpson R, Page KR, De Roure D. Zooniverse: Observing the World’s Largest Citizen Science Platform. In: Proceedings of the 23rd International Conference on World Wide Web. WWW ’14 Companion. New York, NY, USA: ACM; 2014. p. 1049–1054. Available from: http://doi.acm.org/10.1145/2567948.2579215 .
[15] ATLAS Collaboration. The ATLAS Simulation Infrastructure. Eur Phys J C. 2010;70:823.
[16] ATLAS Collaboration. Search for massive, long-lived particles using multitrack displaced vertices or displaced lepton pairs in pp collisions at √s = 8 TeV with the ATLAS detector. Phys Rev D. 2015;92(7):072004.
[17] Kalderon CW. Help the Higgs find its siblings; 2016. CERN news article, 4th July 2016.
[18] HiggsHunters collaboration; 2014. https://talk.higgshunters.org .
[19] HiggsHunters collaboration; 2014. https://blog.higgshunters.org .
[20] Barr AJ, Kalderon CW, Haas AC. ‘That looks weird’ - evaluating citizen scientists’ ability to detect unusual features in ATLAS images of LHC collisions. 2016;arXiv:1610.02214.
[21] Institute for Research in Schools; 2016.