Statistics of Solar Wind Electron Breakpoint Energies Using Machine Learning Techniques
Mayur R. Bakrania, I. Jonathan Rae, Andrew P. Walsh, Daniel Verscharen, Andy W. Smith, Téo Bloch, Clare E. J. Watt
Astronomy & Astrophysics manuscript no. Paper
© ESO 2020
July 8, 2020
M. R. Bakrania, I. J. Rae, A. P. Walsh, D. Verscharen, A. W. Smith, T. Bloch and C. E. J. Watt
Department of Space and Climate Physics, Mullard Space Science Laboratory, University College London, Dorking, RH5 6NT, UK; e-mail: [email protected]
European Space Astronomy Centre, Urb. Villafranca del Castillo, E-28692 Villanueva de la Cañada, Madrid, Spain
Space Science Center, University of New Hampshire, Durham, NH 03824, USA
Department of Meteorology, University of Reading, Reading, RG6 6AE, UK
ABSTRACT
Solar wind electron velocity distributions at 1 au consist of a thermal ‘core’ population and two suprathermal populations: ‘halo’ and ‘strahl’. The core and halo are quasi-isotropic, whereas the strahl typically travels radially outwards along the parallel or anti-parallel direction with respect to the interplanetary magnetic field. Using Cluster-PEACE data, we analyse energy and pitch angle distributions and use machine learning techniques to provide robust classifications of these solar wind populations. Initially, we used unsupervised algorithms to classify halo and strahl differential energy flux distributions to allow us to calculate relative number densities, which are of the same order as previous results. Subsequently, we applied unsupervised algorithms to phase space density distributions over ten years to study the variation of halo and strahl breakpoint energies with solar wind parameters. In our statistical study, we find that both halo and strahl suprathermal breakpoint energies display a significant increase with core temperature, with the halo exhibiting a more positive correlation than the strahl. We conclude that low energy strahl electrons are scattering into the core at perpendicular pitch angles. This increases the number of Coulomb collisions and extends the perpendicular core population to higher energies, resulting in a larger difference between halo and strahl breakpoint energies at higher core temperatures. Statistically, the locations of both suprathermal breakpoint energies decrease with increasing solar wind speed. In the case of halo breakpoint energy, we observe two distinct profiles above and below 500 km/s. We relate this to the difference in origin of fast and slow solar wind.
Key words. plasmas – methods: statistical – Sun: solar wind
1. Introduction
Solar wind electron velocity distributions at 1 au consist of three main populations: the thermal (< 50 eV) population, termed the core, and two suprathermal (∼ … ∼ … K (Balogh & Smith 2001) and exhibits a nearly Maxwellian velocity distribution. At 1 au, the core contains ∼ … ∼ 90% in fast wind (Štverák et al. 2009). The halo, on the other hand, exhibits a $\kappa$-distribution and forms tails in the total electron velocity distribution. The $\kappa$-distribution has a similar shape to the Maxwellian distribution at low thermal velocities. At speeds greater than the thermal speed, the $\kappa$-distribution decreases as a power law. The $\kappa$-distribution of the halo has a greater temperature than the Maxwellian distribution of the core (Feldman et al. 1975). The core and halo are quasi-isotropic populations, whereas the strahl travels along the interplanetary magnetic field (IMF) and can be observed in either the parallel or anti-parallel magnetic field direction (Feldman et al. 1978), or in both directions (Gosling et al. 1987; Owens et al. 2017), depending on the IMF topology. There are also times in which a strahl population is not detectable (Anderson et al. 2012), particularly in slow solar wind (Gurgiolo & Goldstein 2017).
The thermal core is thought to form in the corona, as a result of Coulomb collisions and wave-particle interactions (Pierrard et al. 2001; Vocks et al. 2008). Likewise, suprathermal solar wind electrons originate from the solar corona (Viñas et al. 2000; Che & Goldstein 2014) and then evolve into the strahl and halo populations as they travel away from the Sun. The majority of the halo population is formed by the scattering of strahl electrons via Coulomb collisions (Horaites et al. 2018) and wave-particle interactions (Gary et al. 1994; Landi et al. 2012; Vasko et al. 2019; Tong et al. 2019; Verscharen et al. 2019) as the strahl travels outwards in the solar wind (Saito & Gary 2007; Pagel et al. 2007). The strong field-aligned nature of the strahl occurs due to adiabatic focusing effects (Owens & Forsyth 2013), which are particularly prevalent at smaller distances from the Sun due to larger gradients in magnetic field strength.
Adiabatic focusing describes the change in pitch angle experienced by an electron that slowly travels into a region with a stronger or weaker magnetic field. Ignoring any scattering effects, an electron’s pitch angle evolution with heliocentric distance then depends on the conservation of its magnetic moment. This conservation law results in a decrease in pitch angle with increasing heliocentric distance (Parker 1963; Owens et al. 2008).
At 1 au, suprathermal electrons do not undergo any significant Coulomb collisions (Vocks et al. 2005). This suggests that adiabatic focusing is the dominant mechanism experienced by these electrons. Under this assumption, the strahl narrows with heliocentric distance into a collimated beam of width < …° (Anderson et al. 2012). However, the strahl has been observed to broaden to pitch angles of greater than 20° at 1 au (Hammond et al. 1996; Anderson et al. 2012; Graham et al. 2017), suggesting the presence of additional scattering processes (Berčič et al. 2019). This increase in strahl width with radial distance is not constant, as observations at both 5.5 au and 10 au show that the rate of solar wind electron pitch angle scattering decreases with radial distance (Walsh et al. 2013; Graham et al. 2017).
The strahl and halo relative number density ratios vary with radial distance. We use $n_s$, $n_h$, $n_c$ and $n_e$ to define the strahl, halo, core and total electron number densities respectively. The ratio $(n_s + n_h)/n_e$ stays approximately constant with heliocentric distance in both fast and slow wind, according to Štverák et al. (2009), who obtain physical parameters by fitting to electron velocity distributions. Strahl broadening results in a decrease of $n_s/n_e$ with increasing heliocentric distance.
Concurrently, $n_h/n_e$ increases with heliocentric distance (Štverák et al. 2009), further indicating a link between the strahl and halo, and that the relevant scattering mechanisms cause the strahl to broaden and eventually scatter into the halo.
Multiple studies (e.g. Feldman et al. 1975; Scudder & Olbert 1979; Pilipp et al. 1987b; McComas et al. 1992; Štverák et al. 2009) identify the energy above which non-thermal parts of the distribution deviate from the Maxwellian core. We define this energy as the ‘breakpoint energy’, $E_{\mathrm{bp}}$. Particles above a certain energy experience minimal collisions, creating the non-thermal tails in the electron velocity distribution function and forming the halo and strahl. This breakpoint energy is thought to be determined primarily by Coulomb collisions (Scudder & Olbert 1979). Based on the properties of Coulomb collisions and the inhomogeneity of the solar wind, and assuming minimal wave-particle interactions in the heliosphere, this breakpoint energy theoretically relates to core temperature, $T_c$, and heliocentric radial distance, $r$, as (Scudder & Olbert 1979):
$$E_{\mathrm{bp}}(r) = 7\,k_B T_c(r). \tag{1}$$
At 1 au, the average breakpoint energy is ∼60 eV (Feldman et al. 1975); however, its value varies with the local core temperature and solar wind speed (Štverák et al. 2009). The breakpoint energies between core and halo and between core and strahl are often different. Using electron velocity distribution functions, Štverák et al. (2009) show that the ratio between halo breakpoint energy and core temperature is larger than the ratio between strahl breakpoint energy and core temperature, across a range of heliocentric distances. At 1 au, Štverák et al. (2009) observe $E_{\mathrm{bp}}/k_B T_c \approx$ … $E_{\mathrm{bp}}/k_B T_c \approx$ … > … $\propto r^{-\ldots}$, and ranges between 47 eV and 60 eV at 1 au, and that $E_{\mathrm{bp}}/k_B T_c \approx$ … differences in the applied methods for the determination of the cut-off between core and suprathermal distribution functions, this difference is not significant.
Models which assume an absence of exchange between parallel and perpendicular pressure predict a core temperature anisotropy in the slow solar wind of $T_{c\parallel}/T_{c\perp} \approx 30$, where $T_{c\parallel}$ and $T_{c\perp}$ are the temperatures of the core components in the directions parallel and perpendicular to the magnetic field respectively (Phillips & Gosling 1990). Observations at 1 au, however, find a temperature anisotropy, $T_{c\parallel}/T_{c\perp} \approx$ … $T_{c\perp}/T_{c\parallel} =$ … ± … cut-off between the thermal and non-thermal parts of a distribution, using only a statistical analysis of the data, provides useful limiting parameters for future studies which require multi-component fits to the total electron velocity distribution (Berčič et al. 2019).
Machine learning provides us with a robust method of classification from which fine variations of electron populations in relation to energy and pitch angle can be derived, with the advantage of not requiring prior assumptions about the distributions of these populations. Applying machine learning techniques to a large dataset builds upon previous empirical studies of the suprathermal breakpoint energy. By classifying individual electron distributions, we characterise solar wind electron populations at a higher energy resolution than previous studies. As a result, our method enables breakpoint energy to be explored further with respect to other solar wind parameters, and by doing so we draw physical conclusions based on the relationship between this fundamental property and each parameter, for both the halo and the strahl. Machine learning techniques will become increasingly important with the anticipated volume of high cadence electron data from, for example, the Solar Orbiter mission (Müller et al. 2013).
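As a rough illustration of the adiabatic focusing described earlier in this section: conservation of the magnetic moment implies that $\sin^2\alpha / B$ stays constant along an electron’s path, where $\alpha$ is the pitch angle. The sketch below assumes a purely radial field with $B \propto r^{-2}$ and no scattering, which is an idealisation and not a model used in this paper; the function name and the starting distance are hypothetical.

```python
import math

def focused_pitch_angle_deg(alpha0_deg, r0_au, r_au):
    """Pitch angle at distance r_au for an electron that starts at r0_au
    with pitch angle alpha0_deg, assuming sin^2(alpha)/B is conserved
    and B falls off as r^-2 (idealised radial field, no scattering)."""
    # sin(alpha) scales with sqrt(B), i.e. with r0/r for B ~ r^-2.
    sin_alpha = math.sin(math.radians(alpha0_deg)) * (r0_au / r_au)
    return math.degrees(math.asin(sin_alpha))

# An electron starting perpendicular to the field at 0.1 au (hypothetical
# starting point) is strongly collimated by the time it reaches 1 au.
alpha_at_1au = focused_pitch_angle_deg(90.0, 0.1, 1.0)
```

Under these assumptions the pitch angle can only decrease outwards, so any strahl width well above this limit at 1 au points to additional scattering, as argued above.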
2. Method
In this section, we describe the steps we take in order to classify solar wind electrons with machine learning techniques, followed by a description of the validation of our method. Firstly, we (1) determine which spacecraft and instruments are best suited for this study, and locate data from different solar wind regimes for testing. Secondly, we (2) identify possible machine learning models to be used to distinguish between electron populations. We then (3) verify the use of these models to find the ‘breakpoint energy’ between suprathermal and core electrons. Following on, we (4) apply these machine learning algorithms to separate halo and strahl electrons based on their energy and pitch angle distributions. Lastly, we (5) calculate relative number densities of each population for different solar wind speeds and compare to previous studies (Štverák et al. 2009). This allows us to determine the effectiveness of our machine learning models.
Steps 3 and 4 are particularly important for our statistical study. We use the method in step 3 to calculate the breakpoint energy in each pitch angle bin and then step 4 to predict whether the strahl or halo is dominant at that pitch angle.
We used data (Laakso et al. 2010) from the PEACE (Plasma Electron And Current Experiment, Johnstone et al. 1997; Fazakerley et al. 2010) instrument onboard the Cluster mission’s C2 spacecraft (Escoubet et al. 2001). Cluster consists of four spacecraft, in tetrahedral formation, each spinning with a period of approximately 4 s. The PEACE data are recorded with a 4 s time resolution and are based on two instantaneous measurements of the pitch angle distribution per spin. The dataset is a two-dimensional product containing twelve 15°-wide pitch angle bins and 44 energy bins, spaced linearly between 0.6 eV and 9.5 eV and logarithmically at higher energies. PEACE works by simultaneously recording elevation bins at two specific azimuth angles separated by 180°. We initially corrected the PEACE data for spacecraft potential by using measurements from the Cluster-EFW instrument (Gustafsson et al. 2001) and corrections according to the results of Cully et al. (2007). We discarded data from energy bins below the calculated spacecraft potential.
We used the solar wind speed measurements from the Cluster-CIS instrument onboard the C4 spacecraft (Rème et al. 2001), while the position and magnetic field measurements are taken from the Cluster-FGM instrument (Balogh et al. 1997). Using the CIS measurements, we initially separated our input electron pitch angle distribution data into three (fast, medium and slow) solar wind regimes to test our machine learning models. Each regime covers roughly 1-2 hours of data, and the regimes have average solar wind velocities of 686 km/s, 442 km/s and 308 km/s. The time periods we identify with these fast, medium and slow wind regimes are 08:51-10:19 (02/…/…) … > … effectively train and test our machine learning models.
We predominantly used unsupervised learning algorithms to determine breakpoint energies, as well as to separate halo and strahl. Unsupervised learning algorithms do not require ‘training’, so they are more time efficient than supervised learning algorithms.
Our choice of algorithm is the K-means clustering method (Arthur 2007) from the scikit-learn library (Pedregosa et al. 2011). Unsupervised learning algorithms have the advantage of not needing the user to assign labels to training data, which reduces bias and allows large surveys to be carried out more efficiently. In the K-means algorithm, the number of clusters, $K$, is manually set to 2 to reflect the number of populations we aim to distinguish between: a core cluster and a suprathermal cluster. To calculate the breakpoint energy at a specific pitch angle, our algorithm sorts between energy distributions, at that pitch angle, and separates the distributions into two groups on either side of the determined breakpoint energy. We define $x_i$ as the vector representation of the phase space density (PSD) tuples, where the index $i$ labels tuples of three subsequent energy bins (i.e. energy distributions spanning three energy bins). We define $\mu_j$ as the vector representation of two random PSD tuples, where the index $j$ labels each cluster. The algorithm sorts these energy distributions into clusters by minimising the function:
$$\sum_{i=1}^{n} \sum_{j=1}^{K=2} \omega_{ij} \left\lVert x_i - \mu_j \right\rVert^2, \tag{2}$$
where
$$\mu_j = \frac{\sum_{i=1}^{n} \omega_{ij}\, x_i}{\sum_{i=1}^{n} \omega_{ij}}, \tag{3}$$
$$\omega_{ij} = \begin{cases} 1 & \text{if } x_i \text{ belongs to cluster } j \\ 0 & \text{otherwise,} \end{cases} \tag{4}$$
and $n$ is the number of 3-tuples at a fixed pitch angle. As each 3-tuple overlaps with its neighbouring 3-tuples, $n = N_e - 2$, where $N_e$ is the number of energy bins at each pitch angle. By minimising the function in Eq. (2), our algorithm calculates the breakpoint energy by: (1) randomly selecting two PSD vectors in the dataset to become the central points of each cluster, $\mu_j$, known as centroids, (2) assigning all remaining PSD vectors, $x_i$, to the closest centroid, based on the least-square error between each vector and the centroids, (3) computing new centroids, $\mu_j$, by calculating the average vector representation of the PSD vectors assigned to the previous centroid, (4) reassigning each PSD vector, $x_i$, to the new nearest centroid, $\mu_j$, and (5) iterating steps 3 and 4 until no more reassignments occur.
Once the two clusters have been finalised, the breakpoint energy at the relevant pitch angle is determined to be the midpoint between the uppermost energy bin in the cluster of 3-tuples associated with lower energies (which represents the core), and the lowest energy bin in the cluster of 3-tuples associated with higher energies (which represents suprathermal electrons). As the PSD decreases with increasing energy in the relevant energy range, we are able to locate a clear boundary between the two clusters. To separate strahl and halo electrons, we use energy distributions in conjunction with pitch angle distributions, as discussed below in Section 2.4. The process of applying our K-means algorithm to pitch angle distributions is analogous to the method described above, with $x_i$ now representing a pitch angle distribution at a certain energy; in this case, however, we find the ‘break’ in pitch angle instead. A detailed account of how the K-means algorithm works is provided by Arthur (2007).
We validate our clustering method by comparing test cases to an accurate supervised learning algorithm, trained on a subset of manually labelled (as halo or strahl) pitch angle and energy distributions.
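The clustering steps (1) to (5) and the midpoint rule described above can be sketched in a few lines. This is a minimal hand-written two-cluster K-means in pure Python, not the scikit-learn implementation used in the study. Two choices here are assumptions made for illustration: the centroids are seeded with the first and last 3-tuple instead of two random ones (so the result is reproducible), and the caller is expected to pass log-scaled PSD values. All function names are hypothetical.

```python
def make_tuples(values, width=3):
    """Overlapping tuples of `width` consecutive energy bins (n = N_e - 2)."""
    return [tuple(values[i:i + width]) for i in range(len(values) - width + 1)]

def sq_dist(a, b):
    """Squared Euclidean distance, the term minimised in Eq. (2)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def two_means(points, max_iter=100):
    """Lloyd iterations for K = 2; returns a 0/1 label per point."""
    centroids = [points[0], points[-1]]  # deterministic seeding (assumption)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            0 if sq_dist(p, centroids[0]) <= sq_dist(p, centroids[1]) else 1
            for p in points
        ]
        if new_labels == labels:  # no reassignments: converged
            break
        labels = new_labels
        for j in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def breakpoint_energy(energies, values):
    """Midpoint between the uppermost bin of the low-energy (core) cluster
    and the lowest bin of the high-energy (suprathermal) cluster."""
    labels = two_means(make_tuples(values))
    last_core_tuple = max(i for i, lab in enumerate(labels) if lab == 0)
    first_supra_tuple = min(i for i, lab in enumerate(labels) if lab == 1)
    upper_core_bin = last_core_tuple + 2   # a 3-tuple spans bins i..i+2
    lower_supra_bin = first_supra_tuple
    return 0.5 * (energies[upper_core_bin] + energies[lower_supra_bin])

# Idealised step profile: five "core" bins followed by five "suprathermal" bins.
energies = [10.0 * (i + 1) for i in range(10)]
log_psd = [10.0] * 5 + [0.0] * 5
print(breakpoint_energy(energies, log_psd))  # 55.0
```

For a PSD that decreases monotonically with energy, seeding from the two ends of the spectrum keeps each cluster contiguous in energy, which is what makes the midpoint rule well defined.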
Once trained, the supervised learning algorithm predicts which class (halo or strahl) a new pitch angle or energy distribution belongs to. We compare supervised learning algorithms by calculating their ROC (receiver operating characteristic) scores (e.g. Flach & Kull 2015). The ROC score compares a binary classification model’s sensitivity (true positive rate) and specificity (1 - false positive rate) performance. We find the K-Nearest Neighbours (KNN) (e.g. Peterson 2009) algorithm performs best, achieving ROC scores > 90% in all tests. This model classifies data by finding the ‘majority vote’ of the nearest (labelled) neighbours to each unclassified data-point.
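The ‘majority vote’ rule can be illustrated with a small pure-Python sketch. The toy pitch angle distributions, the choice of k = 3 and the Euclidean distance are assumptions made for this example only; the study itself uses a library implementation, and its settings are not stated here.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest labelled neighbours."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    # Indices of the training samples, ordered by distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy normalised PADs over four pitch angle bins (hypothetical values):
# flat distributions are labelled halo, field-aligned peaks are labelled strahl.
train_x = [
    (1.0, 1.0, 1.0, 1.0), (1.1, 0.9, 1.0, 1.0), (0.9, 1.0, 1.1, 1.0),
    (3.0, 1.0, 1.0, 1.0), (2.5, 1.2, 1.0, 0.9), (1.0, 1.0, 1.1, 2.8),
]
train_y = ["halo", "halo", "halo", "strahl", "strahl", "strahl"]

print(knn_predict(train_x, train_y, (2.8, 1.1, 1.0, 1.0)))   # strahl
print(knn_predict(train_x, train_y, (1.0, 1.0, 1.0, 1.05)))  # halo
```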
We demonstrate the use of unsupervised clustering to calculate the breakpoint energy. Figure 1, which shows a cut of the differential energy flux distribution at constant pitch angle, visualises this breakpoint energy. Figure 1 contains three regions with different distribution functions. At energies below the spacecraft potential at ∼10 eV, photo-electrons dominate (blue dots). At slightly higher energies, between 10 eV and ∼45 eV, the distribution represents core electrons. At larger energies we observe the halo population. We fit a Maxwellian (red) and a $\kappa$-distribution (yellow) (Štverák et al. 2009) to the core and halo respectively, to determine the energy at which the distributions intersect, that is, the ‘breakpoint energy’.
Fig. 1: Differential energy flux as a function of energy at 90°, averaged across times 08:51-10:19 (02/…/…) …
… ± … ± … ° and 180°, where the strahl carries the highest value of the flux density in the suprathermal energy regime. These intersections show a separation between the core and suprathermal strahl population at 42 ± … ° pitch angle, the algorithm estimates the average breakpoint energy to be 45 ± 3 eV. The accuracy score between the algorithm’s classifications and a fit to the averaged distribution is 92.9%. As we predict binary classifications, we consider metric scores close to 90% as ‘good’ scores when testing our models, based on what previous studies achieve (e.g. Qian et al. 2015; Zhang et al. 2017).
Figure 2 illustrates a typical differential energy flux distribution as a function of pitch angle and energy for one particular time (08:57:28-08:57:32 on 02/…/…) … ° and/or 180°, which represents the halo population at all pitch angles overlaid with field-aligned strahl.
From our breakpoint energy analysis, we limit our input data to energies above 44 eV and convert these suprathermal data to pitch angle distributions (PADs) across our energy range, e.g. as shown in Figure 3a. We use an arbitrary 10-minute subset of time intervals, equivalent to 1800 samples, as training data. We assign each PAD a label, depending on whether strahl is or is not present. Subsequently, the entire set of PADs during our chosen wind speed regime is classified, based on a trained KNN model. We find a strong agreement between this supervised method and using K-means to cluster the fast wind set of PADs into two groups (halo and strahl), with a calculated ROC score of 90.3%.
Classifying PADs informs us of whether a strahl is present at a certain energy; however, we require classification of the energy distributions at each pitch angle to extract the width of the strahl. The white line (b) in Figure 2 represents the slice from which we obtain the example energy distribution in Figure 3b. We now use a 10-minute interval of energy distributions, at each pitch angle, for our training data and provide labels depending on whether strahl is present or not at that pitch angle. We find a strong similarity between the supervised and unsupervised methods, when classifying the entire set of flux-energy distributions, with a ROC score of 98.3%. This comparison therefore validates the use of the unsupervised method for any larger statistical survey.
For each time step, we combine the classifications of suprathermal PADs and suprathermal energy distributions to create a grid detailing whether the measured flux in each energy and pitch angle bin is dominated by halo electrons or by strahl electrons.
A bin is identified as containing strahl if both the PAD and energy distribution it resides in are classed as strahl by the K-means algorithm. We show the results of our strahl and halo classification in fast wind in Figure 4. Each point represents a single measurement at a given pitch angle and energy, with the colour depicting the class (halo or strahl). The higher fluxes near 0° and 180° are associated with strahl (blue points). On occasion, broader strahl is detected, as illustrated by the presence of blue points at higher fluxes near 75°. The existence of red points across all pitch angles at lower fluxes confirms the presence of the halo as an isotropic population.
We show the results of our strahl and halo classification in slow wind in Figure 5. We see that the number of blue points, associated with the strahl, is much smaller in the slow wind than in the fast wind (see Figure 4). This finding is consistent with the observed lower occurrence of strahl during times of slow solar wind (e.g. Gurgiolo & Goldstein 2017). Both Figures 4 and 5 confirm that only halo electrons exist at pitch angles around 90°. We see for both fast and slow wind cases that the strahl exhibits higher differential energy fluxes than the halo. The scattering of strahl electrons into the halo results in a larger spread of electrons across all pitch angles, decreasing the peak flux at any one pitch angle.
Fig. 2: Two-dimensional colour plot of the measured electron differential energy flux, across a 4 second window (08:57:28-08:57:32 on 02/…/…) … ∼44 eV to ∼540 eV. The vertical and horizontal white dashed lines represent where cuts are made to obtain: a) the pitch angle distribution at 110.09 eV, and b) the energy distribution at 127.5°.
Fig. 3: a) Pitch angle distribution at an energy of 110.09 eV and b) energy distribution at a pitch angle of 127.5°, as projected from the vertical and horizontal white lines in Figure 2.
After classifying the dataset into core, halo and strahl regions, we calculate the differential energy flux attributed to each population. In order to account for halo electrons in strahl pitch angle and energy bins, we subtract the halo flux, averaged over all pitch angles at a fixed energy, from strahl fluxes at that energy and assign it to the total halo flux. Differential energy flux relates to the partial number density (cm$^{-3}$) of each electron population according to Eq. (5) (Wüest et al. 2007):
$$\Delta n \approx \ldots \times 10^{-\ldots}\, E^{-\ldots}\, \Delta E\, \Delta\Omega\, J \quad (\mathrm{cm^{-3}}), \tag{5}$$
where $E$ is the average energy within interval $\Delta E$ (both measured in keV/Q) and $J$ is the average differential energy flux (keV/(cm$^2$ s sr keV)) at energy $E$. $\Delta\Omega$ is the solid angle ($\leq 4\pi$) over which $J$ is measured and relates to the pitch angle widths.
In Figure 6, we show the conversion of differential energy flux to number density. In slow wind: the ratio $n_s/n_h$ = … $(n_s + n_h)/n_c$ = …, where $n_s$, $n_h$ and $n_c$ represent the strahl, halo and core number densities. In intermediate wind: $n_s/n_h$ = … $(n_s + n_h)/n_c$ = … $n_s/n_h$ = … $(n_s + n_h)/n_c$ = … $(n_s + n_h)/n_c$ = … ∼ … differentiating between solar wind electron populations to a similar degree as previous results, with a very different method.
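The flux-to-density conversion of Eq. (5) can be sketched from first principles. Differential energy flux divided by energy gives a differential number flux, and dividing that by the particle speed gives a density contribution. The code below assumes non-relativistic electrons and does not reproduce the numerical coefficient quoted from Wüest et al. (2007); the function names are hypothetical.

```python
import math

ELECTRON_REST_ENERGY_KEV = 510.99895   # m_e c^2
SPEED_OF_LIGHT_CM_S = 2.99792458e10

def electron_speed_cm_s(energy_kev):
    """Non-relativistic electron speed, v = c * sqrt(2E / (m_e c^2))."""
    return SPEED_OF_LIGHT_CM_S * math.sqrt(2.0 * energy_kev / ELECTRON_REST_ENERGY_KEV)

def partial_density_cm3(energy_kev, delta_e_kev, delta_omega_sr, diff_energy_flux):
    """Partial number density (cm^-3) from a differential energy flux J
    given in keV / (cm^2 s sr keV), over energy width dE and solid angle dOmega."""
    number_flux = diff_energy_flux / energy_kev  # particles / (cm^2 s sr keV)
    return number_flux * delta_e_kev * delta_omega_sr / electron_speed_cm_s(energy_kev)

# Unit inputs expose the overall prefactor of this sketch (about 5.3e-10).
prefactor = partial_density_cm3(1.0, 1.0, 1.0, 1.0)
```

Because the speed scales as $E^{1/2}$, the density contribution in this sketch scales as $E^{-3/2}$ at fixed $J$, $\Delta E$ and $\Delta\Omega$.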
3. Statistical Study
We then used ten years of pristine solar wind data, from 2001 to 2010, to quantify the relationship between strahl and halo breakpoint energies and other solar wind parameters, notably solar wind speed and core temperature. By quantifying the halo and strahl breakpoint energies separately, we determine if each suprathermal population is governed to the same extent by ambient conditions, or if they scale with each bulk parameter differently. For this study, we use Cluster-PEACE data in units of phase space density and split the data into four-minute intervals. The average solar wind speed during each interval is recorded using CIS measurements.
To confirm that Cluster is in the pristine solar wind, we used Cluster-FGM measurements and a model of the Earth’s bow shock position (Chao et al. 2002). We use this model to identify when the spacecraft is outside the bow shock and not magnetically connected to it. We ensure Cluster is magnetically disconnected from the Earth’s bow shock by discarding times when the magnetic field vector at Cluster intersects with the bow shock surface at any point.
Fig. 4: 3D scatter plot of the differential energy flux as a function of pitch angle and energy, for the fast solar wind dataset. The colours define whether the K-means clustering algorithm labels each bin as either containing strahl and halo flux (blue) or only halo flux (red).
Fig. 5: 3D scatter plot of the differential energy flux as a function of pitch angle and energy, for the slow solar wind dataset.
Fig. 6: $n_s/n_h$ and $(n_s + n_h)/n_c$ ratios for slow, medium and fast solar wind.
We calculate the halo breakpoint energy, during each four-minute interval, by applying K-means clustering to phase space density values at 90° pitch angles, over a range of energies from 19 eV to 240 eV. Calculating the strahl/core breakpoint energy entails applying these K-means models to pitch angles and intervals which contain strahl. We achieve this by classifying flux-energy distributions during each interval, using the method in Section 2.4, to determine if strahl is present at 0° or 180°.
We fit a Maxwellian velocity distribution function (Štverák et al. 2008b) to core velocities below each strahl or halo breakpoint energy, to determine the core temperature at that particular pitch angle. This function takes the form:
$$f_c = n_c \left(\frac{m}{2\pi k}\right)^{3/2} \frac{1}{T_{c\perp}\sqrt{T_{c\parallel}}} \exp\left[-\frac{m}{2k}\left(\frac{v_\perp^2}{T_{c\perp}} + \frac{v_\parallel^2}{T_{c\parallel}}\right)\right], \tag{6}$$
where $n_c$ is the core density, $m$ is the electron mass, $k$ is Boltzmann’s constant, $T_{c\perp}$ and $T_{c\parallel}$ are the core perpendicular and parallel temperatures and $v_\perp$ and $v_\parallel$ are the perpendicular and parallel velocities.
Figure 7 shows the halo breakpoint energy vs. core temperature distribution in a ‘violin plot’ to visualise the distribution of data points after binning the data into widths of 50 km/s. A violin plot is similar to a box plot, with the addition that the horizontal extent of each violin element represents a density plot of the data at different values. The red regions in Figure 7 visualise these density plots.
Fig. 7: ‘Violin plot’ of halo breakpoint energy against core temperature. The blue line shows the line of best fit.
The white dots indicate the median of breakpoint energies and the thick black lines show the inter-quartile ranges (IQR). We plot the thin black lines to display which breakpoint energies are outliers. They span from Q3 + 1.5 × IQR to Q1 - 1.5 × IQR, where Q3 and Q1 are the upper and lower quartiles, respectively. The horizontal width of the red regions represents the density of data points at that given breakpoint energy.
The widths of the red regions show that data are clustered about certain energies across all wind speeds. These regions of higher density in fact point to the energy channels (30.1 eV, 37.7 eV, 47.9 eV, 56.7 eV and 70.5 eV) within the C2-PEACE instrument’s dataset. Figure 7 shows a clear positive correlation between halo breakpoint energy and core temperature, $k_B T_c$, with a gradient of 5.74 ± … < … = … ∼63% of variation in halo breakpoint energy can be described by this correlation. Very small inter-quartile ranges are observed in the 1-2 eV and 5-6 eV bins, while large inter-quartile ranges are observed in bins 4-5 eV and 6-7 eV. The results for the strahl breakpoint energy vs. core temperature are shown in Figure 8.
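The whisker convention used in the violin plots (Q1 - 1.5 × IQR to Q3 + 1.5 × IQR) can be written out as a short sketch. The quartile convention is an assumption: Python’s `statistics.quantiles` is used with its default ‘exclusive’ method, which may differ slightly from the plotting library behind the figures. The function name and sample values are hypothetical.

```python
import statistics

def whisker_limits(values):
    """Return (lower, upper) outlier limits: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical breakpoint energies (eV) in one bin; 120 eV falls outside the limits.
sample = [30.1, 37.7, 37.7, 47.9, 47.9, 47.9, 56.7, 56.7, 70.5, 120.0]
low, high = whisker_limits(sample)
outliers = [v for v in sample if v < low or v > high]
```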
Fig. 8: ‘Violin plot’ of strahl breakpoint energy against core temperature. The orange line shows the line of best fit. The remaining features are the same as in Figure 7.

In both Figures 7 and 8, there is a small discrepancy between the line of best fit and the median at core temperatures between 2 eV and 8 eV. When T c < c > ff ers. This is evidenced by the strahl breakpoint energy relation exhibiting a smaller gradient (5.5 ± c than the halo’s relation. A p-value of < = ff ering solar wind source regions. The gradient in Figure 9 is -5.9 ± / s. The R-squared value of 0.487 is lower than 0.626 in Figure 7, indicating that halo breakpoint energy exhibits a stronger correlation with core temperature than with solar wind speed. A statistical P-test produces a p-value of < =

Fig. 9: ‘Violin plot’ of halo breakpoint energy against solar wind speed. The blue line shows the line of best fit. The remaining features are the same as in Figure 7.

The distribution of breakpoint energies with wind speed in Figure 9 displays a step function at about 500 km/s. The lower quartile within the 450-500 km/s bin lies above the upper quartiles in faster speed bins. Applying separate linear fits to solar wind speeds below and above 500 km/s produces gradients of -4.2 ± / s and -3.5 ± / s respectively. The associated R-squared values are 0.588 and 0.651 respectively, both larger than the value of 0.487 for a single linear fit, indicating that two separate correlations describe the distribution in Figure 9 better than a single correlation. The two correlations are also significant at the p =

of data-points, the variance about the median values is relatively small, with the exception of a few outliers. The medians themselves do not deviate significantly from the line of best fit across all wind speeds, with the largest median residual equalling 5 eV in the < 300 km/s bin. There is some evidence for positive or negative skewness at certain solar wind velocities, such as in the < 300 km/s and 400-450 km/s bins, as can be seen when the median appears to lie on one of the edges of the inter-quartile range.

Figure 10 shows the strahl breakpoint energy variation with solar wind speed. According to our linear fit, the gradient of strahl breakpoint energy with solar wind speed is -5.7 ± / s. Solar wind speed has a weaker correlation with strahl breakpoint energy than with halo breakpoint energy, based on the steepness of each gradient and the R-squared values. The R-squared value of 0.460 in Figure 10 also indicates that the strahl breakpoint energy has a weaker statistical correlation with solar wind speed than with core temperature, as the line of best fit describes less of the variation. This is also the case for the halo breakpoint energy. A p-value of < =

Fig. 10: ‘Violin plot’ of strahl breakpoint energy against solar wind speed. The orange line shows the line of best fit. The remaining features are in the same format as Figure 7.

Similar to Figure 9, the variation in breakpoint energy in the strahl violin plot is larger at smaller wind speeds. However, unlike for the halo, the 400-450 km/s bin has a much larger variance than the < 300 km/s bin, as evidenced by their inter-quartile ranges. This larger spread of data at medium wind speeds explains why the strahl’s R-squared value is lower than the halo’s. The lack of skewness in Figure 10 shows that the data are distributed more symmetrically in the strahl’s case than in the halo’s. The sum of the median residuals is also smaller for the strahl, with the largest median residual at 3.5 eV in the < 300 km/s solar wind speed bin. A step function is less apparent in Figure 10; however, there is a clear distinction between the median breakpoint energy relation with wind speed in slow winds ( < / s), compared to fast winds. Table 1 contains the gradients and R-squared values of the correlations in Figures 7, 8, 9, and 10.
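The comparison made above between a single linear fit and separate fits below and above 500 km/s can be illustrated with a minimal sketch. It uses ordinary least squares on synthetic, made-up data (the step size, scatter, and sample count are assumptions chosen only for illustration), and the helper name `linear_fit` is hypothetical. It is not the fitting code used in this study, and it does not reproduce the gradients or R-squared values quoted in the text.

```python
import random


def linear_fit(x, y):
    """Ordinary least-squares fit y = a*x + b. Returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    # R-squared: fraction of the variance in y described by the fitted line.
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


# Synthetic (made-up) data: breakpoint energy in eV against wind speed in km/s,
# with a downward step at 500 km/s and Gaussian scatter.
rng = random.Random(1)
speed = [rng.uniform(280.0, 700.0) for _ in range(400)]
energy = [
    (75.0 if v < 500.0 else 65.0) - 0.04 * v + rng.gauss(0.0, 3.0)
    for v in speed
]

# Split the sample at the assumed 500 km/s boundary.
slow = [(v, e) for v, e in zip(speed, energy) if v < 500.0]
fast = [(v, e) for v, e in zip(speed, energy) if v >= 500.0]

single = linear_fit(speed, energy)
slow_fit = linear_fit(*zip(*slow))
fast_fit = linear_fit(*zip(*fast))

for name, (a, b, r2) in [
    ("all speeds", single),
    ("< 500 km/s", slow_fit),
    (">= 500 km/s", fast_fit),
]:
    print(f"{name:>12}: gradient = {a:.4f} eV/(km/s), R^2 = {r2:.3f}")
```

Whether the two sub-range fits give larger R-squared values than the single fit depends on the size of the step relative to the scatter, so the synthetic numbers printed here should not be read as a check on the values reported for Figure 9.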
4. Discussion
In this study, we use the K-means algorithm to successfully distinguish between the three populations and we train a supervised learning algorithm (K-nearest neighbours) to classify a subset of the pitch angle and energy distributions. There is a strong agreement between the two machine learning methods, allowing us to apply the K-means clustering method to a larger subset of solar wind electron data at different solar wind velocities. Machine learning algorithms provide us with an efficient method of classification from which small scale variations of electron populations in relation to energy and pitch angle can be derived. By classifying a single distribution at each time step, we build up a high resolution picture of suprathermal breakpoint energy and relative number density, including how they evolve with different parameters. The techniques we employ can be easily applied to any classification problem where sufficient data are available.

Table 1: Correlations between halo and strahl breakpoint energies with core temperature, T_c, and solar wind speed, V_sw, as represented by the gradients and R-squared, R², values.

              T_c                  V_sw
Population    E_bp/T_c    R²       E_bp/V_sw [eV/(km/s)]    R²
Halo          5.74        0.626    -0.059                   0.487
Strahl        5.5         0.51     -0.057                   0.460

Distinguishing between strahl, halo, and core electron populations allows us to calculate their relative number densities, in order to compare our method to previous results. Štverák et al. (2009) show that suprathermal electrons in the fast wind constitute ∼
10% of the total electron number density, while in slow wind they occupy 4% to 5% of the total electron density. In comparison, we obtain values of ∼ n_s/n_h and wind speed in Figure 6. Strahl in slow solar wind undergoes more scattering per unit distance than in faster wind (e.g. Fitzenreiter et al. 1998), leading to a higher value of n_h/n_e at 1 au. We observe a near absence of strahl in very slow solar wind at velocities of 308 km/s (see Figures 5 and 6), which is consistent with observations from previous studies (e.g. Fitzenreiter et al. 1998; Gurgiolo & Goldstein 2017; Graham et al. 2018). By analysing a number of periods of slow solar wind, Fitzenreiter et al. (1998) find that the strahl generally has a larger width in slow solar winds than fast, while Gurgiolo & Goldstein (2017) find that strahl is often not present at solar wind velocities ≲ / s. Graham et al. (2018) also note an absence of strahl during certain slow solar wind times. This absence of strahl remains unexplained. Possible hypotheses include: Coulomb pitch angle scattering which counteracts magnetic focussing effects during strahl formation (Horaites et al. 2018), intense scattering due to broadband whistler turbulence (Pierrard et al. 2001), and the lack of initial strahl formation during the production of slow solar wind (Gurgiolo & Goldstein 2017).
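The classification workflow described at the start of this section (unsupervised K-means clustering, cross-checked against a supervised K-nearest neighbours classifier) can be sketched on toy data. Everything below is an assumption made for illustration: the pitch angle binning, the synthetic flat (halo-like) and field-aligned (strahl-like) profiles, and the hypothetical helpers `make_profile`, `kmeans`, and `knn_predict`. It is a plain-Python stand-in, not the pipeline or the library calls used in this study.

```python
import math
import random

PITCH_ANGLES = [15, 45, 75, 105, 135, 165]  # bin centres in degrees (assumed binning)


def make_profile(rng, beamed):
    """Toy pitch angle profile: flat (halo-like) or field-aligned beam (strahl-like)."""
    profile = []
    for pa in PITCH_ANGLES:
        beam = math.exp(-0.5 * (pa / 40.0) ** 2) if beamed else 0.0
        base = 0.2 if beamed else 1.0
        profile.append(base + beam + rng.uniform(-0.05, 0.05))
    peak = max(profile)
    return [value / peak for value in profile]  # normalise to the peak value


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def knn_predict(train_x, train_y, x, n_neighbours=3):
    """Majority vote among the n_neighbours closest training profiles."""
    order = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i], x))
    votes = [train_y[i] for i in order[:n_neighbours]]
    return max(set(votes), key=votes.count)


rng = random.Random(0)
true_labels = [0] * 20 + [1] * 20  # 0 = halo-like, 1 = strahl-like (toy labels)
points = [make_profile(rng, beamed=bool(t)) for t in true_labels]

clusters = kmeans(points, k=2)
# Cluster indices are arbitrary; relabel so the cluster holding points[0] is 0.
kmeans_labels = [int(c != clusters[0]) for c in clusters]

# Train the supervised classifier on every other profile, test on the rest.
train_idx = list(range(0, 40, 2))
test_idx = list(range(1, 40, 2))
train_x = [points[i] for i in train_idx]
train_y = [true_labels[i] for i in train_idx]
knn_labels = [knn_predict(train_x, train_y, points[j]) for j in test_idx]

agreement = sum(
    knn_labels[n] == kmeans_labels[j] for n, j in enumerate(test_idx)
) / len(test_idx)
print(f"K-means vs KNN agreement on held-out profiles: {agreement:.2f}")
```

The toy populations are deliberately well separated, so the agreement between the two methods here says nothing about the agreement obtained on real PEACE distributions.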
Instead of finding the intersection between core and suprathermal fitting functions (e.g. Pilipp et al. 1987a; McComas et al. 1992; Štverák et al. 2009), a method which according to McComas et al. (1992) produces ‘somewhat arbitrary’ values, our method calculates the breakpoint energy based on the data recorded in each individual pitch angle and energy bin. Our method calculates breakpoint energy values of both sunward and anti-sunward strahl, occasionally obtaining two strahl breakpoint energy values at a single time if bi-directional strahl is present. An alternative method is presented by Štverák et al. (2009), who discard sunward strahl in their calculations of the strahl E_bp/k_B T_c ratio at each radial distance. By characterising both sunward and anti-sunward strahl, our method significantly improves the characterisation of all electron beams in the solar wind.

Our work on the core velocity distribution functions elucidates the relative correlation between core temperature, T_c, and both halo and strahl breakpoint energies. Using core temperature as a reference point enables us to predict to what extent strahl and halo characteristics scale to characteristics of the core. The core temperature has a strong correlation with both suprathermal breakpoint energies, with the halo breakpoint energy exhibiting a closer correlation than the strahl’s. Both halo and strahl breakpoint energies statistically have a stronger correlation with core temperature than with solar wind speed. The gradients between breakpoint energy and core temperature are calculated as 5.74 ± ± E_bp/k_B T_c = 7, which differs from our scaling factor of E_bp/k_B T_c = ± E_bp/k_B T_c = 7, Scudder & Olbert (1979) predict that a transformation of thermal electrons into the suprathermal population occurs as the solar wind flows out from the Sun. Findings by Štverák et al. (2009), on the other hand, show that the (n_h + n_s)/n_c ratio remains roughly constant with heliocentric distance in the slow wind, suggesting a lack of interchange between the thermal and suprathermal populations. However, Štverák et al. (2009) observe some variability in the (n_h + n_s)/n_c ratio in the fast wind, which they attribute to either statistical effects due to a lack of samples or a possible ‘interplay’ between thermal and suprathermal electrons. Scudder & Olbert (1979) also predict that the halo E_bp/k_B T_c ratio remains constant with heliocentric distance, whereas Štverák et al. (2009) find that the halo E_bp/k_B T_c ratio decreases with heliocentric distance. These findings by Štverák et al. (2009), along with the discrepancy between our calculated ratio of E_bp/k_B T_c = ± E_bp/k_B T_c = 7, suggest that the model of Scudder & Olbert (1979) requires a minor update to either the theory or to the input parameters. The discrepancy, however, may also be indicative of other processes, such as wave-particle scattering (e.g. Gary et al. 1994), that possibly modify the ratio between breakpoint energy and core temperature while preserving its linear relationship.

In our statistical study, we find that both strahl and halo breakpoint energies decrease with solar wind speed. At all solar wind velocities and core temperatures, the halo breakpoint energy is larger than the strahl’s. The halo breakpoint energy exhibits a stronger correlation with solar wind speed than the strahl breakpoint energy. The anti-correlation between the two parameters corresponds with the finding that (n_h + n_s)/n_c increases with solar wind speed (Štverák et al. 2009), where n_h, n_s and n_c represent the halo, strahl, and core number densities. Assuming all plasma parameters are kept constant, except for the core density and temperature, the relative density of suprathermal electrons will increase if the breakpoint energy decreases. This observed relationship between solar wind speed and electron ratios is most likely a result of the lower collisionality of fast solar wind (Scudder & Olbert 1979; Lie-Svendsen et al. 1997; Salem et al. 2003; Gurgiolo & Goldstein 2017), which results in more distinctive non-thermal features of the electron velocity distribution function. Further work is required to analyse whether different breakpoint energy relations exist that depend on the source of solar wind. Initial findings in this paper suggest the existence of two distinct relationships in the halo breakpoint energy vs. wind speed distribution, with a step function at 500 km/s. This finding links to a sharp distinction between fast and slow solar winds (Feldman et al. 2005). Therefore the origin of the solar wind, i.e., coronal holes for fast wind or streamer belt regions for slow wind, potentially plays a role in the definition of thermal and non-thermal electron populations. A step function is less obvious in the strahl breakpoint energy vs. solar wind speed distribution.
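As a rough worked example of the scaling comparison discussed earlier in this section, the halo gradient of 5.74 listed in Table 1 can be set against the factor of 7 attributed to Scudder & Olbert (1979). The sketch rests on two assumptions that are not stated in the text: that the relation is purely proportional (zero intercept), and an illustrative core temperature of k_B T_c = 10 eV, which is not taken from the data. The superscript labels "fit" and "SO79" are introduced here only for this example.

```latex
\begin{align}
E_{bp}^{\mathrm{fit}}  &= 5.74\, k_B T_c = 57.4\ \mathrm{eV}, \\
E_{bp}^{\mathrm{SO79}} &= 7\, k_B T_c = 70\ \mathrm{eV}, \\
\Delta E_{bp} &= (7 - 5.74)\, k_B T_c = 1.26\, k_B T_c = 12.6\ \mathrm{eV}, \\
\frac{\Delta E_{bp}}{E_{bp}^{\mathrm{SO79}}} &= \frac{1.26}{7} = 0.18 .
\end{align}
```

Under these assumptions the two scalings differ by 18% of the predicted value at any core temperature, since both are linear in k_B T_c.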
5. Conclusions
In this study, we apply unsupervised K-means clustering algorithms to Cluster-PEACE data to separate solar wind electron pitch angle and energy distributions into the core, halo, and strahl populations. This enables us to perform an accurate statistical analysis of strahl and halo breakpoint energies. In our statistical study, we compare the relationship between core temperature, T_c, and both halo and strahl breakpoint energies. We present a strong correlation between suprathermal breakpoint energies and T_c, and conclude this is due to core temperature being a determining factor for breakpoint energy. As a result of higher core temperatures, the Maxwellian part of the total electron velocity distribution function, which represents the core, extends across a wider range of velocity space (Pilipp et al. 1987a). The core distribution therefore overlaps with the halo and strahl at higher energies and thus increases the suprathermal breakpoint energy.

We find that halo breakpoint energy remains larger than the strahl’s across all temperatures. This difference between halo and strahl breakpoint energies suggests that there are certain energies, below the halo breakpoint energy, at which a strahl and core population are both present. At these energies, strahl dominates at parallel pitch angles and core dominates at perpendicular pitch angles. Wave-particle scattering processes (Gary et al. 1994; Vasko et al. 2019; Verscharen et al. 2019) scatter these low energy strahl electrons to higher perpendicular velocities and smaller parallel velocities. At sufficiently high core temperatures, these strahl electrons would be absorbed by the core population (Pilipp et al. 1987b), instead of the higher energy halo population. The absorption of strahl electrons by the core increases the number of Coulomb collisions (Landi et al. 2012), which then leads to an increase in core temperature (Marsch & Goldstein 1983; Boldyrev et al. 2020). This scenario is consistent with previous studies (Pilipp et al. 1987c) which show a transfer of electron kinetic energy from the parallel to perpendicular direction, increasing core temperature in the perpendicular direction. The increase of core temperature, due to the absorption of strahl electrons, acts to extend the core component of the electron velocity distribution function to higher velocities (Pilipp et al. 1987a), therefore increasing the halo breakpoint energy at pitch angles at which the strahl is not present. This phenomenon explains the larger difference between strahl and halo breakpoint energies at higher core temperatures, as a larger difference in breakpoint energy means more strahl electrons are scattering into the core population rather than the halo population.

This work represents the first extensive study characterising the relation between breakpoint energy and solar wind speed for each of the suprathermal populations. Our results show there is a significant decrease in both halo and strahl breakpoint energies with increasing solar wind speed, with the halo relation exhibiting a stronger correlation. We find two distinct relationships in the halo breakpoint energy vs. solar wind speed distribution, with a step function at 500 km/s. We predict this step function relates to the difference in origin of fast and slow solar wind electrons (Feldman et al. 2005). Further investigation, with the aid of new facilities provided by the Parker Solar Probe and Solar Orbiter missions, can test this prediction and investigate why the step function is prevalent in the halo breakpoint energy relationship but not in the strahl breakpoint energy relationship with solar wind speed.
In future studies, using Solar Orbiter measurements at smaller heliocentric distances will allow us to better characterise halo and strahl breakpoint energies and improve our understanding of their dependence on bulk solar wind parameters.

Acknowledgements. M.R.B. is supported by a UCL Impact Studentship, joint funded by the ESA NPI programme. I.J.R., D.V. and A.W.S. are supported by STFC Consolidated Grant ST/S00240/1. D.V. is supported by the STFC Ernest Rutherford Fellowship ST/P003826/1. A.W.S. is supported by NERC grant NE/P017150/1. T.B. is supported by STFC Training Grant ST/R505031/ / R000921/ / P017274/1. We thank the Cluster instrument teams (PEACE, FGM, CIS, EFW) for the data used in this study, in particular the PEACE operations team at the Mullard Space Science Laboratory. Data for the Cluster spacecraft can be obtained from the Cluster Science Archive (https://csa.esac.esa.int/csa-web/). We thank the anonymous reviewer for their many useful contributions to this manuscript. This work was discussed at the 2019 ESAC Solar Wind Electron Workshop, which was supported by the Faculty of the European Space Astronomy Centre (ESAC).

References
Anderson, B. R., Skoug, R. M., Steinberg, J. T., & McComas, D. J. 2012, Journal of Geophysical Research: Space Physics, 117
Arthur, D. 2007, Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, 1027
Balogh, A., Dunlop, M. W., Cowley, S. W. H., et al. 1997, Space Science Reviews, 79, 65
Balogh, A. & Smith, E. J. 2001, in The 3-D Heliosphere at Solar Maximum (Dordrecht: Springer Netherlands), 147–160
Berčič, L., Maksimović, M., Landi, S., & Matteini, L. 2019, Monthly Notices of the Royal Astronomical Society, 486, 3404
Boldyrev, S., Forest, C., & Egedal, J. 2020, Proceedings of the National Academy of Sciences
Chao, J., Wu, D., Lin, C.-H., et al. 2002, in COSPAR Colloquia Series, Vol. 12, Space Weather Study Using Multipoint Techniques (Pergamon), 127–135
Che, H. & Goldstein, M. L. 2014, The Astrophysical Journal, 795, L38
Cully, C. M., Ergun, R. E., & Eriksson, A. I. 2007, Journal of Geophysical Research: Space Physics, 112
Dulk, G. A. & Marsh, K. A. 1982, ApJ, 259, 350
Escoubet, C. P., Fehringer, M., & Goldstein, M. 2001, Annales Geophysicae, 19, 1197
Fazakerley, A. N., Lahiff, A. D., Wilson, R. J., et al. 2010, in The Cluster Active Archive, ed. H. Laakso, M. Taylor, & C. P. Escoubet (Springer Netherlands), 129–144
Feldman, U., Landi, E., & Schwadron, N. A. 2005, Journal of Geophysical Research: Space Physics, 110
Feldman, W. C., Asbridge, J. R., Bame, S. J., Gosling, J. T., & Lemons, D. S. 1978, Journal of Geophysical Research: Space Physics, 83, 5285
Feldman, W. C., Asbridge, J. R., Bame, S. J., Montgomery, M. D., & Gary, S. P. 1975, Journal of Geophysical Research (1896-1977), 80, 4181
Fitzenreiter, R. J., Ogilvie, K. W., Chornay, D. J., & Keller, J. 1998, Geophysical Research Letters, 25, 249
Flach, P. A. & Kull, M. 2015, Proceedings of the 28th International Conference on Neural Information Processing Systems, 1, 838
Gary, S. P., Scime, E. E., Phillips, J. L., & Feldman, W. C. 1994, Journal of Geophysical Research: Space Physics, 99, 23391
Geiss, J., Gloeckler, G., & von Steiger, R. 1995, Space Sci Rev, 72, 49
Gosling, J. T., Baker, D. N., Bame, S. J., et al. 1987, Journal of Geophysical Research: Space Physics, 92, 8519
Graham, G. A., Rae, I. J., Owen, C. J., & Walsh, A. P. 2018, The Astrophysical Journal, 855, 40
Graham, G. A., Rae, I. J., Owen, C. J., et al. 2017, Journal of Geophysical Research: Space Physics, 122, 3858
Gurgiolo, C. & Goldstein, M. L. 2017, Annales Geophysicae, 35, 71
Gustafsson, G., André, M., Carozzi, T., et al. 2001, Annales Geophysicae, 19, 1219
Habbal, S. R., Woo, R., Fineschi, S., et al. 1997, The Astrophysical Journal, 489, L103
Hammond, C. M., Feldman, W. C., McComas, D. J., Phillips, J. L., & Forsyth, R. J. 1996, A&A, 316, 350
Horaites, K., Boldyrev, S., & Medvedev, M. V. 2018, Monthly Notices of the Royal Astronomical Society, 484, 2474
Johnstone, A. D., Alsop, C., Burge, S., et al. 1997, Peace: A Plasma Electron and Current Experiment (Dordrecht: Springer Netherlands), 351–398
Kajdič, P., Alexandrova, O., Maksimovic, M., Lacombe, C., & Fazakerley, A. N. 2016, ApJ, 833, 172
Laakso, H., Perry, C., McCaffrey, S., et al. 2010, in The Cluster Active Archive, ed. H. Laakso, M. Taylor, & C. P. Escoubet (Springer Netherlands), 3–37
Landi, S., Matteini, L., & Pantellini, F. 2012, The Astrophysical Journal, 760, 143
Lie-Svendsen, Ø., Hansteen, V. H., & Leer, E. 1997, Journal of Geophysical Research: Space Physics, 102, 4701
Maksimovic, M., Zouganelis, I., Chaufray, J.-Y., et al. 2005, Journal of Geophysical Research: Space Physics, 110
Marsch, E. & Goldstein, H. 1983, Journal of Geophysical Research: Space Physics, 88, 9933
McComas, D. J., Bame, S. J., Barraclough, B. L., et al. 1998, Geophysical Research Letters, 25, 1
McComas, D. J., Bame, S. J., Feldman, W. C., Gosling, J. T., & Phillips, J. L. 1992, Geophysical Research Letters, 19, 1291
Müller, D., Marsden, R. G., St. Cyr, O. C., Gilbert, H. R., & The Solar Orbiter Team. 2013, Solar Physics, 285, 25
Owens, M. J., Crooker, N. U., & Schwadron, N. A. 2008, Journal of Geophysical Research: Space Physics, 113
Owens, M. J. & Forsyth, R. J. 2013, Living Reviews in Solar Physics, 10, 5
Owens, M. J., Lockwood, M., Riley, P., & Linker, J. 2017, Journal of Geophysical Research: Space Physics, 122, 10,980
Pagel, C., Gary, S. P., de Koning, C. A., Skoug, R. M., & Steinberg, J. T. 2007, Journal of Geophysical Research: Space Physics, 112
Parker, E. N. 1963, Interplanetary dynamical processes. (New York: Interscience Publishers)
Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2011, Journal of Machine Learning Research, 12, 2825
Peterson, L. 2009, Scholarpedia, 4, 1883
Phillips, J. L. & Gosling, J. T. 1990, Journal of Geophysical Research: Space Physics, 95, 4217
Pierrard, V., Maksimovic, M., & Lemaire, J. 2001, Astrophysics and Space Science, 277, 195
Pilipp, W. G., Miggenrieder, H., Montgomery, M. D., et al. 1987a, Journal of Geophysical Research: Space Physics, 92, 1075
Pilipp, W. G., Miggenrieder, H., Montgomery, M. D., et al. 1987b, Journal of Geophysical Research: Space Physics, 92, 1093
Pilipp, W. G., Miggenrieder, H., Mühlhäuser, K. H., et al. 1987c, Journal of Geophysical Research: Space Physics, 92, 1103
Qian, Y., Zhou, W., Yan, J., Li, W., & Han, L. 2015, Remote Sensing, 7, 153
Rème, H., Aoustin, C., Bosqued, J. M., et al. 2001, Annales Geophysicae, 19, 1303
Rice, W. R. 1990, Biometrics, 46, 303
Saito, S. & Gary, S. P. 2007, Geophysical Research Letters, 34
Salem, C., Hubert, D., Lacombe, C., et al. 2003, The Astrophysical Journal, 585, 1147
Scudder, J. D. & Olbert, S. 1979, Journal of Geophysical Research: Space Physics, 84, 2755
Tong, Y., Vasko, I. Y., Pulupa, M., et al. 2019, The Astrophysical Journal, 870, L6
Vasko, I. Y., Krasnoselskikh, V., Tong, Y., et al. 2019, The Astrophysical Journal, 871, L29
Verscharen, D., Chandran, B. D. G., Jeong, S.-Y., et al. 2019, The Astrophysical Journal, 886, 136