[PDF] A Gaussian Mixture Model for Nulling Pulsars

Abstract

The phenomenon of pulsar nulling -- where pulsars occasionally turn off for one or more pulses -- provides insight into pulsar-emission mechanisms and the processes by which pulsars turn off when they cross the "death line." However, while ever more pulsars are found that exhibit nulling behavior, the statistical techniques used to measure nulling are biased, with limited utility and precision. In this paper we introduce an improved algorithm, based on Gaussian mixture models, for measuring pulsar nulling behavior. We demonstrate this algorithm on a number of pulsars observed as part of a larger sample of nulling pulsars, and show that it performs considerably better than existing techniques, yielding better precision and no bias. We further validate our algorithm on simulated data. Our algorithm is widely applicable to a large number of pulsars even if they do not show obvious nulls. Moreover, it can be used to derive nulling probabilities of nulling for individual pulses, which can be used for in-depth studies.

Full PDF

DDraft version January 30, 2018

Typeset using L A TEX twocolumn style in AASTeX62

A Gaussian Mixture Model for Nulling Pulsars

D. L. Kaplan, J. K. Swiggum, T. D. J. Fichtenbauer, and M. Vallisneri

2, 3 Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin–Milwaukee, P.O. Box 413,Milwaukee, WI 53201, USA Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109, USA TAPIR, California Institute of Technology, Pasadena, CA 91125, USA (Accepted January 27, 2018)

Submitted to ApJABSTRACTThe phenomenon of pulsar nulling – where pulsars occasionally turn oﬀ for one or more pulses –provides insight into pulsar-emission mechanisms and the processes by which pulsars turn oﬀ whenthey cross the “death line.” However, while ever more pulsars are found that exhibit nulling behavior,the statistical techniques used to measure nulling are biased, with limited utility and precision. In thispaper we introduce an improved algorithm, based on Gaussian mixture models, for measuring pulsarnulling behavior. We demonstrate this algorithm on a number of pulsars observed as part of a largersample of nulling pulsars, and show that it performs considerably better than existing techniques,yielding better precision and no bias. We further validate our algorithm on simulated data. Ouralgorithm is widely applicable to a large number of pulsars even if they do not show obvious nulls.Moreover, it can be used to derive nulling probabilities of nulling for individual pulses, which can beused for in-depth studies.

Keywords: methods — data analysis, methods — statistical, pulsars — general INTRODUCTIONPulsar surveys to date have found ∼ , most of which move across the P – ˙ P diagram and turn oﬀ in ∼ yr when they passthe “death line” (see, e.g., Lorimer & Kramer 2012).Just how and why pulsars turn oﬀ is still a subject ofintense scrutiny, with ongoing observational and the-oretical investigations (see references below). Nullingpulsars (Backer 1970) – pulsars whose radio emissionceases temporarily for one or more rotations – oﬀer aninvaluable laboratory to study pulsar emission mecha-nisms and magnetospheres. The meaning of null dura-tions, the intervals between them, and the underlyingmechanism have been matters of debate ever since thebehavior was recognized.The ﬁrst comprehensive study of nulling pulsars(Ritchings 1976) suggested that as a pulsar ages, the Corresponding author: D. L. [email protected] time interval between regular bursts of pulsed emissionincreases, eventually leading to “death” when the inter-val between bursts is much longer than the duration ofthe bursts themselves. A later study of 72 pulsars founda stronger correlation between null fraction (NF; thefraction of time that a pulsar spends in a null state) andspin period, but still argued that nulling could be in-dicative of a faltering emission mechanism (Biggs 1992).Wang et al. (2007) argued on the basis of a smallersample of 45 nulling pulsars that nulling behavior wasmore related to a large characteristic age than otherparameters, but the analysis was not quantitative.However, while data-collection capabilities, processingtechniques, and theories explaining pulsar nulling (e.g.,Gajjar et al. 2012) have become more sophisticated, weare still in a regime where increasing the sample sizeof nullers along with the precision of the nulling anal-ysis can have a signiﬁcant impact on our understand-ing. We are working to increase the sample size throughdetailed followup observations of sources found in theGreen Bank North Celestial Cap survey (GBNCC; Sto-vall et al. 2014; Lynch et al. 2018; Kawash et al. 2018). a r X i v : . [ a s t r o - ph . I M ] J a n Kaplan et al.

At the same time, the classiﬁcation of pulsars into thosethat do or do not null is largely based on qualitative ex-amination by eye, and computation of the nulling frac-tion suﬀers from signiﬁcant limitations that may biasthe results.Here we present a robust technique for determiningnulling fractions using Gaussian mixture models (see,e.g., Ivezi´c et al. 2014). As we demonstrate, the ro-bustness of this technique allows us to infer quantitativelimits on nulling fractions even for pulsars that have noobvious nulls, so the technique can be used for more so-phisticated population analyses. First, we describe thesample data that we used to test our technique ( § § § § §

4. All of our source code is available at10.5281/zenodo.1155855. Note that we believe a similaralgorithm was applied to nulling pulsars by Arjunwad-kar et al. (2014), but we have been unable to ﬁnd detailsof its implementation or results. SAMPLE DATATo test our method, we gathered timing data froma recent study of new pulsar discoveries in the GBNCCsurvey (Lynch et al. 2018). Data were collected with the100-m Robert C. Byrd Green Bank Telescope (GBT),observing at a center frequency of 800 MHz. We digi-tized a 200 MHz bandwidth using the Green Bank Ul-timate Pulsar Processing Instrument (GUPPI; DuPlainet al. 2008) in incoherent search mode, with 2,048 fre-quency channels and sampling every 40.96 µ s. Pulsarsincluded in this study (PSRs J0323+6742, J0054+6946,and J0137 + 6349) were observed monthly for 3-minuteexposures between February 2013 and January 2014.Exposure times for PSR J0054 + 6946 were lengthenedto 6 minutes starting in November 2013 through May2014.For each pulsar, data were folded modulo the pulseperiod using DSPSR and RFI was zapped interactivelyusing the pazi routine from

PSRCHIVE . A crucial stepin nulling studies is the choice of on-pulse and oﬀ-pulse windows, deﬁned as ﬁxed phase intervals (within foldedproﬁles) that are assumed to contain most and none,respectively, of the pulsar emission. Loosely speaking,the intensities observed in the oﬀ-pulse window are at-tributed to background noise setting the threshold foron-pulse–window intensities to be classiﬁed as nullingor emitting. We set the width and phase of on-pulsewindows manually by inspecting folded proﬁles, choos-ing oﬀ-pulse windows of the same size at pulse phasescontaining no visible emission, often about 0.5 rotations A m p li t ud e P u l s e N u m b e r O FF O N NULLEMIT 0 . . . Figure 1.

Amplitude versus pulse phase and pulse numberfor a subset of the data on PSR J0323 + 6742. The top panelshows the average pulse proﬁle, with the on- and oﬀ-pulsephase windows delineated by the vertical lines. The bottom-left panel shows the amplitude for individual pulses, with thephase windows indicated. The right panel shows the proba-bility of nulling (Eqn. 5). We highlight regions with signiﬁ-cant nulling behavior (20 successive pulses with minimum aposteriori null probability greater than 62%) and relativelysteady emission (10 pulses with maximum a posteriori nullprobability ∼ from the on-pulse window, but careful not to includeany interpulse (if present).We ﬁtted and removed a 6th-order polynomial to ﬂat-ten the baseline of each single pulse proﬁle and center theoﬀ-pulse noise on zero intensity (similar to Lynch et al.2013; Rosen et al. 2013). Afterwards, we assembled ourdataset of on and oﬀ intensities for each individual pulseby integrating over each phase bin and concatenatingthe results from the separate 3-min/6-min exposures.See Figure 1 for an illustration of the pulse intensitiesmeasured in on- and oﬀ-pulse windows, along with somesigniﬁcant nulling behavior. A GAUSSIAN MIXTURE MODEL FORNULLING PULSARSThe standard algorithm for ﬁtting the nulling frac-tion NF (Ritchings 1976) is to ﬁrst construct his-tograms of the integrated intensities I ON ≡ { I ON k } and I OFF ≡ { I OFF k } measured in the on-pulse andoﬀ-pulse windows at each observation k . We denotethe histograms as ON n ( I ON ) and OFF n ( I OFF ), wherethe index n identiﬁes histogram bins. Then, for aseries of trial values of NF one computes the diﬀer- aussian Mixtures for Nulling Pulsars n ( I ON , I OFF ) = ON n ( I ON ) − NF × OFF n ( I OFF ). The best-ﬁt ˆNF is the value that mini-mizes the sum of ∆ n ( I ON , I OFF ) over negative intensi-ties, | (cid:80) I n < ∆ n ( I ON , I OFF ) | , where the nulling is pre-sumed to dominate. This method has a number ofdrawbacks: ﬁrst, it requires construction of histogramsso there is an arbitrary choice of binning. Second, itassumes that the pulses with I ON k < Fitting Algorithm

In our proposed method, we deﬁne the likelihood ofthe on-pulse dataset I ON as a one-dimensional Gaussianmixture model, p ( I ON |{ w j , µ j , σ j } ) = N (cid:89) k  M (cid:88) j =1 w j N ( I ON k | µ j , σ j )  , (1)where µ j and σ j are the means and standard deviationsof M normal distributions, and w j are their weights inthe mixture, which must satisfy (cid:80) w j = 1. Thereforethere are 3 M − M = 2 corresponds to the standard descrip-tion of nulling with two modes; we order results so thatthe j = 1 component has the lowest mean and describesthe nulls, so NF = w . Note that Eq. (1) implies thatthe error in the measurement of the intensities is neg-ligible compared to the intrinsic scatter in the mixturecomponents, described by the σ j . The parameters that maximize this likelihood canbe found using the expectation-maximization algorithm(Dempster et al. 1977), implemented as

GaussianMixture in scikit-learn (Pedregosa et al. 2011). Using thisimplementation, we ﬁnd reasonable results for pulsarswith signiﬁcant separations between nulling and emit-ting pulses. Such separation may be a signal-to-noiseeﬀect, but scatter in pulse intensities could also resultfrom intrinsic variability in the pulsars themselves or in-terstellar scintillation (Rickett 1990; Jenet & Gil 2003).However, as we show below the results are not ideal forother pulsars.We can do better by using the oﬀ-pulse dataset I OFF to constrain the null-component parameters µ and σ ,by way of the oﬀ-pulse likelihood p ( I OFF | µ , σ ) = (cid:89) k N ( I OFF k | µ , σ ) . (2) If the measurement error is not negligible, the σ j are eﬀectivelyredeﬁned to include it, under the assumptions that the error issimilar for every observation. Indeed, we may think of this step as providing a priordistribution for µ and σ , which is then used in Eqn. 1.We then explore the { w j , µ j , σ j } parameter space usingMarkov Chain Monte Carlo (MCMC) techniques, specif-ically the aﬃne-invariant population-MCMC algorithm emcee described by Foreman-Mackey et al. (2013). Thedetails of our implementation are as follows: • We initialize 40 emcee “walkers” around the best-ﬁt region for w j , µ j , σ j , as determined by expecta-tion maximization run on I ON k . • For simplicity, we set the µ and σ priors as nor-mal distributions centered on k (cid:80) k I OFF k and onthe inner-quartile range of the I OFF , respectively,with widths determined following Ahn & Fessler(2003). • Prior distributions for µ j and σ j (with j >

1) aretaken to be ﬂat. The prior distribution for theweights w j is a Dirichlet distribution, but sincethe sum of the weights is 1 it is eﬀectively ﬂat. • We run the walkers through 50 iterations toachieve “burn in.” • Last, we run the walkers for 500 iterations to ob-tain the ﬁnal population, representative of the µ j , σ j , and w j joint posterior.We experimented with increasing the number of walk-ers and iterations and found that the values above gavesuﬃciently reliable results for data-sets with a few thou-sand pulses and M <

5, but they can be increased asneeded to achieve reliable posterior distributions.3.2.

Fit Results

Representative results from this algorithm are shownin Figure 2, where we plot p ( I ON |{ ˆ w j , ˆ µ j , ˆ σ j } ) (solidline) and p ( I OFF | ˆ µ , ˆ σ ) (dashed) on top of the I ON and I OFF histograms (with appropriate normalizations);here { ˆ w j , ˆ µ j , ˆ σ j } are the a posteriori joint maxima of theGaussian mixture parameters. Figure 3 shows the pos-terior densities of the Gaussian means and variances.The data appear to be ﬁt well, with the null compo-nent ending up close to the oﬀ-pulse ﬁt results; we ob-serve no signiﬁcant pathologies in the MCMC chains. Toevaluate goodness of ﬁt quantitatively, we perform theKolmogorov–Smirnov test (KS test; Chakravarti et al.1967), and ﬁnd a statistic value of 0.015, correspondingto a p -value of 0.7—no evidence to reject the hypothesisthat the data were sampled from the best-ﬁt distribu-tion.For this pulsar we ﬁnd that the emitting and the nullcomponents are suﬃciently separate that all of the al-gorithms outlined above would give similar results. For Kaplan et al. − −

10 0 10 20 30 40 50 60

Pulse Intensity N u m b e r o f P u l s e s ComponentsScaled Null ComponentSumONOFFON-OFF*NF

Figure 2.

Distribution of pulse intensities for PSR J0323 +6742. The blue and orange histograms are the raw intensi-ties for on- and oﬀ-pulse windows, respectively. The dashedcurves are the maximum a posteriori individual componentsfrom the Gaussian mixture model for M = 2, with the solidcurve their sum, as determined by our MCMC algorithm.The dotted curve is the component for the nulls scaled by1 / NF: it matches the oﬀ-pulse intensities well. Finally, thedashed green histogram is the data that would be used to im-plement the Ritchings (1976) algorithm, although it is plot-ted for our best-ﬁt value of NF. Using our MCMC algorithmwe ﬁnd ˆNF = 50 ± instance, we ﬁnd that only about 2% of the pulses fromthe emitting component would have intensities less than0, which would only bias the Ritchings (1976) results bya small amount.If the problem is well behaved, we can select theoptimal number of components M by maximizing theBayesian information criterion (BIC; Schwarz 1978) orthe Akaike information criterion (AIC; Akaike 1974).We illustrate this in Figure 4, which shows both. Weﬁnd strong evidence that nulling behavior is present( M > M = 2 comparedto M >

2. The BIC corresponds to approximating theBayes ratios between models as e − ∆ BIC / , where ∆ BIC isthe diﬀerence in BIC (likewise for AIC). In this approx-imation, the implied Bayes ratio for M = 2 vs. M = 3 is O = 175 (AIC) or O = 92 ,

000 (BIC), showing that M = 2 is indeed preferred.We show another example in Figure 5, where for M = 2 we ﬁnd on-pulse maximum a posteriori parame-ters ˆ µ = 12 . σ = 11 .

3, with ˆNF = 41%. In thiscase there is much less separation in intensities betweennulls and pulses, with about 13% of the pulses having

I <

0. Nonetheless our method gives a robust ﬁt. Fig-ure 5 appears to show slight deviations from Gaussian . . . . . µ E m i t . . . . . σ N u ll . . . . σ E m i t − . . . . µ Null . . . . N F . . . . . µ Emit . . . . . σ Null . . . . σ Emit .

32 0 .

40 0 .

48 0 . NF Figure 3.

Posterior probability densities for the Gaussian-distribution parameters of the null and emitting components,as derived in our MCMC algorithm. The solid lines show the µ and σ modes as inferred from the oﬀ-pulse data. Thecontours are 1-, 2-, and 3- σ joint conﬁdence contours. Number of components M I n f o r m a t i o n c r i t e r i o n − , AICBIC

Figure 4.

Model-comparison evidence for diﬀerent numberof Gaussian-mixture components in the PSR J0323 + 6742data. We show the Bayesian information criterion (BIC; or-ange squares) and the Akaike information criterion (AIC;blue circles) as a function of the number of components M .The non-nulling hypothesis ( M = 1) is rejected at high con-ﬁdence, and M = 2 is preferred by both criteria. aussian Mixtures for Nulling Pulsars − −

10 0 10 20 30 40 50 60

Pulse Intensity N u m b e r o f P u l s e s ComponentsScaled Null ComponentSumONOFFON-OFF*NF

Figure 5.

Distribution of pulse intensities for PSR J0054 +6946 (see Fig. 2 for details). Here the nulling and emittingcomponents are much less separated than in Figure 3. Usingour MCMC algorithm we ﬁnd ˆNF = 40 . ± . distributions, which may be handled with more com-plex models that (for example) incorporate asymmetricdistributions due to scintillation. Indeed, the KS test re-jects the assumption that the data were drawn from thebest-ﬁt distribution at the 5 × − level; nevertheless,since our goal is primarily to quantify the bimodality ofthe emitted pulses rather that the exact intensity distri-bution, we believe the nulling results themselves to berobust.Finally, in Figure 6 we show an example where thepulsar shows no obvious nulling. Our algorithm ﬁndsˆNF = 0 . ± . P value of 2 × − ), but the overallrobustness of our determination is evident. In contrastto Ritchings (1976) which can give determinations ofnon-zero nulling fractions even for pulsars that do notappear to null, our algorithm behaves well. Therefore wecan use it for all pulsars that are suﬃciently bright re-gardless of whether nulling is evident, and derive morerobust determinations of whether or not weak nullingbehavior is present.3.3. Simulation Results

We validate our algorithm by simulating pulsar datafor a range of parameters, drawing random intensitiesaccording to Eqs. (2) and (1) for the oﬀ- and on-pulsewindows, respectively. We base our synthetic datasetson the PSR J0323 + 6742 data analyzed above: we sim-ulate 2,000 pulses with µ = 0, σ = 5, σ = 10, and − −

10 0 10 20 30 40 50 60

Pulse Intensity N u m b e r o f P u l s e s ComponentsScaled Null ComponentSumONOFFON-OFF*NF

Figure 6.

Distribution of pulse intensities for PSR J0137 +6349 (see Fig. 2 for details). Here there is no obvious evidencefor nulling. Using our MCMC algorithm we ﬁnd ˆNF = 0 . ± . nulling fraction NF = 0 .

5. We vary µ between 5 (hardto distinguish from the nulls) and 30 (easily distinguish-able), and we repeat the test for 30 trials for each valueof µ .We plot the median and standard deviation of theestimated NF in Figure 7, using the Ritchings (1976)method, the expectation-maximization method, and ourBayesian algorithm (in which case we report the max-imum a posteriori NF). All three algorithms agree forhigh pulse intensities, µ (cid:38) µ , as expected. Thisis because the model has a signiﬁcant fraction of non-nulled pulses with intensities less than 0, ranging from30% (for µ = 5) to 0.1% (for µ = 30). Speciﬁcally, weexpect a fraction 12 −

12 erf (cid:32) µ √ σ (cid:33) (3)of the emitted pulses to have intensities <

0, whereerf( x ) is the error function of x . This then leads toa biased estimated NF,NF + (1 − NF) (cid:34) − erf (cid:32) µ √ σ (cid:33)(cid:35) (4)which we have plotted in Figure 7, where they agreewith our simulated results. The EM results using GaussianMixture are also biased at low pulse intensi-ties. By contrast, our Bayesian algorithm performs well,with consistent uncertainties and no obvious bias acrossthe µ range. Kaplan et al.

Mean of pulses µ Emit . . . . . . N u lli n g f r a c t i o n MCMCEMRichtings

Figure 7.

Comparison on NF estimates for simulated data,as derived using the Ritchings (1976) algorithm (green di-amonds), our expectation-maximization algorithm (orangesquares, labeled as “EM”), and our Bayesian algorithm ﬁt(blue circles). The horizontal line marks the true nullingfraction of 0 .

5. Each simulated dataset consisted of 2,000pulses, with µ = 0, σ = 5, σ = 10; simulations were re-peated 30 times for each value of µ (hence the vertical errorbars). The solid green curve shows our analytical expecta-tion for the bias of the Ritchings (1976) algorithm (see maintext). 4. DISCUSSION AND CONCLUSIONSWe have outlined and demonstrated an improvedmethod to determine the nulling fraction of a pulsar.The method performs well in the limit of weak nulling,so it can be applied to a large number of pulsars withoutevident strong nulls. Unlike the traditional Ritchings(1976) algorithm, our method is unbiased, and it can beapplied to pulsars with more than two emission modes,as long as those are reﬂected in the pulse intensities.However, it does require speciﬁcation of the functionalform of the intensity distributions for the nulling andemitting components: here we assume sums of Gaus-sians, although exponentials appropriate for 100% mod-ulation by interstellar scintillation (e.g., Rickett 1990),or intermediate distributions are also possible. In thosecases the AIC/BIC values can be used to quantitativelycompare how well alternative distributions ﬁt the data. An additional beneﬁt to this analysis is that we candetermine explicitly the probability that any individualpulse belongs to a given class. This is sometimes calledthe “responsibility” (Hastie et al. 2009), and is given by: p ( j | I ON k ) = w j N ( µ j , σ j ) (cid:80) Mj (cid:48) =1 w j (cid:48) N ( µ j (cid:48) , σ j (cid:48) ) (5)for class j . An example of this is shown in Figure 1,where we can determine the nulling probability as p ( j =1 | I ON k ). This probability can be computed for the maxi-mum a posteriori { ˆ w j , ˆ µ j , ˆ σ j } , or it can be marginalizedover their distributions. Individual-pulse nulling prob-abilities can be used in robust multi-wavelength stud-ies, to establish whether the X-ray properties of thepulses received during nulls diﬀer from the others (e.g.,Hermsen et al. 2013). We can also look for temporal pat-terns in the nulling properties, like the length of nulls(Fig. 1) or the time between nulls (e.g., Wang et al. 2007)using quantitative probability thresholds, and we couldexamine the probability that adjacent pulses transitionbetween nulling and emitting behavior. Finally, we canﬁt for more than two components and identify modechanging quantitatively, in addition to nulling. All ofthese topics will be explored in future papers. Acknowledgments.

We thank S. McSweeney for help-ful comments. We thank an anonymous referee and theAAS journals statistics editor for their suggestions. TheGreen Bank Observatory is a facility of the National Sci-ence Foundation operated under cooperative agreementby Associated Universities, Inc. Support was providedby the NANOGrav NSF Physics Frontiers Center awardnumber 1430284. MV acknowledges support from theJPL RTD program. Portions of this research were car-ried out at the Jet Propulsion Laboratory, CaliforniaInstitute of Technology, under a contract with the Na-tional Aeronautics and Space Administration.

Facility:

GBT

Software:

Astropy(AstropyCollaborationetal.2013),DSPSR (http://dspsr.sourceforge.net/index.shtml), em-cee(Foreman-Mackeyetal.2013),PSRCHIVE(Hotanetal.2004), scikit-learn (Pedregosa et al. 2011) .REFERENCES

Ahn, S., & Fessler, J. A. 2003, Standard errors of mean,variance, and standard deviation estimators, Tech. Rep.Technical Report 413, Communications and SignalProcessing Lab., Univ. of Michigan Akaike, H. 1974, IEEE Transactions on Automatic Control,19, 716Arjunwadkar, M., Rajwade, K., & Gupta, Y. 2014, inAstronomical Society of India Conference Series, Vol. 13,79–81 aussian Mixtures for Nulling Pulsars7