Classifying sleep states using persistent homology and Markov chain: a Pilot Study
Sarah Tymochko, Kritika Singhal, and Giseon Heo
Abstract
Obstructive sleep apnea (OSA) is a form of sleep disordered breathing characterized by frequent episodes of upper airway collapse during sleep. Pediatric OSA occurs in 1-5% of children and can be related to other serious health conditions such as high blood pressure, behavioral issues, or altered growth. OSA is often diagnosed by studying the patient's sleep cycle, the pattern with which they progress through various sleep states such as wakefulness, rapid eye movement, and non-rapid eye movement. The sleep state data is obtained using an overnight polysomnography test that the patient undergoes at a hospital or sleep clinic, where a technician manually labels each 30 second time interval, also called an "epoch," with the current sleep state. This process is laborious and prone to human error. We seek an automatic method of classifying the sleep state, as well as a method to analyze the sleep cycles. This article is a pilot study in sleep state classification using two approaches: first, we use methods from the field of topological data analysis to classify the sleep state, and second, we model sleep states as a Markov chain and visually analyze the sleep patterns. In the future, we will continue to build on this work to improve our methods.
Sarah Tymochko, Dept. of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, e-mail: [email protected]
Kritika Singhal, Dept. of Mathematics, The Ohio State University, Columbus, OH, e-mail: [email protected]
Giseon Heo (corresponding author), School of Dentistry; Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton AB T6G 1C9, Canada, e-mail: [email protected]
Obstructive sleep apnea (OSA) is a chronic condition characterized by frequent episodes of upper airway collapse during sleep. Pediatric OSA is a serious health problem; even mild forms of untreated pediatric OSA may cause high blood pressure, changes to the heart, behavioral challenges, or altered growth. The prevalence of childhood OSA is in the range of 1-5%. The gold standard for diagnosis of pediatric OSA is overnight polysomnography (PSG) in a hospital or sleep clinic. PSG provides multi-channel time series, such as EEG, ECG, EOG, EMG, airflow, and SpO2. In order to determine the sleep state, a sleep technician must manually assign the sleep state for each epoch (a 30 second interval) based on the PSG multi-channel time series. This is a laborious process which is subject to human error.

Following this labeling process, a sleep specialist determines the severity of OSA by assigning an apnea-hypopnea index (ahi) based on the PSG time series and sleep patterns of a patient. The severity of OSA in children is categorised as none (ahi < 1), mild (1 ≤ ahi ≤ 5), moderate (5 < ahi ≤ 10), or severe (ahi > 10). The sequence of sleep states over the course of the night is called a hypnogram. The hypnograms of two patients, CF046 and CF050, are depicted in Figure 1.

As labeling sleep states based on PSG data is currently done manually, it would be useful to come up with an automated sleep scoring mechanism. This would optimize the effort and time of the technician, as well as remove human error from the labeling process. In this article, we analyze time series variables from the PSG data using two separate approaches.
The first approach, discussed in Section 2, uses a persistent homology based method to predict the sleep state for each 30 second interval. The second approach, discussed in Section 3, models the sequence of sleep states as a Markov chain in order to analyze sleep patterns.

        CF011 CF030 CF031 CF046 CF050 CF055 CF076 CF079
Wake    24.47  7.81  5.31 13.58 25.67 15.18 17.00 24.53
REM     14.16 19.22 17.93 11.22 10.48 17.24 13.70  8.41
NREM1    2.37  1.30  2.47  8.76  3.64  4.41  5.10  7.32
NREM2   40.27 41.58 35.50 32.43 32.73 41.92 41.60 40.26
NREM3   18.72 30.09 38.79 34.01 27.49 21.25 22.60 19.49

Table 1: Percentage of each sleep stage for each patient.
The utilization of tools from topological data analysis (TDA) for time series analysis has recently become a popular line of research. The combination of these fields has resulted in new methods for quantifying periodicity and distinguishing behavior in time series data. PSG time series variables display different patterns for different sleep states. One proposed method of detecting these differences is persistent homology, a well studied tool for quantifying the underlying structure of data. Persistent homology has been used to study time series data from many applications. Existing applications include studying machining dynamics [11, 12, 13], gene expression [21, 2], financial data [15], and video data [28, 26]. Additionally, [5] used
topological methods for time series analysis of sleep-wake states. Our goal is to explore the use of persistent homology to classify sleep states in PSG data. We also hope to see some variation in classification results that relates to the OSA severity of the patient.

As described in Sec. 1, we have 8 patients with varying severity of OSA. For each patient, we consider three channels from the polysomnography data, specifically the central electrode on the scalp (C3), left eye movement (LEOG), and right eye movement (REOG). Each of these time series must be normalized by subtracting a reference electrode, M2, which is placed on the mastoid. See Figure 2.

Fig. 2: (Left) The International 10/20 System of Electrode Placement. Image from sleeptechstudy.wordpress.com. (Right) EOG and EMG electrode placement. Image from [22]
In this section, we present some necessary background information on persistent homology, time series analysis, and machine learning. Section 2.1.1 covers the necessary background material; for a more detailed overview, we refer the interested reader to [6, 9, 16, 18].
A standard tool from algebraic topology is homology, which studies topological features such as connected components, loops, and voids. Given a space $X$, homology computes a group, $H_p(X)$, for each dimension $p = 0, 1, 2, \ldots$. Each dimension contains information about the topological structure; specifically, dimension 0 corresponds to connected components, dimension 1 corresponds to loops, and dimension 2 corresponds to voids. Here we focus on simplicial homology, but we must first define a few other concepts.

An $n$-simplex is defined as the convex hull of $n+1$ affinely independent points; for example, a 0-simplex is a point, a 1-simplex is an edge, and a 2-simplex is a triangle. The face of an $n$-simplex $\sigma$ is defined as the convex hull of a nonempty subset of the vertices in $\sigma$. A simplicial complex $K$ is a space built from simplices that satisfies two properties: first, the intersection of any two simplices in $K$ must also be a simplex in $K$, and second, all faces of a simplex in $K$ must also be a simplex in $K$.

For a simplicial complex $K$, a $p$-chain, $c$, is a sum of $p$-simplices $\sigma_i$ in $K$ with coefficients $a_i$, written $c = \sum_i a_i \sigma_i$. Here we focus on the simplified case of $a_i \in \mathbb{Z}_2$, as is typical for persistent homology. The collection of $p$-chains, or the chain group, denoted $C_p(K)$, forms a vector space. The boundary map is a linear transformation between chain groups, $\partial_p : C_p \to C_{p-1}$, that maps a $p$-simplex to the sum of its $(p-1)$-dimensional faces. The boundary maps connect the chain groups into a sequence
$$\cdots \xrightarrow{\partial_{p+2}} C_{p+1} \xrightarrow{\partial_{p+1}} C_p \xrightarrow{\partial_p} C_{p-1} \xrightarrow{\partial_{p-1}} \cdots$$
Within the chain groups, we define a $p$-cycle as a chain $c \in C_p$ with empty boundary, $\partial_p(c) = 0$; the set of $p$-cycles is the kernel of the boundary map, $\ker(\partial_p)$. A $p$-boundary is a $p$-chain $c_p \in C_p$ that is the boundary of a $(p+1)$-chain $c_{p+1} \in C_{p+1}$, i.e. $c_p = \partial_{p+1}(c_{p+1})$; the set of $p$-boundaries is the image of the boundary map, $\mathrm{im}(\partial_{p+1})$.
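As an aside (not part of the paper's pipeline), the chain-group machinery above can be made concrete by computing Betti numbers from ranks of boundary matrices over $\mathbb{Z}_2$: $\beta_0 = \dim C_0 - \operatorname{rank} \partial_1$ and $\beta_1 = \dim \ker \partial_1 - \operatorname{rank} \partial_2$. A minimal numpy sketch for a hollow triangle, which has one component and one loop:

```python
import numpy as np

def rank_gf2(M):
    """Rank of a 0/1 matrix over the field Z/2, by Gaussian elimination."""
    M = M.copy() % 2
    rank, rows, cols = 0, M.shape[0], M.shape[1]
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r, c]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]         # move pivot row up
        for r in range(rows):
            if r != rank and M[r, c]:
                M[r] ^= M[rank]                     # eliminate over Z/2
        rank += 1
    return rank

# Hollow triangle: vertices {a, b, c}, edges {ab, bc, ca}, no 2-simplex.
# Boundary matrix d1: rows = vertices, columns = edges; entry 1 if the
# vertex is a face of the edge.
d1 = np.array([[1, 0, 1],
               [1, 1, 0],
               [0, 1, 1]], dtype=np.uint8)
d2 = np.zeros((3, 0), dtype=np.uint8)   # no triangles, so C_2 = 0

n_vertices, n_edges = 3, 3
r1 = rank_gf2(d1)
betti0 = n_vertices - r1                    # = 1: one connected component
betti1 = (n_edges - r1) - rank_gf2(d2)      # = 1: dim ker d1 - rank d2, one loop
```

The same rank computation underlies persistent homology software, applied to the much larger boundary matrices of a filtered complex.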
Then the $p$-th homology group is defined as $H_p(K) = \ker(\partial_p)/\mathrm{im}(\partial_{p+1})$.

Persistent homology is a method of studying the homology of a space across different scales. In this case, we will use the Vietoris-Rips complex to create a simplicial complex out of a point cloud. The Vietoris-Rips complex is defined for a point cloud $X$ and a distance $r$: for every finite set of $k$ vertices with maximum pairwise distance at most $r$, the $(k-1)$-simplex formed by those vertices is added to the complex. Taking a range of distance values $\{r_i\}$, we get a set of simplicial complexes $\{X_{r_i}\}$ where if $r_i \le r_j$, then $X_{r_i} \subseteq X_{r_j}$. Thus, taking an increasing sequence of distance values $0 \le r_0 \le r_1 \le \cdots \le r_n$ results in a nested sequence of simplicial complexes,
$$X_{r_0} \subseteq X_{r_1} \subseteq \cdots \subseteq X_{r_n}, \qquad (1)$$
called a filtration. The inclusions induce linear maps between the homology groups of each simplicial complex,
$$H_p(X_{r_0}) \to H_p(X_{r_1}) \to \cdots \to H_p(X_{r_n}). \qquad (2)$$
These maps allow us to track how the homology changes through the filtration. A $p$-dimensional feature is "born" at the distance value corresponding to the first time in the sequence of homology groups that we see that feature appear. More formally, $\gamma$ is born at $r_i$ if $\gamma \in H_p(X_{r_i})$ but $\gamma \notin H_p(X_{r_{i-1}})$. We say that a feature "dies" if it merges with an older feature; specifically, a feature $\gamma$ dies at $r_j$ if it merges with a feature that has an earlier birth time between $X_{r_{j-1}}$ and $X_{r_j}$. A persistence diagram summarizes this information: a feature that is born at $r_i$ and dies at $r_j$ is represented by the point $(r_i, r_j)$.

Fig. 3: Example time series (left) with its delay embedding and corresponding persistence diagram.

Persistence diagrams provide concise and robust summaries of the topological features on various scales. However, these diagrams are not well suited for machine learning tasks. There are many methods, often called featurization methods, of converting the information in a persistence diagram into a vector. Once the diagram has been converted to a vector, it can be used in standard machine learning frameworks. In this paper, we use one featurization method called persistence images [1]. We will not go over the details of the method here; an example can be seen in Fig. 4.

Fig. 4: Example of a point cloud with corresponding persistence diagram and persistence image. The persistence diagram must first be transformed into birth-lifetime coordinates, meaning a point $(b, d)$ in the original diagram is now plotted as $(b, d-b)$ in the birth-lifetime diagram. The persistence images method converts the birth-lifetime diagram into an image, represented as a 20 × 20 matrix. That matrix can be flattened into a 1 × 400 vector that can then be used for machine learning.

For implementation, we use the scikit-tda python package [23] to compute persistent homology with ripser [27] and to transform the persistence diagrams into persistence images with the Persim library. For both of these computations, we use the default parameters.

Now that we have established a framework for persistent homology, we need to convert time series data into a form that is amenable to this type of analysis. There are several existing methods to convert a time series into a point cloud. One popular method, called a delay embedding, leverages Takens' theorem [25]. Given a time series $X(t)$, we select two parameters, a dimension $d \in \mathbb{Z}_{>0}$ and a delay $\tau > 0$. The delay embedding is then defined as
$$\phi_{\tau,d} : X(t) \mapsto \big(X(t),\, X(t+\tau),\, X(t+2\tau),\, \ldots,\, X(t+(d-1)\tau)\big).$$
Takens' theorem proves that with the right parameters, this is in fact an embedding in the true mathematical sense, as it preserves the underlying structure of the manifold. This embedding is sensitive to the choices of $d$ and $\tau$. To choose these parameters automatically, we use a method based on permutation entropy, as presented in [17]. Once the time series has been embedded as a point cloud with this method, standard persistent homology can be applied. Figure 3 shows an example of the delay embedding method along with the corresponding persistence diagram.

Table 2: Possible labels for 2, 3, 4 and 5 class classifications.
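The map $\phi_{\tau,d}$ above is straightforward to implement. A minimal numpy sketch (the function name is ours, not from the paper's code), applied to a sine wave whose quarter-period delay embedding traces a circle, the loop that 1-dimensional persistence later detects:

```python
import numpy as np

def delay_embed(x, d, tau):
    """Delay embedding phi_{tau,d}: map a 1-D series x(t) to the point
    cloud of vectors (x(t), x(t+tau), ..., x(t+(d-1)tau))."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (d - 1) * tau              # number of embedded points
    if n <= 0:
        raise ValueError("series too short for these parameters")
    return np.column_stack([x[i * tau : i * tau + n] for i in range(d)])

# 400 samples over two periods of sin(t); one period = 200 samples, so a
# delay of 50 samples is a quarter period and the embedding with d = 2
# gives points (sin t, cos t), i.e. the unit circle.
t = np.linspace(0, 4 * np.pi, 400, endpoint=False)
cloud = delay_embed(np.sin(t), d=2, tau=50)
```

In the paper's pipeline, $d$ and $\tau$ are instead chosen automatically via the permutation entropy method of [17], and the resulting cloud is passed to ripser.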
To apply the methods described in Sec. 2.1, for each patient, we break the time series into non-overlapping 30 second intervals corresponding to the labeled epochs. Then for a given time series, we embed each 30 second interval separately using the Takens' embedding method and compute 1-dimensional persistence on the resulting point cloud. For each persistence diagram we create a feature vector using persistence images. Note we use only 1-dimensional persistence diagrams because a time series with any periodic behavior will create a circular shape in the embedded point cloud. An example of this transformation from persistence diagram to feature vector can be seen in Fig. 4. We test each of the three PSG channels for classification separately, in addition to testing all of them combined. We test all of the channels together by computing their feature vectors with persistence images and then concatenating the corresponding vectors for each epoch. Note that for our analysis, we keep the data separated by patient in order to compare based on the ahi index as well.
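The diagram-to-vector step (birth-lifetime transform, Gaussian spreading, a 20 × 20 grid flattened to a 1 × 400 vector) can be mimicked with a simplified numpy stand-in. The actual pipeline uses persim with default parameters; the grid bounds, bandwidth, and lifetime weighting below are illustrative assumptions:

```python
import numpy as np

def persistence_image(diagram, res=20, sigma=0.1, bounds=(0.0, 1.0)):
    """Crude stand-in for persim's persistence images: move each diagram
    point (b, d) to birth-lifetime coordinates (b, d - b), spread it with
    a Gaussian weighted by its lifetime, and sample on a res x res grid."""
    pts = np.asarray(diagram, dtype=float)
    bl = np.column_stack([pts[:, 0], pts[:, 1] - pts[:, 0]])  # (birth, lifetime)
    grid = np.linspace(bounds[0], bounds[1], res)
    gx, gy = np.meshgrid(grid, grid)
    img = np.zeros((res, res))
    for b, life in bl:
        img += life * np.exp(-((gx - b) ** 2 + (gy - life) ** 2)
                             / (2 * sigma ** 2))
    return img.flatten()                    # 1 x res^2 feature vector

# Two hypothetical H1 features, one long-lived and one short-lived.
vec = persistence_image([(0.2, 0.9), (0.3, 0.4)])
```

Concatenating such vectors across the C3, LEOG, and REOG channels gives the combined feature vector used in the "all signals" experiments.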
Fig. 5: Average classification accuracies using persistence images for 2, 3, 4 and 5 classes for each patient. Error bars are the standard deviation. Classifiers used are gradient boosting (GB), random forest (RF), ridge classification (RC), support vector classifier (SV), K-neighbors classifier (KN) and decision tree (DT).

Fig. 6: Average classification accuracies using persistence images for 2, 3, 4 and 5 classes for each patient using all three signals. Error bars are the standard deviation. Classifiers used are gradient boosting (GB), random forest (RF), ridge classification (RC), support vector classifier (SV), K-neighbors classifier (KN) and decision tree (DT).

Fig. 7: Weighted average precision, recall and F1 scores for random forest classification for each patient using 2, 3, 4 and 5 classes. Error bars are the standard deviation.

For classification, we test 2, 3, 4 and 5 class classification. The possible labels for each task are listed in Table 2. To perform the classification, we use the python package scikit-learn [19] with six supervised machine learning algorithms: gradient boosting [7], random forest [3], ridge regression [10], support vector machines [24], k-nearest neighbors [8], and decision tree [4]. For all of these classifiers, we use the default parameters. For each experiment, we reserve 33% of the data for testing and use the remainder for training. We also run each classification task 10 times and average the accuracies across all 10 runs. The average and standard deviation of the accuracies for all classification algorithms can be seen in Fig. 5 and Fig. 6.

From these results, we note that in Fig. 5, each of the signals provides similar results for each patient. Thus one signal does not seem to contain more information about the sleep state than the others. This is further emphasized by the fact that the results in Fig. 6 for all signals combined do not improve upon results from looking at a single signal. For all patients and all classifiers, the training accuracies vary, with random forests performing above 90% for all patients; however, the testing accuracies are about the same for a given number of classes. It is expected that the performance of the classifiers is worse for more classes. It appears that the 4 and 5 class classifications achieve relatively similar accuracies, while there is a big improvement reducing to 3 classes.

Looking across patients, there are variations. In general, for patient CF030, the classification accuracies are higher than for other patients for all numbers of classes. Patient CF079 seems to have the worst classification accuracies. However, these variations in classification accuracies do not seem to correlate with the ahi index.

Existing methods compared in [20] almost all achieve over 90% accuracies. However, none of the methods mentioned use topological features.
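The evaluation protocol above (33% held out for testing, default parameters, accuracies averaged over 10 runs) can be sketched with scikit-learn. The helper below uses only the random forest and synthetic two-class data; both are our own illustrative choices, not the paper's feature vectors:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def mean_test_accuracy(X, y, runs=10, clf_factory=RandomForestClassifier):
    """Hold out 33% for testing, fit a default-parameter classifier, and
    average the test accuracy over `runs` random splits."""
    accs = []
    for seed in range(runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.33, random_state=seed)
        clf = clf_factory(random_state=seed)
        accs.append(clf.fit(X_tr, y_tr).score(X_te, y_te))
    return np.mean(accs), np.std(accs)

# Synthetic sanity check: two well-separated Gaussian classes in 5-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (60, 5)), rng.normal(4, 1, (60, 5))])
y = np.array([0] * 60 + [1] * 60)
mean_acc, std_acc = mean_test_accuracy(X, y)
```

In the paper's experiments, `X` would be the per-epoch persistence image vectors of one patient and `y` the technician's sleep-state labels.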
In direct comparison, the authors in [5] use similar persistent homology based approaches; however, they use a different featurization method and they test their method across many databases. They perform three different classification tasks: (1) sleep vs. wake, (2) REM vs. NREM, and (3) wake vs. REM vs. NREM. For the first task, they report mean accuracies ranging from approximately 62-75% across datasets.

To further evaluate our results, we compute the weighted average precision, recall, and F1 scores for the random forest classifier, where
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{F1} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN}.$$
Each score is calculated for each class, then a weighted average is taken based on the number of samples with that class label. Note that if any class has no predicted samples, the score for that class is set to 0. These calculations are done with the scikit-learn implementations. We then take that weighted average for the 10 runs and take the average and standard deviation of those 10 values.
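The weighted averaging of the per-class scores can be reimplemented directly from these formulas. The paper uses scikit-learn's implementation (`precision_recall_fscore_support` with `average='weighted'`); this numpy version is only for illustration:

```python
import numpy as np

def weighted_scores(y_true, y_pred):
    """Per-class precision, recall and F1 from TP/FP/FN counts, combined
    as an average weighted by each class's fraction of true samples.
    A class that is never predicted contributes precision 0."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    p = r = f = 0.0
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0
        w = np.mean(y_true == c)            # class support fraction
        p, r, f = p + w * prec, r + w * rec, f + w * f1
    return p, r, f

# Toy 2-class example with an imbalanced true label distribution.
p, r, f = weighted_scores([0, 0, 0, 1], [0, 0, 1, 1])
```

The support weighting is what makes these scores sensitive to the class imbalance discussed in the conclusion: a classifier that ignores rare states like NREM1 can still score well.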
In this section, our objective is to understand and visualize the sleep patterns of the eight patients in our data set. As we see in Figure 1, a hypnogram records the sleep state at each epoch over the entire sleep duration of a patient. In our data, since the patients sleep for different time durations, we have a different number of total epochs for each patient. However, we want to compare the hypnograms epoch by epoch. Therefore, we normalize them by truncating at the minimum number of epochs across patients. We can make some observations just by looking at the hypnograms, one being that the sleep pattern of patient CF046 looks very different from the others. Similarly, we observe that the initial sleep stages for patient CF050 are different from the rest of the patients; see Figure 13 in the appendix, where the hypnograms of the remaining patients are depicted. These observations lead to the conjecture that there is a relation between the sleep stage data and the severity of OSA in a patient. Therefore, we plan to use various statistical tools to study the sleep stage data and, based on our results, determine the pairs of patients with the same severity of OSA. This observation was made by one of the authors, KS, who was blinded to the information of patients' OSA severity.

We consider three measures to inspect sleep patterns, namely transition probabilities, Cohen's $\kappa$, and Kullback-Leibler divergence. To do so, we model sleep as a discrete Markov chain with 5 states. Let $(X_t)_{t=1}^N$ denote the stationary categorical process with state space $S = \{1, 2, 3, 4, 5\}$, corresponding to the sleep states Wake (W), REM (R), NREM1 (N1), NREM2 (N2), and NREM3 (N3), and let $\pi^i = (\pi_1^i, \ldots, \pi_5^i)$ denote the marginal distribution for each patient $i$. Table 3 shows the marginal distribution for the eight patients.
Table 3: Marginal distributions $\pi^i$ of all 8 patients over (W, R, N1, N2, N3), together with the ordering of the states from most to least frequent.

For a time-homogeneous Markov chain $X_t$, $t \in \mathbb{N}$, the transition probability is defined as $P(X_t = j \mid X_{t-1} = i) := p_{j|i}$ for any $i, j \in S$ and all $t \in \mathbb{N}$. We note that the sum of the probabilities in each row equals one, i.e. $\sum_{j \in S} p_{j|i} = 1$. The transition probability matrix and a visual representation of the 5-state Markov chain of one patient are depicted in Figures 8 and 9.
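The row-stochastic matrix $(p_{j|i})$ can be estimated from a labeled hypnogram by counting consecutive-epoch transitions. A numpy sketch; the state coding (0 = W, 1 = R, 2 = N1, 3 = N2, 4 = N3), toy epoch sequence, and function name are our own:

```python
import numpy as np

def transition_matrix(states, n_states=5):
    """Maximum-likelihood estimate of p_{j|i} = P(X_t = j | X_{t-1} = i)
    from a sequence of integer state labels. Each visited row sums to 1."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(states[:-1], states[1:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Toy hypnogram: wake, then NREM2 with a brief REM period, then wake.
epochs = [0, 0, 3, 3, 3, 1, 1, 3, 3, 0]
P = transition_matrix(epochs)
```

Rows for states that never occur (here N1 and N3) are left as zeros; in the patients' full-night sequences every state is visited, so all rows are proper distributions.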
       W      R      N1     N2     N3
W    0.85   0.004  0.104  0.04   0
R    0.082  0.882  0.035  0      0
N1   0.094  0.094  0.54   0.27   0
N2   0.048  0.004  0.012  0.925  0.009
N3   0.015  0      0      0.005  0.979

Fig. 8: Transition probability matrix of CF079.
Fig. 9: Visual representation of the Markov chain (CF079).

To analyse the serial dependence structure of the sleep states, we calculate Cohen's $\kappa$ at each lag $k$, as recommended by [29]. Cohen's $\kappa$ is analogous to the auto-correlation function of a real-valued process (continuous time series). It is defined as
$$\kappa(k) = \frac{\sum_{j \in S} \left( P_{jj}(k) - \pi_j^2 \right)}{1 - \sum_{j \in S} \pi_j^2},$$
where $P_{ij}(k) = P(X_t = i, X_{t-k} = j)$. The range of Cohen's $\kappa$ is $\left[ -\sum_j \pi_j^2 / \big(1 - \sum_j \pi_j^2\big),\ 1 \right]$. A positive (negative) value of $\kappa(k)$ means positive (negative) serial dependence, and a value of 0 means serial independence at lag $k$. The Cohen's $\kappa(k)$ plots for patients CF046 and CF050, together with representations of their transition probabilities, are shown in Figure 10.
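Under the definition above, $\kappa(k)$ can be estimated directly from a label sequence. A numpy sketch using a simplified plug-in estimator (our own, not the paper's code; the toy sequence is illustrative):

```python
import numpy as np

def cohens_kappa(states, k):
    """Plug-in estimate of kappa(k) = sum_j (P_jj(k) - pi_j^2) /
    (1 - sum_j pi_j^2), where P_jj(k) = P(X_t = j, X_{t-k} = j) and
    pi is the marginal distribution of the categorical series."""
    x = np.asarray(states)
    classes = np.unique(x)
    pi = np.array([np.mean(x == c) for c in classes])
    p_jj = np.array([np.mean((x[k:] == c) & (x[:-k] == c)) for c in classes])
    return (p_jj - pi ** 2).sum() / (1 - (pi ** 2).sum())

# A persistent series (long runs of the same state, as in a hypnogram
# where epochs tend to repeat) has kappa(1) close to 1.
runs = np.repeat([0, 3, 4, 3, 1], 40)   # toy hypnogram, runs of 40 epochs
kappa1 = cohens_kappa(runs, k=1)
```

Sweeping `k` over a range of lags reproduces the kind of $\kappa(k)$ curves shown in Figures 10 and 12.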
Fig. 10: Cohen's $\kappa$ plots and transition probability representations for patients CF046 (left) and CF050 (right).

We observe that Cohen's $\kappa(k)$ of patient CF050 is positive and gradually decreases until $k = 90$, after which it becomes negative; see Figure 10. This can be interpreted as each state tending to be followed by itself (with the correlation decreasing gradually), and then tending to flow from state to state. The Cohen's $\kappa(k)$ of CF046 is positive at every lag $k$, which means that the patient's sleep state tends to be followed by itself. However, the hypnogram of CF046 indicates that the patient stays at NREM3 for long periods of time during the first, second, and last sleep cycles, while there is frequent flow between sleep states during the middle sleep cycles. Although the hypnogram and $\kappa(k)$ look different for CF046 and CF050, in terms of OSA severity they are somewhat similar, in that CF046 is moderate and CF050 is severe.

We now compare the transition probability matrices $P_1, \ldots, P_8$ of the eight patients by calculating the Kullback-Leibler (KL) divergence [14] between the probability distributions in each corresponding row of $P$ and $P'$. As an example, the KL divergence between the first rows of $P$ and $P'$ is $D_{kl}(P_1 \| P'_1) = \sum_i p_{i|1} \ln (p_{i|1}/p'_{i|1})$. We then define the Kullback-Leibler divergence between $P$ and $P'$ as
$$D_{kl}(P \| P') = \sum_i \sum_j p_{j|i} \ln \frac{p_{j|i}}{p'_{j|i}}.$$
The KL divergence is asymmetric, that is, $D_{kl}(P \| P') \ne D_{kl}(P' \| P)$. We therefore present the sum of the two, i.e. $D_{kl}(P \| P') + D_{kl}(P' \| P)$, in Figure 11.

Ideally, the KL divergence would be highest between patients with no OSA and those with severe OSA. As we notice from the table in Figure 11, there is no clear pattern. The patient CF050 with
Fig. 11: Kullback-Leibler divergence between the 8 patients. OSA severity of CF079 & CF076: no, CF030 & CF011: mild, CF031 & CF046: moderate, CF055 & CF050: severe.

the highest ahi has a high divergence from the patient CF079 with the lowest ahi, and yet it is similar to the patient CF055 (high ahi) as well as to CF076 (low ahi). In conclusion, transition probabilities, Cohen's $\kappa$, and KL divergence were not able to distinguish patients with low ahi from those with high ahi.

In general, sleep patterns change through the night. At the beginning of sleep, a majority of people are mostly in states N2 and N3, with sporadic periods of N1 and short R periods. As the night progresses, the periods of N3 become shorter, while N1 and N2 remain similar with longer R periods. Both CF046 and CF050 do not follow these typical sleep patterns. The patient CF046 starts with good sleep, with a long period of N3, but after sleep cycle 2, frequent wake-ups prevent the patient from reaching deep sleep until the morning. The pattern of CF050 is somewhat opposite to that of CF046: first frequent wake-ups without falling into deep sleep during the early cycles, and then reaching N3 during the last sleep cycle. Neither of these two patterns is considered ideal sleep. As we might expect, the patterns of transition probabilities differ across the sleep cycles of the patients; see, for example, the transition probabilities in each sleep cycle for two patients in Figures 15 and 16 in the appendix. We therefore observe that it would be more useful to incorporate sleep cycles while calculating Cohen's $\kappa$, transition probabilities, and KL divergences.

In this paper, we first use a persistent homology approach for time series analysis to classify sleep states from three different PSG channels. Our classification accuracies vary drastically based on the number of classes used. Additionally, the F1 scores reveal an underlying issue of class imbalance, which should be taken into account when considering our accuracies.
In the future, we would like to further address this class imbalance issue, possibly by subsampling the data to get a relatively equal distribution of classes, in order to determine whether this method using persistent homology is worth pursuing further. We would also like to test whether other featurizations of persistence diagrams yield different results.

We modeled sleep stages as a 5-state discrete first-order Markov chain and tried to discern the sleep patterns of eight OSA patients with different OSA severity using transition probabilities, Cohen's $\kappa$, and Kullback-Leibler divergence. These three measures were not able to discern patients with low ahi from those with high ahi. This was our naive attempt to understand any relationship between the pattern of sleep stages and the severity of OSA. Our next research goal is to analyse our data incorporating the sleep cycles. Furthermore, we plan to consider higher order Markov chains.

Acknowledgements
This work was started at the second Women in Data Science and Mathematics workshop (WiSDM 2) in summer 2019, at The Institute for Computational and Experimental Research in Mathematics (ICERM), Brown University. We thank ICERM for the support. The authors would like to thank their fellow group members, Brenda Praggastis, Melissa Stockman, Kaisa Taipale, Marilyn Vazquez, Sunny Wang, and Emily Winn. We would especially like to thank Brenda Praggastis for her help preprocessing the time series data and setting up some of the code for the persistent homology analysis. We also thank Mathieu Chalifour for discussion of the polysomnography time series. ST was partially funded by NSF grants DMS-1800446 and CMMI-1800466. KS was partially funded by NSF grant DMS-1547357. GH would like to thank the Natural Sciences and Engineering Research Council of Canada (NSERC DG 2016-05167), a Seed grant from the Women and Children's Health Research Institute, a Biomedical Research Award from the American Association of Orthodontists Foundation, and the McIntyre Memorial fund from the School of Dentistry at the University of Alberta.
Appendix

This section contains figures referred to in the main article.

Fig. 12: Cohen's $\kappa$ for patients CF079 & CF076 (no OSA), CF030 & CF011 (mild OSA), CF031 & CF046 (moderate OSA), and CF055 & CF050 (severe OSA) (from left to right and top to bottom).

Fig. 13: Hypnograms of patients CF079 & CF076 (no OSA), CF030 & CF011 (mild OSA), CF031 & CF046 (moderate OSA), and CF055 & CF050 (severe OSA), with respective sleep cycles marked in red.
Fig. 14: Transition probabilities for patients CF079, CF076, CF030, CF031, CF046, CF055, and CF050, in increasing order of ahi scores (left to right, and top to bottom).
Fig. 15: Transition probabilities for the 8 sleep cycles of CF046 (displayed in order from left to right and top to bottom).
Fig. 16: Transition probabilities for the 4 sleep cycles of CF050 (displayed in order from left to right and top to bottom).
References
1. Henry Adams, Tegan Emerson, Michael Kirby, Rachel Neville, Chris Peterson, Patrick Shipman, Sofya Chepushtanova, Eric Hanson, Francis Motta, and Lori Ziegelmeier. Persistence images: A stable vector representation of persistent homology. The Journal of Machine Learning Research, 18(1):218–252, January 2017.
2. Jesse Berwald and Marian Gidea. Critical transitions in a model of a genetic regulatory system. Mathematical Biosciences & Engineering, 11(4):723–740, 2014.
3. Leo Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
4. Leo Breiman. Classification and Regression Trees. Routledge, 2017.
5. Yu-Min Chung, Chuan-Shen Hu, Yu-Lun Lo, and Hau-Tieng Wu. A persistent homology approach to heart rate variability analysis with an application to sleep-wake classification. arXiv preprint arXiv:1908.06856, 2019.
6. Herbert Edelsbrunner and John Harer. Computational Topology: An Introduction. American Mathematical Society, 2010.
7. Jerome H. Friedman. Greedy function approximation: a gradient boosting machine. Annals of Statistics, pages 1189–1232, 2001.
8. Jacob Goldberger, Geoffrey E. Hinton, Sam T. Roweis, and Russ R. Salakhutdinov. Neighbourhood components analysis. In Advances in Neural Information Processing Systems, pages 513–520, 2005.
9. Allen Hatcher. Algebraic Topology. Cambridge University Press, 2002.
10. Arthur E. Hoerl and Robert W. Kennard. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1):55–67, 1970.
11. Firas A. Khasawneh and Elizabeth Munch. Chatter detection in turning using persistent homology. Mechanical Systems and Signal Processing, 70-71:527–541, 2016.
12. Firas A. Khasawneh and Elizabeth Munch. Utilizing topological data analysis for studying signals of time-delay systems. In Advances in Delays and Dynamics, pages 93–106. Springer International Publishing, 2017.
13. Firas A. Khasawneh, Elizabeth Munch, and Jose A. Perea. Chatter classification in turning using machine learning and topological data analysis. In Tamas Insperger, editor, volume 51, pages 195–200, 2018.
14. S. Kullback and R. A. Leibler. On information and sufficiency. The Annals of Mathematical Statistics, 22(1):79–86, March 1951.
15. Marian Gidea. Topological data analysis of critical transitions in financial networks. In E. Shmueli, B. Barzel, and R. Puzis, editors, Springer Proceedings in Complexity. Springer, Cham, 2017.
16. Elizabeth Munch. A user's guide to topological data analysis. Journal of Learning Analytics, 4(2), 2017.
17. Audun Myers and Firas Khasawneh. On the automatic parameter selection for permutation entropy. arXiv preprint arXiv:1905.06443, 2019.
18. Steve Y. Oudot. Persistence Theory: From Quiver Representations to Data Analysis (Mathematical Surveys and Monographs). American Mathematical Society, 2015.
19. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
20. Musa Peker. A new approach for automatic sleep scoring: Combining Taguchi based complex-valued neural network and complex wavelet transform. Computer Methods and Programs in Biomedicine, 129:203–216, June 2016.
21. José A. Perea and John Harer. Sliding windows and persistence: An application of topological methods to signal analysis. Foundations of Computational Mathematics, pages 1–40, 2015.
22. Allan Rechtschaffen and Anthony Kales, editors. A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects. Public Health Service, U.S. Government Printing Office, Washington, D.C., 1968.
23. Nathaniel Saul and Chris Tralie. Scikit-tda: Topological data analysis for Python, 2019.
24. Johan A. K. Suykens and Joos Vandewalle. Least squares support vector machine classifiers. Neural Processing Letters, 9(3):293–300, 1999.
25. Floris Takens. Detecting strange attractors in turbulence. In Lecture Notes in Mathematics, pages 366–381. Springer Berlin Heidelberg, 1981.
26. Christopher Tralie. High-dimensional geometry of sliding window embeddings of periodic videos. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016.
27. Christopher Tralie, Nathaniel Saul, and Rann Bar-On. Ripser.py: A lean persistent homology library for Python. The Journal of Open Source Software, 3(29):925, September 2018.
28. Christopher J. Tralie and Jose A. Perea. (Quasi) periodicity quantification in video data, using topology. SIAM Journal on Imaging Sciences, 11(2):1049–1077, 2018.
29. Christian H. Weiß. An Introduction to Discrete-Valued Time Series. John Wiley & Sons, 2018.