Automated Stroke Rehabilitation Assessment using Wearable Accelerometers in Free-Living Environments

Abstract

Stroke is a major global health problem, and monitoring recovery levels is key for stroke survivors. However, traditional stroke rehabilitation assessment methods (such as the popular clinical assessment) can be subjective and expensive, and frequent clinic visits are inconvenient for patients. To address this issue, we developed an automated system, based on wearable sensing and machine learning techniques, that can predict the assessment score in an objective and continuous manner. With wrist-worn sensors, accelerometer data was collected from 59 stroke survivors in free-living environments over 8 weeks, and we aim to map the week-wise accelerometer data (3 days per week) to the assessment score by developing a signal processing and predictive modelling pipeline. To achieve this, we proposed two new types of features, which encode rehabilitation information from both the paralysed and non-paralysed sides while suppressing high-level noise such as irrelevant daily activities. Based on the proposed features, we further developed a longitudinal mixed-effects model with Gaussian process prior (LMGP), which can model the random effects caused by different subjects and time slots (during the 8 weeks). Comprehensive experiments were conducted to evaluate our system on both acute and chronic patients, and the results suggested its effectiveness.


XI CHEN,

School of Mathematics, Statistics & Physics, Newcastle University, UK

YU GUAN,

Open Lab, School of Computing, Newcastle University, UK

JIAN QING SHI∗,

School of Mathematics, Statistics & Physics, Newcastle University, UK

XIU-LI DU,

School of Mathematical Sciences, Nanjing Normal University, China, and School of Mathematics, Statistics & Physics, Newcastle University, UK

JANET EYRE,

Institute of Neurosciences, Newcastle University, UK

Additional Key Words and Phrases: Functional ability of upper limbs, wrist-worn accelerometer sensor, stroke rehabilitation, CAHAI score, wavelet features, longitudinal data analysis

ACM Reference Format:

Xi Chen, Yu Guan, Jian Qing Shi, Xiu-Li Du, and Janet Eyre. 2020. Automated Stroke Rehabilitation Assessment using Wearable Accelerometers in Free-Living Environments. 1, 1 (September 2020), 22 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

∗ The corresponding author.

Authors' addresses: Xi Chen, School of Mathematics, Statistics & Physics, Newcastle University, Newcastle, UK; Yu Guan, Open Lab, School of Computing, Newcastle University, Newcastle, UK; Jian Qing Shi, School of Mathematics, Statistics & Physics, Newcastle University, Newcastle, UK; Xiu-Li Du, School of Mathematical Sciences, Nanjing Normal University, China, and School of Mathematics, Statistics & Physics, Newcastle University, UK; Janet Eyre, Institute of Neurosciences, Newcastle University, Newcastle, UK.

ACM acknowledges that this contribution was authored or co-authored by an employee, contractor, or affiliate of the United States government. As such, the United States government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for government purposes only.

© 2020 Association for Computing Machinery. XXXX-XXXX/2020/9-ART $15.00. https://doi.org/10.1145/nnnnnnn.nnnnnnn. Vol. 1, No. 1. Publication date: September 2020.

It is widely known that stroke is a worldwide health problem causing disability and death [13]. It occurs when a blood clot cuts off the oxygen supply to a region of the brain. Hemiparesis, a very common post-stroke symptom, is the partial or complete paralysis of one side of the body, i.e., the side opposite to where the blood clot occurred, and it results in difficulties in performing activities, e.g., reduced arm movement. Patients can recover some of their capabilities with intense therapeutic input, so it is important to assess their recovery levels in time.

There are many approaches to assessing patients' recovery levels, including brain imaging [34], questionnaire-based assessment [14], and lab-based clinical assessment [6]. Brain imaging is deemed one of the most reliable approaches, as it can provide information on brain hemodynamics [34]. However, it requires special equipment and is very expensive. Questionnaire-based approaches investigate functional ability over a period using questionnaires, and can be categorised into two types: patient-completed and caregiver-completed [14]. Although much cheaper than brain imaging, they may contain a high level of bias. For instance, patients may not remember their daily activities (i.e., recall bias), and caregivers may not be able to observe the patient all the time. These biases make questionnaire-based approaches less precise. Lab-based clinical assessment approaches [6][4], on the other hand, provide an alternative solution. The patient's upper limb functionality is assessed by clinicians, e.g., by observing the patient's capability of finishing certain pre-defined activities [6]. Compared with brain imaging or questionnaire-based approaches, lab-based clinical assessment has a reasonable cost and high accuracy. However, the assessment is normally taken in clinics/hospitals, which is inconvenient for patients and makes continuous monitoring less feasible.

In this work, we aim to build an automated stroke rehabilitation assessment system using wearable sensing and machine learning techniques. Unlike the aforementioned approaches, our system can measure patients objectively and continuously in free-living environments. We collected accelerometer data using wrist-worn accelerometer sensors and designed compact features that can capture rehabilitation-related movements, before mapping these features to clinical assessment scores (i.e., the model training process). The trained model can be used to infer the recovery level of other, unseen patients.

In free-living environments, there are different types of movements, which may be related to different frequencies. For example, activities such as running or jumping may correspond to high-frequency signals, while sedentary behaviour or eating may correspond to low-frequency signals. In this study, instead of recognising daily activities, which is hard to achieve given limited annotation (i.e., no frame/sample-wise annotation), we transformed the raw accelerometer data to the frequency domain, where we designed features that can encode rehabilitation-related movements. Specifically, the wavelet transform [12] was used, and the wavelet coefficients can represent particular frequency information at certain decomposition scales. In [26], Preece et al. provided some commonly used wavelet features extracted from accelerometer data. However, to capture stroke rehabilitation-related activities, some domain knowledge should be taken into account to design better features. After stroke, patients have difficulties in moving one side (i.e., the paralysed side) due to the brain injury, and data from the paralysed side tends to describe more about upper limb functional ability than data from the non-paralysed side (i.e., the normal side). However, such signals can be significantly affected by personal behaviours or irrelevant daily activities, and such noise should be suppressed before developing the predictive models. Various wavelet features were studied, and we proposed two new types of features that can encode information from both the paralysed and non-paralysed sides, before developing predictive models for stroke rehabilitation assessment. Specifically, our contributions can be summarised as follows:

Data and System

The data was collected from 59 stroke survivors (26 acute patients and 33 chronic patients) over a duration of 8 weeks. Both wrist-worn accelerometer data (3 days per week in free-living environments) and clinical assessment scores were collected, based on which we developed a system that can provide automated rehabilitation assessment in an objective and continuous manner.

Features

We proposed two new types of compact wavelet-based features that can encode information from both the paralysed and non-paralysed sides to represent upper limb functional abilities for stroke rehabilitation assessment. They can also significantly suppress the influence of personal behaviours or irrelevant daily activities in data collected in noisy free-living environments.

Predictive Models

We developed the predictive models using two approaches: a linear fixed-effects model (LM) and a longitudinal mixed-effects model with Gaussian process prior (LMGP). Compared with LM, LMGP can model the random effects due to the heterogeneity among subjects in this 8-week longitudinal study.

Experiments

Comprehensive experiments were designed to study the effectiveness of our system. We empirically demonstrated the results with respect to the feature subsets used for modelling the mixed effects of LMGP. Our system was evaluated on both acute and chronic patients, and the results suggested the effectiveness of our approach.

As described in Sec. 1, lab-based clinical assessment is one of the most effective stroke rehabilitation assessment methods. In this section, we introduce one of the lab-based approaches, named Chedoke Arm and Hand Activity Inventory (CAHAI) scoring, based on which our automated system was developed. Some sensing and machine learning techniques for automated health assessment are also introduced.

Fig. 1. Left: Clinical behaviour assessment for CAHAI scoring; Right: The CAHAI score form [5].

CAHAI scoring is a clinical assessment method for stroke rehabilitation, and it is a fully validated measure [5] of upper limb functional ability with 9 tasks, each scored on a 7-point quantitative scale. In the assessment, the patient is asked to perform the 9 tasks, including opening a jar of coffee, drawing a line with a ruler, calling 911, etc., and the clinician scores each task based on the patient's performance on a scale from 1 (total assist) to 7 (complete independence, i.e., timely, safely) [5]. A task example, "call 911", is shown in Fig. 1. Since each of the 9 tasks contributes between 1 and 7 points, the minimum and maximum total scores are 9 and 63, respectively.
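The total-score range follows directly from the task count and the scale. As a minimal sketch (the function name `cahai_total` is hypothetical and not part of the original study), the arithmetic can be checked as follows:

```python
def cahai_total(task_scores):
    """Sum the CAHAI task scores; expects 9 tasks, each scored from 1 to 7."""
    if len(task_scores) != 9:
        raise ValueError("expected exactly 9 task scores")
    if any(not 1 <= s <= 7 for s in task_scores):
        raise ValueError("each task score must lie between 1 and 7")
    return sum(task_scores)


# Extreme cases: every task at the lowest or the highest level.
lowest = cahai_total([1] * 9)
highest = cahai_total([7] * 9)
```

Any clinically observed total therefore lies between `lowest` and `highest`.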

Recently, wearable sensing and machine learning (ML) techniques have been comprehensively studied for automated health assessment. Compared with traditional assessment approaches (e.g., self-reporting, clinical assessment, etc.), which are normally subjective and expensive, automated systems may provide an objective, low-cost alternative, which can also be used for continuous monitoring/assessment. Some automated systems were developed to assess the behaviours associated with conditions such as Parkinson's disease [28] [19], autism [24], and depression [23], or to monitor health status such as sleep [35], fatigue [3], etc.

After collecting behavioural or physiological signals (e.g., accelerometer, ECG, audio, etc.), assessment/monitoring models can be developed. For better performance, feature engineering is an important step before ML model development, as it can reduce data redundancy while improving the discrimination capacity [24]. However, it tends to be time-consuming and may require domain knowledge [35] [3] [28] for interpretable/clinically-relevant feature extraction. On the other hand, when interpretability is less of a requirement, deep learning methods can be applied directly to the raw signal for predictive model development [19] [35] [3] [23], yet this normally requires adequate data annotation for better model generalisation.

With the rapid development of sensing/ML techniques, researchers have also started to apply various sensors to stroke rehabilitation monitoring. In [11], a Kinect sensor was used in a home-like environment to detect the key joints such that stroke patients' behaviour can be assessed. In [15], a wireless surface electromyography (sEMG) device was used to monitor the muscle recruitment of post-stroke patients to see the effect of orthotic intervention. In clinical environments, five wearable sensors were placed on the trunk and on the upper arm and forearm of both upper limbs to measure the reaching behaviours of stroke survivors [21]. To monitor motor functions of stroke patients during rehabilitation sessions at clinics, an ecosystem including a jack and a cube for hand grasping monitoring, as well as a smart watch for arm dynamics monitoring, was designed [7]. These techniques can objectively assess/measure the behaviours of stroke patients, yet they are limited either to clinical environments [7][21][15] or to constrained environments (e.g., in front of a camera [11]). Most recently, wrist-worn sensors were used for stroke rehabilitation monitoring of patients in free-living environments [18][33]. However, the collected data included a high level of redundancy, and it is challenging to extract compact and discriminant features for reliable assessment.

In this section, we introduce our method, covering data collection, data pre-processing, feature design and predictive models. Our aim is to develop an automated model which can map the free-living 3-day accelerometer data to the CAHAI score. With the trained model, we can infer the CAHAI score in an objective and continuous manner. To achieve this, we first reduced the data redundancy via pre-processing and designed compact and discriminant features. Given the proposed features, a longitudinal mixed-effects model with Gaussian Process prior (LMGP) was developed, which can further reduce the impact of large variability (caused by different subjects and time slots) for better prediction results. The pipeline of our system is shown in Fig. 2.

Participants.

Data was collected as part of a bigger research study which aims to use a bespoke, professionally-written video game as a therapeutic tool for stroke rehabilitation [30]. Ethical approval was obtained from the National Research Ethics Committee and all work undertaken was in accordance with the Declaration of Helsinki. Written, informed consent was obtained from all subjects. A cohort of 59 stroke survivors, without significant cognitive or visual impairment, was recruited for the study. Patients were divided into two groups, i.e.,

• Group 1: the acute patient group, consisting of 26 participants who enrolled in the study within 6 months after stroke;
• Group 2: the chronic patient group, consisting of 33 participants who were 6 months or more post onset of stroke.

It is also worth noting that these 59 patients have different paralysed sides. They visited the clinic for CAHAI scoring every week (on a random weekday) for a duration of 8 weeks. During the 8 weeks, they were asked to wear two wrist-worn sensors for 3 full days (including night time) a week. They were also advised to remove the devices when showering or swimming. Since some patients needed time to become familiar with the data collection procedure, we did not use the first week's accelerometer data, for better data quality. The first week's CAHAI scores were used as medical history information.

Fig. 2. The pipeline of our automated stroke rehabilitation assessment system.

Fig. 3. The accelerometer data collected from AX3 device.

Data collection.

In contrast to the aforementioned sensing techniques [21][7][15][11], in this study we collected the accelerometer data from wrist-worn sensors in free-living environments. The sensor used for this study, i.e., AX3 [1], is a triaxial accelerometer logger that was designed for physical activity/behaviour monitoring, and it has been widely used in the medical community (e.g., for the UK Biobank physical activity study [10]). The wrist bands were also designed such that users can wear them comfortably without affecting their behaviours. The data was collected at a 100 Hz sampling rate, which can well preserve human daily activities [8]. Unlike human activity recognition, which requires sample-wise or frame-wise annotation [17] [25], the data collection in this study is relatively straightforward. The patients wore both wrist-worn sensors for 3 full days a week, before visiting clinicians for CAHAI scoring (i.e., week-wise annotation). In other words, we aim to use accelerometer data captured in free-living environments to represent the stroke survivors' upper limb activities and thereby measure the degree of paresis [20] (i.e., the CAHAI score).

One problem with most commercial sensors is that only summary data (e.g., step count from Fitbit), instead of raw data, are available. The algorithms producing the summary data are normally not open source and may vary from vendor to vendor, making the data collection and analysis device-dependent, and thus less practical in terms of generalisation and scalability. The AX3 device used in this study, on the other hand, outputs the raw acceleration in the x, y, z directions. In Fig. 3, we visualise the raw acceleration data (measured in g units, 1 g ≈ 9.8 m/s²) collected from both wrists. It is simple and transparent, making the collected data re-usable, which is crucial for research communities.

For accelerometer data, signal vector magnitude (VM) [22] is a popular representation, which is simply the magnitude of the triaxial acceleration data, defined as follows:

$$a(t) = \sqrt{a_x(t)^2 + a_y(t)^2 + a_z(t)^2},$$

where $a_x(t)$, $a_y(t)$, $a_z(t)$ are the accelerations along the x, y, z axes at timestamp $t$. The gravity effect can be removed by

$$VM(t) = |a(t) - 1|.$$

Because of its simplicity and effectiveness, VM has been widely used in health monitoring tasks, such as fall detection [22], physical activity monitoring [10], perinatal stroke assessment [16], etc. To further reduce the data volume, we used second-wise VM, i.e., the mean VM over each second (100 samples per second) is used as the new representation. Some second-wise VM examples can be found in Fig. 4.
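The VM computation and the second-wise averaging can be illustrated with a short sketch. This is not the authors' code: the function names are hypothetical, the input is assumed to be a list of (x, y, z) samples in g units at 100 Hz, and dropping an incomplete trailing second is an assumption made here for simplicity.

```python
import math


def vector_magnitude(ax, ay, az):
    """Gravity-removed vector magnitude |a(t) - 1| for one sample, in g."""
    return abs(math.sqrt(ax * ax + ay * ay + az * az) - 1.0)


def second_wise_vm(samples, rate_hz=100):
    """Mean VM over consecutive blocks of rate_hz samples.

    samples: list of (ax, ay, az) tuples in g units.
    A trailing block shorter than one second is dropped (an assumption).
    """
    vm = [vector_magnitude(*s) for s in samples]
    n_seconds = len(vm) // rate_hz
    return [
        sum(vm[i * rate_hz:(i + 1) * rate_hz]) / rate_hz
        for i in range(n_seconds)
    ]


# Two seconds of synthetic data: device at rest, then a constant 1.5 g reading.
rest = [(0.0, 0.0, 1.0)] * 100
moving = [(0.0, 0.0, 1.5)] * 100
vm_per_second = second_wise_vm(rest + moving)
```

At rest the accelerometer measures only gravity, so the first value is zero, while the second reflects the extra 0.5 g.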

We aim to build a model that can map the 3-day time-series data to the CAHAI score. The original input data, however, is very large. Even with the second-wise VM data, each trial still includes roughly 3 days × 24 hours × 3600 seconds = 259,200 samples.

Fig. 4. VM data of two patients (paralysed side).

Fig. 5. SAD representation of two patients (paralysed side).

For time-series analysis, wavelet analysis is a powerful tool to represent various aspects of non-stationary signals such as trends, discontinuities, and repeated patterns [2] [12] [26], which is especially useful in signal compression or noise reduction. Given its properties, wavelet features have been widely used in accelerometer-based daily living activity analytics [2]. In this work, we used the discrete wavelet transform (DWT) and the discrete wavelet packet transform (DWPT) as feature extractors, based on which new features were designed to preserve the stroke rehabilitation-related information. More details of DWT and DWPT can be found in Appendix 5.1. After applying the DWT and DWPT, VM signals can be transformed to wavelet coefficients at different decomposition scales. Specifically, we used the SAD representation, whose entry $SAD_j$ (normalised Sum of Absolute values of DWT coefficients at scale $j$) can preserve the energy of the daily activities at decomposition scale $j$, where $j \in \{1.1, 1.2, 1.3, 1.4, 2, 3, 4, 5, 6, 7\}$ in this work. The technical details of the SAD representation can be found in Appendix 5.2. Through the wavelet transformation, the long sequence (e.g., VM data in Fig. 4) can be transformed into a compact representation (e.g., SAD features in Fig. 5), based on which we further design features for CAHAI score regression.

Based on the compact SAD representation, we aim to further design effective features for reliable CAHAI score regression. To have a better understanding of the behaviour patterns, we selected two different patients (in terms of recovery level) for visualisation.

First, we visualised the paralysed side of patient la012 (with CAHAI score 55) and patient la040 (with CAHAI score 26) using the VM representation (Fig. 4) and the SAD representation (Fig. 5). From both figures, we can see the limitations of both representations. Although VM shows distinct patterns for the two patients, the difference may also be related to large intra-class variability (e.g., personalised behaviour patterns). Moreover, the redundancy as well as the high dimensionality make it hard to model. On the other hand, SAD has low dimensionality, yet both patients exhibited a high level of similarity, indicating that SAD of the paralysed side alone is not enough for distinguishing patients with different recovery levels.

Fig. 6. SAD representation of two patients (with both paralysed/non-paralysed sides).

Fig. 7. Two proposed PNP representations for two patients.

Given the observations, we further visualised SAD features from both paralysed/non-paralysed sides for both patients in Fig. 6. We can see that patient la012 (with high recovery level) uses both hands (almost) equally, while patient la040 (with low recovery level) tends to use the non-paralysed side more. These observations motivated us to design new features from both sides, instead of the paralysed side alone. Specifically, we proposed two types of features, which encode the ratio information between the paralysed side and the non-paralysed side, namely $PNP^1_k$ and $PNP^2_k$:

$$PNP^1_k = \frac{SAD^{p}_k}{SAD^{np}_k}, \qquad PNP^2_k = \frac{SAD^{np}_k - SAD^{p}_k}{SAD^{np}_k + SAD^{p}_k},$$

where $k = 1.1, 1.2, 1.3, 1.4, 2, 3, 4, 5, 6, 7$, and $p$ and $np$ refer to the paralysed side and non-paralysed side, respectively. We also visualised patient la012 and patient la040 using the newly proposed features $PNP^1$ and $PNP^2$ in Fig. 7, from which we can see that the proposed features can well distinguish these two patients, in contrast to the basic wavelet features SAD (Fig. 5). Based on these distinguishable low-dimensional features, it is feasible to build CAHAI regression models.

We listed 4 types of features, i.e., the original wavelet features extracted from the paralysed ($SAD^p$) and non-paralysed ($SAD^{np}$) sides separately, as well as the two newly proposed features ($PNP^1$ and $PNP^2$). Based on 10 scales, we can form a 40-dimensional feature vector, as shown in Table 1. However, there exists a certain level of noise and redundancy (especially in $SAD^p$ and $SAD^{np}$), so it is crucial to develop a feature selection mechanism or a powerful prediction model for higher performance.

| Feature type | Dimension | Feature entries for each type |
|---|---|---|
| $SAD^p$ | 10 | $SAD^p_{1.1}, SAD^p_{1.2}, SAD^p_{1.3}, SAD^p_{1.4}, SAD^p_{2}, SAD^p_{3}, \ldots, SAD^p_{7}$ |
| $SAD^{np}$ | 10 | $SAD^{np}_{1.1}, SAD^{np}_{1.2}, SAD^{np}_{1.3}, SAD^{np}_{1.4}, SAD^{np}_{2}, SAD^{np}_{3}, \ldots, SAD^{np}_{7}$ |
| $PNP^1$ | 10 | $PNP^1_{1.1}, PNP^1_{1.2}, PNP^1_{1.3}, PNP^1_{1.4}, PNP^1_{2}, PNP^1_{3}, \ldots, PNP^1_{7}$ |
| $PNP^2$ | 10 | $PNP^2_{1.1}, PNP^2_{1.2}, PNP^2_{1.3}, PNP^2_{1.4}, PNP^2_{2}, PNP^2_{3}, \ldots, PNP^2_{7}$ |

Table 1. The wavelet features at 10 scales.
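The two ratio features can be computed directly from the per-scale SAD values of each wrist. The following is a minimal sketch: the function name and the example SAD values are hypothetical, and both sides are assumed to have strictly positive SAD values so that the ratios are defined.

```python
def pnp_features(sad_p, sad_np):
    """Return (PNP1, PNP2) dictionaries keyed by scale.

    sad_p / sad_np map each scale to the SAD value of the paralysed /
    non-paralysed side. Both are assumed strictly positive.
    """
    pnp1 = {k: sad_p[k] / sad_np[k] for k in sad_p}
    pnp2 = {
        k: (sad_np[k] - sad_p[k]) / (sad_np[k] + sad_p[k]) for k in sad_p
    }
    return pnp1, pnp2


# Hypothetical SAD values at scale 2: balanced arm use vs. a dominant
# non-paralysed arm.
balanced = pnp_features({2: 0.30}, {2: 0.30})
one_sided = pnp_features({2: 0.10}, {2: 0.30})
```

Dividing the numerator and denominator of the second feature by the non-paralysed SAD value shows that PNP2 = (1 - PNP1) / (1 + PNP1), so the two features are decreasing functions of each other; the second one stays within [-1, 1] for positive SAD values.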

Based on the proposed wavelet features, we aim to develop predictive models that can map features to the CAHAI score. Although we reduced the data redundancy significantly, there still exists noise in the data, which may encode irrelevant information. It is crucial to develop a robust mechanism to select the most relevant features, and here we used a popular linear model for feature selection (LASSO). To model the nonlinear random effects in the longitudinal study, we also proposed to use a Gaussian Process (GP) regression model.

It is worth noting that our model also takes advantage of the medical history information (i.e., the CAHAI score from the first visit) to predict CAHAI scores for the remaining 7 weeks (i.e., week 2 to week 8). From the perspective of practical application, the CAHAI score from the initial week (referred to as ini) may be used as an important normalisation factor for different individuals.
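As an illustration of how LASSO performs feature selection on standardised inputs, the sketch below implements a plain cyclic coordinate-descent solver with NumPy and runs it on synthetic data. Everything in it is an assumption for demonstration purposes: the solver, the penalty value `lam`, the five synthetic features, and the function names are not from the paper, which does not state which LASSO implementation was used.

```python
import numpy as np


def zscore(X):
    """Standardise each column to zero mean and unit standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; values inside become zero."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent.

    y is assumed centred, so no intercept is fitted.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for k in range(p):
            # Partial residual with feature k removed from the current fit.
            partial = y - X @ beta + X[:, k] * beta[k]
            rho = X[:, k] @ partial / n
            beta[k] = soft_threshold(rho, lam) / (X[:, k] @ X[:, k] / n)
    return beta


# Synthetic data: only features 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = zscore(rng.normal(size=(200, 5)))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=200)
y = y - y.mean()
beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [k for k in range(5) if beta[k] != 0.0]
```

The L1 penalty sets the coefficients of uninformative features exactly to zero, so `selected` lists the retained features, while the remaining coefficients are shrunk towards zero relative to their true values.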

Since there may exist some redundant or irrelevant features for the prediction task, we first proposed to use LASSO (Least Absolute Shrinkage and Selection Operator) for feature selection. Given the 41-dimensional input variables (40 wavelet features and 1 CAHAI score from the initial week), we first standardised the data using z-normalisation, i.e., each feature entry $x_k$ is normalised as $x_k^{new} = (x_k - \bar{x}_k)/s_k$, where $\bar{x}_k$ and $s_k$ are the mean and standard deviation of the $k$-th feature. With LASSO, useful features can be selected, based on which the prediction model can be developed. For simplicity, we first used a linear model to predict the target CAHAI score $y_i$:

$$y_i = \mathbf{x}_i^T \boldsymbol{\beta} + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma^2), \tag{1}$$

where $i$ stands for the $i$-th trial (out of all patients' trials during week 2 to week 8); $\mathbf{x}_i$ represents the selected feature vector; $\boldsymbol{\beta}$ is the model parameter vector to be estimated, and $\epsilon_i$ is the random noise term.

It is simple to use a linear model for CAHAI score prediction. However, it ignores the heterogeneity among subjects in this longitudinal study. To model the heterogeneity, we proposed to use a nonlinear mixed-effects model [32], which consists of a fixed-effects part and a random-effects part. Specifically, the random-effects part contributes mainly to modelling the heterogeneity, making the prediction process subject/time-adaptive for longitudinal studies. The longitudinal mixed-effects model with Gaussian Process prior (LMGP) is defined as follows:

$$y_{i,j} = \mathbf{x}_{i,j}^T \boldsymbol{\beta} + g(\boldsymbol{\phi}_{i,j}) + \epsilon_{i,j}, \quad \epsilon_{i,j} \sim N(0, \sigma^2), \tag{2}$$

where $i, j$ stand for the $i$-th patient at the $j$-th visit (from week 2 to week 8); $\epsilon_{i,j}$ is an independent random error and $\sigma^2$ is its variance. In Eq. (2), $\mathbf{x}_{i,j}^T \boldsymbol{\beta}$ is the fixed-effects part and $g(\boldsymbol{\phi}_{i,j})$ represents the nonlinear random-effects part; the latter can be modelled using a non-parametric Bayesian approach with a GP prior [32].

It is worth noting that in LMGP the fixed-effects part $\mathbf{x}_{i,j}^T \boldsymbol{\beta}$ explains a linear relationship between the input features and CAHAI, while the random-effects part $g(\boldsymbol{\phi}_{i,j})$ is used to explain the variability caused by differences among individuals or time slots during different weeks. By considering both parts, LMGP provides a solution of personalised modelling for this longitudinal data analysis. In LMGP, it is important to select input features to model both parts, and we refer to them as fixed-effects features and random-effects features, respectively. The effect of the fixed-effects features will be studied in the experimental evaluation section.

For LMGP training, we first ignored the random-effects part and only optimised the parameters $\hat{\boldsymbol{\beta}}$ of the fixed-effects part (via ordinary least squares, OLS). With the estimated parameters $\hat{\boldsymbol{\beta}}$, the residual

$$r_{ij} = y_{ij} - \mathbf{x}_{i,j}^T \hat{\boldsymbol{\beta}} = g(\boldsymbol{\phi}_{i,j}) + \epsilon_{i,j}$$

can be calculated, from which we can model the random effects

$$g(\boldsymbol{\phi}_{i,j}) \sim GP(0, K(\cdot, \cdot; \boldsymbol{\theta})).$$

In this paper we chose $K(\cdot, \cdot; \boldsymbol{\theta})$ as the following squared exponential (covariance) kernel function:

$$K(\cdot, \cdot; \boldsymbol{\theta}) = v_0 \exp\left\{ -\frac{1}{2} \sum_{q=1}^{Q} w_q \left( \phi_{i,j,q} - \phi'_{i,j,q} \right)^2 \right\},$$

and the model parameters $\boldsymbol{\theta} = (v_0, w_1, \ldots, w_Q)$ can be estimated by using an empirical Bayesian method (in this work we used the R package GPFDA). It is worth noting that other covariance kernel functions can also be applied, and more information can be found in [27] [31].

Let $C \in \mathbb{R}^{J \times J}$ be the covariance matrix of $g(\boldsymbol{\phi}_{i,j})$, where $J$ refers to the total number of visits for each subject, and $C$ can be calculated based on the kernel function and its parameters $\boldsymbol{\theta}$. Then the fitted value of the random-effects part can be calculated by

$$\hat{g}(\boldsymbol{\phi}_{i,j}) = C \left( C + \sigma^2 I \right)^{-1} r_{ij},$$

where $I$ is the identity matrix. The variance of the random-effects part is given by

$$\mathrm{Var}(\hat{g}(\boldsymbol{\phi}_{i,j})) = \sigma^2 \left( C + \sigma^2 I \right)^{-1} C.$$

Finally, the fitted value of LMGP is given by

$$\hat{y}_{ij} = \mathbf{x}_{i,j}^T \hat{\boldsymbol{\beta}} + C \left( C + \sigma^2 I \right)^{-1} r_{ij}.$$

Given that, the 95% predictive interval can be written as

$$\left[ \hat{y}_{ij} - 1.96 \sqrt{\mathrm{Var}(\hat{g}(\boldsymbol{\phi}_{i,j}))}, \; \hat{y}_{ij} + 1.96 \sqrt{\mathrm{Var}(\hat{g}(\boldsymbol{\phi}_{i,j}))} \right].$$

In this section, several experiments were designed to evaluate the proposed features as well as the proposed prediction system. The patients were split into two groups according to the nature of the disease, i.e., the acute patient group (26 subjects) and the chronic patient group (33 subjects), and experiments were conducted on both groups separately. Since CAHAI score prediction is a typical regression problem, we used the root mean square error (RMSE) as the evaluation metric, and lower mean RMSE values indicate better performance.
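To make the two-stage LMGP fit and the RMSE metric concrete, the sketch below fits a single synthetic subject with NumPy: OLS for the fixed effects, then a GP smoother with a squared exponential kernel on the residuals. It is a simplified illustration, not the authors' pipeline: the kernel hyperparameters are fixed by hand rather than estimated by empirical Bayes (the paper uses the R package GPFDA for that), the week index is used as a hypothetical random-effects input, and all names and data values are invented.

```python
import numpy as np


def se_kernel(phi_a, phi_b, v0, w):
    """Squared exponential kernel v0 * exp(-0.5 * sum_q w_q (a_q - b_q)^2)."""
    diff = phi_a[:, None, :] - phi_b[None, :, :]
    return v0 * np.exp(-0.5 * np.sum(w * diff ** 2, axis=-1))


def fit_lmgp(X, phi, y, v0, w, sigma2):
    """Two-stage fit for one subject: OLS fixed effects, then GP on residuals.

    Hyperparameters (v0, w, sigma2) are taken as given in this sketch.
    Returns fitted values and the 95% interval bounds.
    """
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta_hat
    C = se_kernel(phi, phi, v0, w)
    A = C + sigma2 * np.eye(len(y))
    g_hat = C @ np.linalg.solve(A, r)          # C (C + sigma^2 I)^{-1} r
    var_g = sigma2 * np.linalg.solve(A, C)     # sigma^2 (C + sigma^2 I)^{-1} C
    y_hat = X @ beta_hat + g_hat
    half_width = 1.96 * np.sqrt(np.maximum(np.diag(var_g), 0.0))
    return y_hat, y_hat - half_width, y_hat + half_width


def rmse(y_true, y_pred):
    """Root mean square error between two arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# One synthetic subject with visits in weeks 2..8.
rng = np.random.default_rng(1)
weeks = np.arange(2, 9, dtype=float)
X = np.column_stack([np.ones(7), rng.normal(size=7)])
phi = weeks[:, None]  # hypothetical random-effects input
y = 30.0 + 4.0 * X[:, 1] + 2.0 * np.sin(weeks) + 0.3 * rng.normal(size=7)

y_hat, lower, upper = fit_lmgp(X, phi, y, v0=4.0, w=np.array([0.5]), sigma2=0.1)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
rmse_lm = rmse(y, X @ beta_ols)
rmse_lmgp = rmse(y, y_hat)
```

On the training data the LMGP fit can never be worse than the OLS fit, because the remaining residual equals $\sigma^2 (C + \sigma^2 I)^{-1}$ applied to the OLS residual, which shrinks every component. A fair comparison of the two models therefore needs held-out subjects or visits.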

PNP

In this subsection, we evaluated the effectiveness of the proposed PNP features. The most straightforward approach is to calculate their correlation coefficients against the target CAHAI scores. Table 2 reports the corresponding correlation coefficients ($PNP^1_k$ and $PNP^2_k$ at 10 scales) for the acute and chronic patient groups. The correlation coefficients of the original wavelet features (paralysed side $SAD^p_k$ and non-paralysed side $SAD^{np}_k$ at 10 scales) against the CAHAI score are also reported for comparison. From Table 2, we can see:

• PNP features generally have higher correlation coefficients (in absolute value) than SAD features against the CAHAI scores.
• for PNP features, the correlation is strongest at the lower scales and weakens markedly at the highest scales (e.g., k = 6 and k = 7).
• for chronic patients, SAD features (on the non-paralysed side) exhibit correlation scores comparable with those of PNP features.

These observations indicate the necessity of selecting useful features when building the prediction system. Although PNP demonstrates more powerful prediction capacity, in some cases SAD (e.g., extracted from the non-paralysed side) may also provide important information for a certain population (e.g., chronic patients).

            Acute Patients                          Chronic Patients
Scale (k)   SAD^p_k  SAD^np_k  PNP^1_k  PNP^2_k     SAD^p_k  SAD^np_k  PNP^1_k  PNP^2_k
k=1.1       -0.41     0.32      0.68    -0.70        0.22     0.49      0.56    -0.56
k=1.2       -0.42     0.33      0.69    -0.71        0.24     0.50      0.57    -0.56
k=1.3       -0.43     0.32      0.70    -0.72        0.23     0.51      0.58    -0.57
k=1.4       -0.42     0.33      0.69    -0.71        0.24     0.51      0.57    -0.57
k=2         -0.42     0.31      0.69    -0.71        0.23     0.50      0.56    -0.55
k=3         -0.42     0.27      0.67    -0.68        0.25     0.50      0.53    -0.52
k=4         -0.43     0.20      0.60    -0.63        0.26     0.50      0.48    -0.47
k=5         -0.42     0.10      0.49    -0.52        0.27     0.50      0.43    -0.42
k=6         -0.37    -0.01      0.35    -0.38        0.27     0.48      0.35    -0.34
k=7         -0.30    -0.10      0.19    -0.20        0.28     0.45      0.25    -0.24

Table 2. Correlation coefficients of the wavelet features and CAHAI score.
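The entries of Table 2 are correlation coefficients between one feature and the CAHAI score. As a minimal sketch, assuming the Pearson coefficient is meant (the text does not name the variant) and using hypothetical toy values rather than study data, the computation for a single feature could look like this:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data: one feature value and one CAHAI score per patient-week.
feature = [0.10, 0.25, 0.40, 0.55, 0.70]
cahai = [20.0, 28.0, 35.0, 41.0, 52.0]

r = pearson_r(feature, cahai)
print(f"correlation with CAHAI: {r:.3f}")
```

Repeating this for each feature and each scale k would fill one column of a table shaped like Table 2.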

To better understand the relationship between these features, we also computed the cross-correlation between each pair of features. Note that we also included the medical history feature, i.e., the initial week-1 CAHAI score (ini). From Fig. 8, we have the following observations:

• For both patient groups, the PNP features are highly correlated. PNP features within the same type ($PNP^1$ or $PNP^2$) tend to be positively correlated, while PNP features from different types tend to be negatively correlated.
• For acute patients, SAD features for each side (paralysed side $SAD^p$ or non-paralysed side $SAD^{np}$) are highly (positively) correlated, yet the SAD features from different sides are less correlated. For chronic patients, however, SAD features from both sides are highly (positively) correlated.
• In general, PNP features, SAD features and the medical history information ini are less correlated with one another, indicating that they carry potentially complementary information to be fused.

Fig. 8. Cross-correlation of the candidate features. (a) Acute patients. (b) Chronic patients.

Based on the above findings, it is clear that within each feature type there may exist a high level of feature redundancy, and it is necessary to select the most relevant feature subsets. For acute and chronic patient groups, the optimal feature subset may vary due to the different movement patterns (e.g., on the paralysed/non-paralysed sides). Although the proposed PNP features can alleviate this problem to some extent, it is beneficial to combine the less correlated features (i.e., PNP, SAD, and ini), and due to the feature redundancy, it is crucial to extract a compact representation for the prediction model development.

Based on the feature correlation analysis in Sec. 4.1, it is important to select the most relevant features from the various sources (i.e., PNP, SAD, and ini). Unlike the correlation-based approach, which selects each feature independently (by its correlation coefficient), LASSO selects features by solving a linear regression problem with an $\ell_1$ sparsity constraint, and thus takes the relationships between the features into consideration. Based on LASSO, we selected the most important features for both acute and chronic patients, as shown in Table 3.

Acute Patients (6 features):     PNP , PNP , SAD^np , SAD^p . , SAD^np , ini
Chronic Patients (10 features):  PNP . , SAD^p , SAD^np , PNP . , PNP , PNP . , ini , PNP , SAD^np . , SAD^np

Table 3. Selected features using LASSO
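The sparsity mechanism behind LASSO-based selection can be illustrated with a short coordinate-descent sketch. This is not the authors' implementation: the objective scaling, the regularisation strength `lam`, and the toy design matrix are all assumptions made for illustration, and the toy columns are deliberately orthogonal so the result is easy to verify by hand.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values with |z| <= lam become 0."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    Assumes the columns of X and the target y are already centred,
    so no intercept term is fitted.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution added back.
            partial = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial / n
            beta[j] = soft_threshold(rho, lam) / col_scale[j]
    return beta

# Hypothetical toy design: three centred, mutually orthogonal columns.
X = np.array([
    [1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1],
    [-1, 1, 1], [-1, 1, -1], [-1, -1, 1], [-1, -1, -1],
], dtype=float)
y = 3.0 * X[:, 0] + 0.2 * X[:, 2]  # strong signal on column 0, weak on column 2

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j in range(X.shape[1]) if abs(beta[j]) > 1e-12]
print("coefficients:", beta, "selected features:", selected)
```

Features whose coefficient is shrunk to zero are dropped; ranking the survivors by absolute coefficient gives an importance ordering of the kind used later for the fixed-effects part.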

Based on the selected features, we further performed leave-one-patient-out cross validation on the two patient groups separately using the linear fixed-effects model. As shown in Fig. 10, the prediction results for the chronic patients (mean RMSE 3.29) are much better than those for the acute group (mean RMSE 7.24). One of the main reasons might be the nature of the patient groups. In Fig. 9, we plotted the clinical CAHAI distribution (i.e., the ground truth CAHAI) from week 2 to week 8, and we can see that the clinical CAHAI scores are very stable for chronic patients. On the other hand, for acute patients, who suffered from stroke in the past 6 months, health status was less stable and was affected significantly by various factors; in this case the simple linear fixed-effects model yields less promising results.

Fig. 9. Clinically assessed CAHAI distribution with respect to visit.

Fig. 10. Linear model prediction vs clinical CAHAI; Left: Acute patients (RMSE 7.24); Right: Chronic patients (RMSE 3.29).
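The leave-one-patient-out protocol used for the linear fixed-effects model can be sketched as follows. This is an illustration under assumptions: ordinary least squares stands in for the fixed-effects fit, "mean RMSE" is taken to be the average of the per-patient RMSE values, and the data are synthetic.

```python
import numpy as np

def lopo_mean_rmse(X, y, patient_ids):
    """Leave-one-patient-out CV for an ordinary least-squares linear model.

    All visits of one patient are held out per fold; returns the mean of
    the per-patient RMSE values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    ids = np.asarray(patient_ids)
    design = np.column_stack([np.ones(len(y)), X])  # intercept + features
    fold_rmse = []
    for pid in np.unique(ids):
        held_out = ids == pid
        beta, *_ = np.linalg.lstsq(design[~held_out], y[~held_out], rcond=None)
        residual = design[held_out] @ beta - y[held_out]
        fold_rmse.append(np.sqrt(np.mean(residual ** 2)))
    return float(np.mean(fold_rmse))

# Synthetic example: 3 patients with 3 visits each, one feature.
patient_ids = [0, 0, 0, 1, 1, 1, 2, 2, 2]
x = np.arange(9, dtype=float).reshape(-1, 1)
y_linear = 2.0 + 3.0 * x[:, 0]   # perfectly linear scores
y_shifted = y_linear.copy()
y_shifted[6:] += 5.0             # patient 2 deviates from the common line

print("perfectly linear:", lopo_mean_rmse(x, y_linear, patient_ids))
print("one patient shifted:", lopo_mean_rmse(x, y_shifted, patient_ids))
```

Holding out whole patients, rather than individual visits, keeps a patient's own weeks from leaking into the training set. The shifted patient in the second call illustrates how a subject-specific offset inflates the error of a purely fixed-effects model.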

Fig. 11. LMGP prediction vs clinical CAHAI; Left: Acute patients (RMSE 5.75); Right: Chronic patients (RMSE 3.12)

Based on the selected features, we also developed LMGP for both patient groups. Here, we used the selected features (from Table 3) as both the fixed-effects and the random-effects features. As with the linear fixed-effects model, we evaluated the performance using leave-one-patient-out cross validation. The mean RMSE values are reported in Fig. 11, from which we can see that LMGP further reduces the errors compared with the fixed-effects linear model, with mean RMSE 5.75 for acute patients and 3.12 for chronic patients (versus 7.24 and 3.29, respectively). Based on LMGP, we also performed "continuous monitoring" on 4 patients (two from each patient group) from week 2 to week 8 by predicting the week-wise CAHAI scores (with mean and 95% confidence interval), as shown in Fig. 12; this is particularly helpful when uncertainty measurement is required.

Fig. 12. Continuous monitoring using LMGP for 4 patients (top: two chronic patients; bottom: two acute patients).
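The week-wise mean and 95% interval produced by LMGP follow from the random-effects formulas $\hat{g} = C(C + \sigma^2 I)^{-1} r$ and $\mathrm{Var}(\hat{g}) = \sigma^2 (C + \sigma^2 I)^{-1} C$. The sketch below evaluates them for one subject. The squared-exponential kernel, its hyperparameters, the noise variance and all numbers are assumptions for illustration; they are not the fitted values from the study.

```python
import numpy as np

def squared_exponential(phi, variance=1.0, length_scale=1.0):
    """Covariance matrix C (J x J) for J visits with features phi (J x d)."""
    phi = np.asarray(phi, dtype=float)
    sq_dist = ((phi[:, None, :] - phi[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale ** 2)

def random_effect_fit(C, r, sigma2):
    """Return g_hat = C (C + sigma2 I)^{-1} r and the pointwise variance,
    i.e. the diagonal of sigma2 (C + sigma2 I)^{-1} C."""
    C = np.asarray(C, dtype=float)
    r = np.asarray(r, dtype=float)
    A = C + sigma2 * np.eye(len(r))
    g_hat = C @ np.linalg.solve(A, r)
    var = sigma2 * np.diag(np.linalg.solve(A, C))
    return g_hat, var

def predictive_interval(fixed_part, g_hat, var, z=1.96):
    """Combine fixed and random parts; return (y_hat, lower, upper)."""
    y_hat = np.asarray(fixed_part, dtype=float) + g_hat
    half_width = z * np.sqrt(var)
    return y_hat, y_hat - half_width, y_hat + half_width

# Hypothetical subject with 7 visits (weeks 2 to 8).
weeks = np.arange(2, 9, dtype=float).reshape(-1, 1)
C = squared_exponential(weeks, variance=4.0, length_scale=2.0)
fixed_part = np.array([30.0, 31.0, 33.0, 34.0, 36.0, 37.0, 38.0])  # x^T beta_hat
residuals = np.array([1.5, 1.0, 0.5, -0.2, -0.8, -1.0, -1.4])      # r = y - x^T beta_hat

g_hat, var = random_effect_fit(C, residuals, sigma2=1.0)
y_hat, lower, upper = predictive_interval(fixed_part, g_hat, var)
for week, m, lo, hi in zip(weeks[:, 0], y_hat, lower, upper):
    print(f"week {int(week)}: {m:.2f} [{lo:.2f}, {hi:.2f}]")
```

Using `np.linalg.solve` rather than forming the explicit inverse is a numerical design choice; the quantities computed are the ones in the formulas above.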

LMGP includes two key parts, i.e., the linear fixed-effects part and the non-linear random-effects part, and it is important to choose the key features for modelling each of them. Since the fixed-effects part measures the main (linear) relationship between the input features and the predicted CAHAI, we studied the corresponding feature subsets. For the random-effects part, we used the full set of LASSO features (as shown in Table 3).

To select the most important feature subset for modelling the fixed-effects part, we ranked the features (from Table 3) based on two criteria: LASSO coefficients, and correlation coefficients (between features and CAHAI, as described in Sec. 4.1). Table 4 shows the ranked features. Only the top 50% of the features (i.e., the top 3 features for acute patients and the top 5 features for chronic patients) were used to model the fixed-effects part; the settings and the results are reported in Table 5.

Feature ranking criterion: LASSO coefficients (absolute value)
  Acute Patients:    PNP , PNP , SAD^np , SAD^p . , SAD^np , ini
  Chronic Patients:  PNP . , SAD^p , SAD^np , PNP . , PNP , PNP . , ini , PNP , SAD^np . , SAD^np
Feature ranking criterion: Correlation coefficients (absolute value)
  Acute Patients:    PNP , ini , SAD^p . , PNP , SAD^np , SAD^np
  Chronic Patients:  ini , PNP . , PNP . , PNP . , SAD^np . , SAD^np , PNP , SAD^np , PNP , SAD^p

Table 4. Feature importance ranking for acute/chronic patients.

Acute Patients
Fixed-effects features                                                      Random-effects features       RMSE
full 6 features in Table 3                                                  full 6 features in Table 3    5.75
top 3 features (Corr criterion in Table 4): PNP , ini , SAD^p .             full 6 features in Table 3    5.37
top 3 features (LASSO criterion in Table 4): PNP , PNP , SAD^np             full 6 features in Table 3    5.51

Chronic Patients
Fixed-effects features                                                                       Random-effects features        RMSE
full 10 features in Table 3                                                                  full 10 features in Table 3    3.12
top 5 features (Corr criterion in Table 4): ini , PNP . , PNP . , PNP . , SAD^np .           full 10 features in Table 3    3.20
top 5 features (LASSO criterion in Table 4): PNP . , SAD^p , SAD^np , PNP . , PNP            full 10 features in Table 3    5.12

Table 5. LMGP's fixed-effects part modelling results (RMSE) based on different feature subsets

It is interesting to observe that the performance may change substantially under different settings. Specifically, modelling the LMGP's fixed-effects part with the top feature subsets further reduces the errors for acute patients (from 5.75 to 5.37 or 5.51), whereas it increases the errors for chronic patients (from 3.12 to 3.20 or 5.12). The top 5 features selected via the LASSO criterion yield the worst performance for chronic patients; one possible explanation is the absence of the feature ini (the initial health condition), a major attribute for chronic patient modelling (see Fig. 9).

Models                                RMSE (Acute)   RMSE (Chronic)
Support vector regression (linear)    7.47           3.25
Support vector regression (rbf)       9.67           4.92
Random forest regression              8.19           3.93
Linear fixed-effects model            7.24           3.29
LMGP (ours)                           5.75           3.12

Table 6. Model comparison

Based on the selected features in Table 3, we also compared our model with other baselines, namely support vector regression (SVR) and random forest regression (RF), for the acute and chronic patient groups. Leave-one-subject-out cross validation was used, with mean RMSE as the performance measure (Table 6). We observed that the linear models (linear SVR and the linear fixed-effects model) yielded reasonable results, while the non-linear baselines (SVR with rbf kernel, and RF) performed worse. One explanation is over-fitting: the trained non-linear models do not generalise well to unseen patients/environments in this longitudinal study setting. RF's performance may also suffer significantly from the low dimensionality of the selected features (6 features for acute patients and 10 features for chronic patients). Given the simplicity of the linear models and the designed low-dimensional features, linear models tend to suffer less from over-fitting, giving reasonable results in these challenging environments. Compared with the baselines, our LMGP further models the longitudinal mixed effects (i.e., with a linear fixed-effects part and a non-linear random-effects part), making the system adaptive to different subjects/environments and achieving the lowest errors.

In this work, we developed an automated stroke rehabilitation assessment system using wearable sensing and machine learning techniques. We collected accelerometer data using wrist-worn sensors, based on which we built models for CAHAI score prediction that can provide objective and continuous rehabilitation assessment. To map the long time-series (i.e., 3-day accelerometer data) to the CAHAI score, we proposed a pipeline spanning data cleaning, feature design, and predictive model development. Specifically, we proposed two compact features which capture the rehabilitation characteristics well while suppressing irrelevant daily activities, which is crucial when analysing data collected in free-living environments. We further employed LMGP, which makes the model adaptive to different subjects and different time slots (across different weeks). Comprehensive experiments were conducted on both acute and chronic patients, and promising results were achieved, especially on the chronic patient group (mean RMSE 3.12, versus 5.75 for the acute group). We also studied different feature subsets for modelling the fixed-effects part of LMGP, and the experiments suggested that the errors can be further reduced for the challenging acute patient population.

REFERENCES

[1] Axivity Ltd. [n.d.]. AX3, 3-Axis Logging Accelerometer. https://axivity.com/product/ax3. [Online; accessed July-2020].
[2] Fouaz S Ayachi, Hung P Nguyen, Catherine Lavigne-Pelletier, Etienne Goubault, Patrick Boissy, and Christian Duval. 2016. Wavelet-based algorithm for auto-detection of daily living activities of older adults captured by multiple inertial measurement units (IMUs). Physiological measurement 37, 3 (March 2016), 442–461. https://doi.org/10.1088/0967-3334/37/3/442
[3] Yang Bai, Yu Guan, and Wan-Fai Ng. 2020. Fatigue Assessment Using ECG and Actigraphy Sensors. In Proceedings of the 24th International Symposium on Wearable Computers (ISWC). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3410531.3414308
[4] S. Barreca, P. Stratford, C. Lambert, L. Masters, and D. Streiner. 2005. Test-Retest Reliability, Validity, and Sensitivity of the Chedoke Arm and Hand Activity Inventory: a New Measure of Upper-Limb Function for Survivors of Stroke. Arch Phys Med Rehabil 86 (2005), 1616–1622.
[5] S. Barreca, P. Stratford, L. Masters, C. Lambert, J. Griffiths, and C. McBay. 2006. Validation of Three Shortened Versions of the Chedoke Arm and Hand Activity Inventory. Physiother. Can. 58 (2006), 1–9.
[6] Susan R Barreca, Paul W. Stratford, Cynthia L. Lambert, Lisa M. Masters, and David L Streiner. 2005. Test-retest reliability, validity, and sensitivity of the Chedoke arm and hand activity inventory: a new measure of upper-limb function for survivors of stroke. Archives of physical medicine and rehabilitation 86, 8 (2005), 1616–22.
[7] Maxence Bobin, Franck Bimbard, Mehdi Boukallel, Margarita Anastassova, and Mehdi Ammi. 2019. SpECTRUM: Smart ECosystem for sTRoke patient's Upper limbs Monitoring. 13 (02 2019). https://doi.org/10.1016/j.smhl.2019.01.001
[8] Carlijn Bouten, Karel Koekkoek, Maarten Verduin, Rens Kodde, and Jan Janssen. 1997. A Triaxial Accelerometer and Portable Data Processing Unit for the Assessment of Daily Physical Activity. IEEE transactions on bio-medical engineering 44 (04 1997), 136–47. https://doi.org/10.1109/10.554760
[9] I. Daubechies. 2006. Orthonormal bases of compactly supported wavelets. Commun. Pure. Appl. Math (2006), 909–996.
[10] Aiden Doherty, Dan Jackson, Nils Hammerla, Thomas Plötz, Patrick Olivier, Malcolm H. Granat, Tom White, Vincent T. van Hees, Michael I. Trenell, Christoper G. Owen, Stephen J. Preece, Rob Gillions, Simon Sheard, Tim Peakman, Soren Brage, and Nicholas J. Wareham. 2017. Large Scale Population Assessment of Physical Activity Using Wrist Worn Accelerometers: The UK Biobank Study. PLOS ONE 12, 2 (02 2017), 1–14. https://doi.org/10.1371/journal.pone.0169649
[11] Elham Dolatabadi, Ying Zhi, Bing Ye, Marge Coahran, Giorgia Lupinacci, Alex Mihailidis, Rosalie Wang, and Babak Taati. 2017. The toronto rehab stroke pose dataset to detect compensation during stroke rehabilitation therapy. 375–381. https://doi.org/10.1145/3154862.3154925
[12] Andrew T. Walden Donald B. Percival. 2000. Wavelet methods for time series analysis (1 ed.). Cambridge University Press. http://gen.lib.rus.ec/book/index.php?md5=6f8eb3e5445b7c12210cb5b1d50a40f3
[13] G. Donnan, M. Fisher, M. Macleod, and S. Davis. 2008. Stroke. Lancet
[14] American journal of epidemiology 166 (11 2007), 832–40. https://doi.org/10.1093/aje/kwm148
[15] A. C. Ganesh, B. S. Renganathan, C. Rajakumaran, S. P. Preejith, K. Shubham, J. Jayaraj, and S. Mohanasankar. 2018. Post-Stroke Rehabilitation Monitoring Using Wireless Surface Electromyography: A Case Study. In . 1–6.
[16] Yan Gao, Yang Long, Yu Guan, Anna Basu, Jessica Baggaley, and Thomas Ploetz. 2019. Towards Reliable, Automated General Movement Assessment for Perinatal Stroke Screening in Infants Using Wearable Accelerometers. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 3, 1, Article 12 (March 2019), 22 pages. https://doi.org/10.1145/3314399
[17] Yu Guan and Thomas Plötz. 2017. Ensembles of Deep LSTM Learners for Activity Recognition Using Wearables. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1, 2, Article 11 (June 2017), 28 pages. https://doi.org/10.1145/3090076
[18] Shane Halloran, Lin Tang, Yu Guan, Jian Qing Shi, and Janet Eyre. 2019. Remote Monitoring of Stroke Patients' Rehabilitation Using Wearable Accelerometers. In Proceedings of the 23rd International Symposium on Wearable Computers (London, United Kingdom) (ISWC 19). Association for Computing Machinery, New York, NY, USA, 72–77. https://doi.org/10.1145/3341163.3347731
[19] Nils Y. Hammerla, James M. Fisher, Peter Andras, Lynn Rochester, Richard Walker, and Thomas Plotz. 2015. PD Disease State Assessment in Naturalistic Environments Using Deep Learning. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (Austin, Texas) (AAAI). AAAI Press, 1742–1748.
[20] Henrik [Stig Jørgensen], Hirofumi Nakayama, Hans Otto Raaschou, and Tom [Skyhøj Olsen]. 1999. Stroke: Neurologic and Functional Recovery The Copenhagen Stroke Study. Physical Medicine and Rehabilitation Clinics of North America 10, 4 (1999), 887–906. https://doi.org/10.1016/S1047-9651(18)30169-4 A New Century Approach to Stroke Management and Rehabilitation.
[21] H. Jung, J. Park, J. Jeong, T. Ryu, Y. Kim, and S. I. Lee. 2018. A wearable monitoring system for at-home stroke rehabilitation exercises: A preliminary study. In . 13–16.
[22] D. M. Karantonis, M. R. Narayanan, M. Mathie, N. H. Lovell, and B. G. Celler. 2006. Implementation of a real-time human movement classifier using a triaxial accelerometer for ambulatory monitoring. IEEE Transactions on Information Technology in Biomedicine 10, 1 (2006), 156–167.
[23] Bethany Little, Ossama Alshabrawy, Daniel Stow, I. Ferrier, Roisin McNaney, Daniel Jackson, Karim Ladha, Cassim Ladha, Thomas Ploetz, Jaume Bacardit, Patrick Olivier, Peter Gallagher, and John O'Brien. 2020. Deep learning-based automated speech detection as a marker of social functioning in late-life depression. Psychological Medicine (01 2020), 1–10. https://doi.org/10.1017/S0033291719003994
[24] Thomas Plötz, Nils Y. Hammerla, Agata Rozga, Andrea Reavis, Nathan Call, and Gregory D. Abowd. 2012. Automatic Assessment of Problem Behavior in Individuals with Developmental Disabilities. In Proceedings of the 2012 ACM Conference on Ubiquitous Computing (Pittsburgh, Pennsylvania) (UbiComp). Association for Computing Machinery, New York, NY, USA, 391–400. https://doi.org/10.1145/2370216.2370276
[25] T. Plötz and Y. Guan. 2018. Deep Learning for Human Activity Recognition in Mobile Computing. Computer 51, 5 (2018), 50–59.
[26] S. J. Preece, J. Y. Goulermas, L. P. J. Kenney, and D. Howard. 2009. A Comparison of Feature Extraction Methods for the Classification of Dynamic Activities From Accelerometer Data. IEEE Transactions on Biomedical Engineering 56, 3 (March 2009), 871–879. https://doi.org/10.1109/TBME.2008.2006190
[27] CE. Rasmussen and CKI. Williams. 2006. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, USA. 248 pages.
[28] Rana zia ur Rehman, Silvia Din, Yu Guan, Alison Yarnall, Jian Shi, and Lynn Rochester. 2019. Selecting Clinically Relevant Gait Characteristics for Classification of Early Parkinson's Disease: A Comprehensive Machine Learning Approach. Scientific Reports
[29] Proceedings of the 20th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Vol. 20 Biomedical Engineering Towards the Year 2000 and Beyond (Cat. No.98CH36286), Vol. 3. 1523–1526 vol.3. https://doi.org/10.1109/IEMBS.1998.747177
[30] J.Q. Shi, Y. Cheng, J. Serradilla, G. Morgan, C. Lambden, G. Ford, C. Price, H. Rodgers, T. Cassidy, L. Rochester, and J.A. Eyre. 2013. Evaluating Functional Ability of Upper Limbs after Stroke Using Video Game Data. In International Conference on Brain and Health Informatics (Lecture Notes in Artificial Intelligence, Vol. 8211), K. Imamura, S. Usui, T. Shirao, T. Kasamatsu, L. Schwabe, and N. Zhong (Eds.). Springer, 181–192.
[31] Jian Shi and Taeryon Choi. 2011. Gaussian Process Regression Analysis for Functional Data. London: Chapman and Hall/CRC. https://doi.org/10.1201/b11038
[32] J.Q. Shi, B. Wang, E.J. Will, and R.M. West. 2012. Mixed-effects Gaussian process functional regression models with application to dose response curve prediction. Statistics in Medicine 31, 26 (2012), 3165–3177. https://doi.org/10.1002/sim.4502
[33] Lin Tang, Shane Halloran, Jian Qing Shi, Yu Guan, Chunzheng Cao, and Janet Eyre. 2020. Evaluating upper limb function after stroke using the free-living accelerometer data. Statistical Methods in Medical Research (2020). https://doi.org/10.1177/0962280220922259
[34] Max Wintermark, Musa Sesay, Emmanuel Barbier, Katalin Borbély, William P. Dillon, James D. Eastwood, Thomas C. Glenn, Cécile B. Grandin, Salvador Pedraza, Jean-François Soustiel, Tadashi Nariai, Greg Zaharchuk, Jean-Marie Caillé, Vincent Dousset, and Howard Yonas. 2005. Comparative Overview of Brain Perfusion Imaging Techniques. Stroke 36, 9 (2005), e83–e99. https://doi.org/10.1161/01.str.0000177884.72657.8b
[35] Bing Zhai, Ignacio Perez-Pozuelo, Emma A. D. Clifton, Joao Palotti, and Yu Guan. 2020. Making Sense of Sleep: Multimodal Sleep Stage Classification in a Large, Diverse Population Using Movement and Cardiac Sensing. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 4, 2, Article 67 (June 2020), 33 pages. https://doi.org/10.1145/3397325

APPENDIX

5.1 Discrete wavelet transform and discrete wavelet packet transform

The DWT procedure includes two parts: decomposition and reconstruction. The decomposition part is the main focus in this project. We now consider the DWT in more detail using matrix algebra:

$$\mathbf{W} = \mathcal{W} X, \tag{3}$$

where $\mathbf{W}$ is the vector of DWT coefficients at the different scales, $\mathcal{W}$ is the orthonormal matrix containing the different orthonormal wavelet bases (more details can be found in [9] and [12]), which satisfies $\mathcal{W}^T \mathcal{W} = I_N$, and $X$ is the raw signal. For a signal $X$ of length $N = 2^J$, the $N \times N$ orthonormal matrix $\mathcal{W}$ can be separated into $J+1$ submatrices, each of which produces a partition of the vector $\mathbf{W}$ of DWT coefficients at scale $j$, $j = 1, 2, \ldots, J$. More specifically, Eq. (3) can be rewritten as follows:

$$\mathcal{W} X = \begin{bmatrix} \mathcal{W}_1 \\ \mathcal{W}_2 \\ \vdots \\ \mathcal{W}_J \\ \mathcal{V}_J \end{bmatrix} X = \begin{bmatrix} \mathcal{W}_1 X \\ \mathcal{W}_2 X \\ \vdots \\ \mathcal{W}_J X \\ \mathcal{V}_J X \end{bmatrix} = \begin{bmatrix} W_1 \\ W_2 \\ \vdots \\ W_J \\ V_J \end{bmatrix} = \mathbf{W}, \tag{4}$$

where $W_j$ is a column vector of length $N/2^j$ representing the differences in adjacent weighted averages at scale $j$ (from scale 1 to scale $J$), and $V_J$ is the last block of $\mathbf{W}$, which has the same length as $W_J$. $W_j$ contains the detailed coefficients at scale $j$, and $V_J$ contains the approximated coefficients at the $J$-th level. The submatrix $\mathcal{W}_j$ has dimension $N/2^j \times N$, where $j = 1, 2, \ldots, J$, and $\mathcal{V}_J$ has the same dimension as $\mathcal{W}_J$. Note that the rows of the orthonormal matrix $\mathcal{W}$ depend on the decomposition level $j$. In other words, the value of $J$ depends on the DWT decomposition scale of the raw signal. The maximum decomposition level $j$ equals $J$ since our signal $X$ has length $N = 2^J$.

We now further consider the discrete wavelet packet transform (DWPT). The DWPT is an expansion of the discrete wavelet transform. In the DWT, each scale is calculated by passing only the previous approximated wavelet coefficients through discrete-time low-pass and high-pass quadrature mirror filters. In the DWPT, however, both the detailed and the approximated coefficients are decomposed to create the full binary tree. More details can be found in [12].

Finally, taking 3 scales of decomposition as an example, Fig. 13 illustrates the difference between the DWT and the DWPT.

Fig. 13. Comparison of the DWT and the DWPT at three scales of decomposition.

In the discrete wavelet transform (DWT), $W_j$ represents the DWT coefficients at the $j$-th decomposition scale. The DWT can be written as $\mathbf{W} = \mathcal{W} X$, where $\mathbf{W}$ is a column vector of length $2^J$ with $\mathbf{W} = [W_1, W_2, \ldots, W_J, V_J]^T$, and $\mathcal{W}$ is the orthonormal matrix, which satisfies $\mathcal{W}^T \mathcal{W} = I_N$ and contains the different filters. Due to the orthonormality of the DWT, we have $X = \mathcal{W}^T \mathbf{W}$ and $\|X\|^2 = \|\mathbf{W}\|^2$, so $\|W_j\|^2$ gives the energy of the DWT coefficients at decomposition level $j$.

Now the energy preserving condition can be written as:

$$\|X\|^2 = \|\mathbf{W}\|^2 = \sum_{j=1}^{J} \|W_j\|^2 + \|V_J\|^2, \tag{5}$$

where $X$ is our VM data (the signal vector magnitude of the accelerometer data; see Sec. 3.2) with length $N$, and $j = 1, 2, \ldots, J$ is the DWT decomposition level. $W_j$ denotes the detailed coefficients at scale $j$, a vector of length $N/2^j$ representing the differences in adjacent weighted averages at scale $j$. $V_J$ denotes the approximated coefficients at the $J$-th level and has the same length as $W_J$. Based on the decomposition, each $\|W_j\|^2$ represents the part of the energy in our VM data associated with a certain frequency band [26][12].

Then the sample variance from [12] can be decomposed as:

$$\hat{\sigma}_X^2 = \frac{1}{N} \|\mathbf{W}\|^2 - \bar{X}^2 = \frac{1}{N} \sum_{j=1}^{J} \|W_j\|^2. \tag{6}$$

The term $\|W_j\|^2 / N$ represents the sample variance (corresponding to scale $j$ of the DWT decomposition) in our VM data $X$.

There are many wavelet features (e.g., [26]) for the classification of dynamic activities from accelerometer data using the DWT. On this basis, we extract features from the energy preserving condition and the sample variance mentioned previously. We aim to look for features which reflect the recovery level among the stroke patients (see Sec. 3.3). Now, we define the features at the $j$-th level of the discrete wavelet transform and discrete wavelet packet transform:

$$SSD_j = \frac{\|W_j\|^2}{N/2^j} = \frac{2^j \|W_j\|^2}{N}.$$

For the detailed coefficients $W_j$ at decomposition level $j$, $\|W_j\|^2$ represents their energy, and the raw data has length $N$, so $N/2^j$ is the number of coefficients at that level. Hence the physical interpretation of $SSD_j$ is the point energy at decomposition level $j$. Moreover, since by Eq. (6) $\|W_j\|^2 / N$ represents the sample variance at decomposition level $j$, $SSD_j$ has the properties of both the energy preserving condition and the sample variance in wavelet analysis, up to the constant $2^j$.

In contrast to $SSD_j$ (the Sum of Square values of the DWT coefficients at scale $j$, with normalisation), we define another feature, $SAD_j$, which is the Sum of Absolute values of the DWT coefficients at scale $j$ (with normalisation):

$$SAD_j = \frac{\|W_j\|_1}{N/2^j} = \frac{2^j \|W_j\|_1}{N}.$$

After checking the correlation between the important wavelet feature PNP (Sec. 3.3) and the CAHAI score, we found that the PNP features based on SAD perform better than those based on SSD (Table 7). Hence we use the commonly used feature $SAD_j$ in this paper.

            Acute Patients                                          Chronic Patients
Scale (k)   PNP^1_k(SSD)  PNP^2_k(SSD)  PNP^1_k(SAD)  PNP^2_k(SAD)  PNP^1_k(SSD)  PNP^2_k(SSD)  PNP^1_k(SAD)  PNP^2_k(SAD)
k=1.1       0.60          -0.65         0.68          -0.70         0.45          -0.45         0.56          -0.56
k=1.2       0.60          -0.66         0.69          -0.71         0.46          -0.45         0.57          -0.56
k=1.3       0.63          -0.69         0.70          -0.72         0.49          -0.48         0.58          -0.57
k=1.4       0.62          -0.68         0.69          -0.71         0.47          -0.47         0.57          -0.57
k=2         0.65          -0.69         0.69          -0.71         0.45          -0.45         0.56          -0.55
k=3         0.63          -0.67         0.67          -0.68         0.39          -0.38         0.53          -0.52
k=4         0.59          -0.63         0.60          -0.63         0.31          -0.30         0.48          -0.47
k=5         0.46          -0.50         0.49          -0.52         0.29          -0.27         0.43          -0.42
k=6         0.32          -0.38         0.35          -0.38         0.20          -0.16         0.35          -0.34
k=7         0.16          -0.19         0.19          -0.20         0.13          -0.10         0.25          -0.24

Table 7. The correlation between SAD and SSD based wavelet features and CAHAI score for acute and chronic patients.
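To make the definitions of $SSD_j$ and $SAD_j$ concrete, the sketch below runs a full DWT on a short signal and checks the energy preserving condition (Eq. 5) and the variance decomposition (Eq. 6) numerically. The Haar wavelet is an assumption chosen for brevity, since this section does not state which wavelet filter was used, and the eight sample values are hypothetical.

```python
import math

def haar_dwt(x, levels):
    """Orthonormal Haar DWT.

    Returns ([W_1, ..., W_levels], V_levels): detail coefficients per scale
    and the final approximation coefficients.
    """
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = [float(v) for v in x]
    details = []
    root2 = math.sqrt(2.0)
    for _ in range(levels):
        half = len(approx) // 2
        detail = [(approx[2 * i + 1] - approx[2 * i]) / root2 for i in range(half)]
        approx = [(approx[2 * i + 1] + approx[2 * i]) / root2 for i in range(half)]
        details.append(detail)
    return details, approx

def wavelet_features(details, n):
    """SSD_j = 2^j ||W_j||^2 / N and SAD_j = 2^j ||W_j||_1 / N for each scale j."""
    ssd = [2 ** j * sum(w * w for w in W) / n for j, W in enumerate(details, start=1)]
    sad = [2 ** j * sum(abs(w) for w in W) / n for j, W in enumerate(details, start=1)]
    return ssd, sad

# Hypothetical VM samples with N = 2**3, so J = 3 is the full decomposition.
x = [4, 6, 10, 12, 8, 6, 5, 5]
details, approx = haar_dwt(x, levels=3)
ssd, sad = wavelet_features(details, len(x))

signal_energy = sum(v * v for v in x)
coeff_energy = sum(w * w for W in details for w in W) + sum(v * v for v in approx)
variance_from_details = sum(s / 2 ** j for j, s in enumerate(ssd, start=1))

print("energy (signal, coefficients):", signal_energy, coeff_energy)
print("sample variance from detail coefficients:", variance_from_details)
print("SSD:", ssd)
print("SAD:", sad)
```

Dividing each $SSD_j$ by $2^j$ recovers $\|W_j\|^2 / N$, so their sum reproduces the sample variance of the signal, as Eq. (6) states for a full decomposition.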

In our analysis, we assume the discrete wavelet decomposition level $J = 7$.

Table 8. The frequency domain from scale 1 to scale 7 by using DWT.

So far, we have decomposed the VM data $X$ to get $W_1, W_2, \ldots, W_7$ using the DWT. Since the frequency band at scale 1 is wide (0.5 Hz to 1 Hz), it is better to divide it into smaller bands. Using the DWPT in Appendix 5.1, we can further decompose $W_1$ into $W_{1.1}$, $W_{1.2}$, $W_{1.3}$ and $W_{1.4}$, which are the results of the 3rd stage of the DWPT; each coefficient vector has length $N/2^3$, the same dimension as the coefficients in the third level of the DWT decomposition, that is

$$\|X\|^2 = \|\mathbf{W}\|^2 = \|W_{1.1}\|^2 + \|W_{1.2}\|^2 + \|W_{1.3}\|^2 + \|W_{1.4}\|^2 + \sum_{j=2}^{J} \|W_j\|^2 + \|V_J\|^2.$$

Now we have coefficients at 10 decomposition scales using the DWT and the DWPT: $W_{1.1}$, $W_{1.2}$, $W_{1.3}$, $W_{1.4}$, $W_2$, $W_3$, $W_4$, $W_5$, $W_6$ and $W_7$. Based on these detailed coefficients, we define the commonly used wavelet features again:

$$\text{Scale 1.1}: \quad SAD_{1.1} = \frac{\|W_{1.1}\|_1}{N/2^3} = \frac{2^3 \|W_{1.1}\|_1}{N},$$
$$\text{Scale 1.2}: \quad SAD_{1.2} = \frac{\|W_{1.2}\|_1}{N/2^3} = \frac{2^3 \|W_{1.2}\|_1}{N},$$
$$\text{Scale 1.3}: \quad SAD_{1.3} = \frac{\|W_{1.3}\|_1}{N/2^3} = \frac{2^3 \|W_{1.3}\|_1}{N},$$
$$\text{Scale 1.4}: \quad SAD_{1.4} = \frac{\|W_{1.4}\|_1}{N/2^3} = \frac{2^3 \|W_{1.4}\|_1}{N},$$
$$\text{Scale } j: \quad SAD_j = \frac{\|W_j\|_1}{N/2^j} = \frac{2^j \|W_j\|_1}{N}, \quad j = 2, 3, 4, 5, 6, 7.$$

These 10 features provide reliable and valid information from different frequency bands. The frequency band of each feature, across the 10 scales, is listed in Table 9:

Scale       Frequency
Scale 1.1   0.5 Hz – 0.625 Hz
Scale 1.2   0.625 Hz – 0.75 Hz
Scale 1.3   0.75 Hz – 0.875 Hz
Scale 1.4   0.875 Hz – 1 Hz
Scale 2     0.25 Hz – 0.5 Hz
Scale 3     0.125 Hz – 0.25 Hz
Scale 4     0.0625 Hz – 0.125 Hz
Scale 5     0.0312 Hz – 0.0625 Hz
Scale 6     0.0156 Hz – 0.0312 Hz
Scale 7     0.0078 Hz – 0.0156 Hz
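The band edges in Table 9 follow a simple halving rule. The sketch below reproduces them, assuming a 1 Hz upper band edge (implied by scale 1 covering 0.5 Hz to 1 Hz) and four equal-width DWPT sub-bands at scale 1; the function names are hypothetical.

```python
def dwt_band(scale, upper_edge_hz=1.0):
    """Nominal (low, high) band in Hz of the DWT detail coefficients at a scale."""
    return upper_edge_hz / 2 ** scale, upper_edge_hz / 2 ** (scale - 1)

def split_band(low, high, parts=4):
    """Split a band into equal-width sub-bands, as for scales 1.1 to 1.4."""
    width = (high - low) / parts
    return [(low + i * width, low + (i + 1) * width) for i in range(parts)]

bands = {}
for i, sub_band in enumerate(split_band(*dwt_band(1)), start=1):
    bands[f"1.{i}"] = sub_band
for scale in range(2, 8):
    bands[str(scale)] = dwt_band(scale)

for name, (low, high) in bands.items():
    print(f"Scale {name}: {low:.4f} Hz to {high:.4f} Hz")
```

Each integer scale halves the band of the previous one, which is why the scale 1 band is the widest and the one worth subdividing.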