Network memory in the movement of hospital patients carrying drug-resistant bacteria

Abstract

Hospitals constitute highly interconnected systems that bring into contact an abundance of infectious pathogens and susceptible individuals, thus making infection outbreaks both common and challenging. In recent years, there has been a sharp rise in the incidence of antimicrobial resistance amongst healthcare-associated infections, a situation now considered endemic in many countries. Here we present network-based analyses of a data set capturing the movement of patients harbouring drug-resistant bacteria across three large London hospitals. We show that there are substantial memory effects in the movement of hospital patients colonised with drug-resistant bacteria. Such memory effects break first-order Markovian transitive assumptions and substantially alter the conclusions from the analysis, specifically on node rankings and the evolution of diffusive processes. We capture variable-length memory effects by constructing a lumped-state memory network, which we then use to identify overlapping communities of wards. We find that these communities of wards display a quasi-hierarchical structure at different levels of granularity which is consistent with different aspects of patient flows related to hospital locations and medical specialties.


Ashleigh C. Myall, Robert L. Peach, Andrea Y. Weiße, Frances Davies, Siddharth Mookerjee, Alison Holmes, and Mauricio Barahona
Department of Mathematics, Imperial College London, London, UK. Department of Infectious Disease, Imperial College London, London, UK. Current address: School of Informatics, University of Edinburgh, Scotland, UK. Imperial College Healthcare NHS Trust, London, UK.

October 8, 2020


Introduction

Antimicrobial resistance (AMR) poses one of the greatest risks to human health [1]. Currently, around 700,000 people worldwide die from infections with resistant pathogens each year, and this number is estimated to rise to up to 10 million by 2050 [2, 3]. Hospitals and other healthcare facilities act as key vectors for the spread of AMR through healthcare-associated infections (HAI) [4]. Persistent colonisation of hospital patients and the networked nature of hospital processes underlying patient mobility will likely cause AMR to remain prevalent [5]. Moreover, several factors exacerbate the spread of AMR in healthcare facilities, including the selective pressures generated by increased antimicrobial usage, and the large pool of vulnerable patients, who are more susceptible to infections [6]. The need for infection prevention and control (IPC) can therefore not be overstated.

Understanding the transmission dynamics of AMR promises valuable insights to improve IPC strategies. Key to these measures will be the analysis of patient pathways capturing the movement of patients carrying AMR during their hospital stay. Like many real-world systems, healthcare facilities have complex structure, which when ignored can limit the insights into the underlying dynamic processes. In this study we focus on mapping the movement pathways of patients known to carry antimicrobial-resistant bacteria onto physical structures of the hospitals. Specifically, we focus on patients colonised with Carbapenemase-producing Enterobacteriaceae (CPE). CPE is a particularly concerning form of AMR that confers resistance to carbapenems, broad-spectrum antibacterials often used as last-line antibiotics. CPE infections have recently seen a global surge amongst HAIs [7, 8].

Networks provide a powerful formalism to analyse the movement of patients in hospitals.
Nodes typically represent physical locations within the hospital, such as wards, and edges represent the flow of patients between these locations, with edge weights encoding the volume of patient flow from one location to another. To facilitate analysis, we can aggregate the movements of individual patients into probabilities of transitioning between hospital wards [9, 10]. Typically, patient trajectories are broken down into individual transitions between wards: first, the number of transitions between each pair of wards is summed across all patients and subsequently, for each ward, the sum of all outgoing transitions is normalised to one. The constructed network may then be interpreted as a first-order Markov model, where a random walker transitions with a probability proportional to the observed outflow volume from the current node to others in the network [11].

This dynamical assumption, whilst useful because of its simplicity and ease of implementation, is however limited by the assumption that transitions between nodes are independent of prior nodes within the patient pathway. Previous studies have indeed shown that first-order Markovian dynamics are not sufficient to fully model network dynamics of disease propagation [12, 13]. Akin to shipping trajectories or passenger movement between airports, patient movement in hospitals tends to follow particular patterns dictated by medical or operational constraints. In particular, it is plausible that patient trajectories could bear ‘memory’, that is, a subsequent move depends on several or all previous locations visited, and not solely on the current location, leading to transitive dependence in the data.

Introduced by Shannon [14], higher-order memory models have shown relevance across a number of applications and a wide range of real-world movement data [15, 16, 17, 18, 19, 20], including several epidemiological data sets [21, 22].
Ignoring such transitive dependencies and modelling patient movement via memoryless, first-order Markov models can distort both network topology and conclusions on the underlying process [23]. Despite the clear importance of transitive dependence, to date we find only one study [24] of hospital patient movement accounting for these relationships, and none looking at AMR across healthcare facilities. Hence, in this study we investigate evidence for and implications of transitive dependencies in the movement patterns of hospital patients colonised with CPE by including memory in our network models.
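The first-order aggregation described above (counting ward-to-ward transitions across all patients, then normalising each ward's outflow to one) can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `first_order_transitions` and the toy ward labels are hypothetical.

```python
from collections import defaultdict


def first_order_transitions(trajectories):
    """Count ward-to-ward moves over all pathways and row-normalise them."""
    counts = defaultdict(lambda: defaultdict(int))
    for path in trajectories:
        # Each consecutive pair of wards is one observed transition.
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, outflow in counts.items():
        total = sum(outflow.values())
        probs[src] = {dst: n / total for dst, n in outflow.items()}
    return probs


# Hypothetical toy pathways; ward names are illustrative only.
toy_trajectories = [
    ["A", "C", "D"],
    ["A", "C", "D"],
    ["B", "C", "E"],
]
P1 = first_order_transitions(toy_trajectories)
print(P1["C"])  # outflow from the shared ward C, split between D and E
```

In this toy example the shared ward C sends a walker to D or E regardless of where it came from, which is exactly the memoryless behaviour the first-order model assumes.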

Figure (1) Illustration of transitive dependence encoded into a memory network. (A) Two sets of typical patient pathways, largely independent, but passing through the same ward as an intermediate point in their pathways. (B) First-order representation of A without any memory. (C) Memory network representation of A, whereby a physical node network maps to state nodes, which encode transitive dependence of the patient pathways and constrain a random walker's movement.

To model these effects, we use memory networks, which encode the memory of individual trajectories into higher-order transitive relationships, and which have successfully been used to investigate transitive dependence in pathway data [25]. To provide some intuition behind memory networks, consider a simple example of a small hospital network with five wards where patients can follow one of two possible routes between the wards, and the two routes share one common node (Figure 1A). A first-order (memoryless) network model assuming full transitivity (Figure 1B) would wrongly suggest that a patient starting on one route could, via the shared ward, transition to the final ward of the other route with some probability, when in fact only patients starting on that other route can reach it. In a memory network (Figure 1C) these transitive dependencies are captured by abstracting away from a network of physical nodes to a higher-order network of state nodes representing the possible dynamical states of the system (i.e., the sequence of hospital wards visited up to a given memory) [26]. Specifically, in a memory network each state represents a pathway of length $k-1$, whereby higher-order state networks increase the length $k$ of the pathways captured by each state. This state network can be thought of as an additional layer of information still bound to the physical network, since each state node is assigned to a physical node. The state network thus acts to constrain how a random walker transitions between physical nodes. These higher-order network abstractions lend themselves to learning tasks that can pinpoint key properties underlying the dynamical process. In the case of HAIs, this can offer insight into more accurate patterns in the movement of infected patients otherwise lost in a network model that assumes full transitivity.

Below we present the analysis of the pathways of patients confirmed to be colonised with CPE. We begin by presenting our data and a description of the hospital network. We then present evidence for memory within patient pathways by contrasting models constructed with and without memory. Finally, we construct a lumped state memory network, which captures transitive dependence and removes redundancy. We carry out multiscale community detection on this network, and present the resultant communities, highlighting specific wards and specialities that are important across different regions of the network.

Results

Data

Our analysis is based on anonymised electronic health records of patients from a large 1000-bed Trust of London teaching hospitals. Specifically, we used ward-level movement patterns of 967 patients who tested positive for CPE over a period of two years between 2018 and 2020. We focused on the subset of 526 patients who moved between at least two wards during their hospital stay, for a total of 1958 transitions between 96 hospital wards.

Formally, the hospital Trust is structured around 17 specialities and 19 buildings, the latter belonging to three hospital sites (Figure 2). Hospital site 3 acts as a tertiary site with only speciality wards. Whilst sites and buildings are constrained by geographical factors, specialities are defined by medical procedures and thus may overlap across sites and buildings. In fact, five specialities span all three hospital sites (Critical Care, Elderly Care, Medicine, Private, and Surgery). Geographical structures constrain patient movement to some extent: patients with certain co-morbidities and therapeutic requirements are likely to be constrained to a single or several specialities supporting those needs, whereas other patients can move within buildings, or between wards placed closely for logistics and ease of transfer.

Figure (2) Sankey diagram of the Trust structure traversed by CPE patients, broken down by hospital site and building to wards, and by speciality to wards. [Diagram columns: Hospital, Building, Ward, Speciality.]

From Patient Pathways To Network Models

We consider the trajectories of $p$ patients. Each patient pathway is a trajectory $T_\alpha$, and the set of $\alpha = 1, \ldots, p$ trajectories is $\mathcal{T} = \{T_1, T_2, T_3, \ldots, T_p\}$. Each $T_\alpha$ consists of a time-ordered set representing discrete movements between nodes,

$T_\alpha = \{\nu_i \to \nu_j \to \ldots \to \nu_k\}$, (1)

where each node refers to one of $N$ hospital wards $\mathcal{N} = \{\nu_1, \nu_2, \nu_3, \ldots, \nu_N\}$. Since these nodes represent physical locations, we will refer to them as physical nodes to avoid confusion with state nodes, which we introduce next.

In order to understand the aggregate dynamics of all patients, whilst preserving memory effects in $\mathcal{T}$, we represent the trajectories as a memory network as proposed by Rosvall et al. [27]. This way, we maintain information about physical nodes $\mathcal{N}$ whilst instilling transitive dependence in the connectivity patterns of an underlying state network, $\mathcal{M}_k = (\mathcal{E}_k, \mathcal{S}_k)$. Here $\mathcal{E}_k$ is the set of edges that link the set of state nodes $\mathcal{S}_k$ that capture higher-order memory of order $k$ [26].

A memory network of order $k = 1$, $\mathcal{M}_1$, represents a system with zero memory, where the movement of a random walker only depends on its current location. In this special case, the state network $\mathcal{M}_1$ is equivalent to an aggregated physical network $G = (\mathcal{E}, \mathcal{N})$, and the set of states directly maps to the set of physical nodes, i.e., $\mathcal{S}_1 = \mathcal{N}$. The edge weights $w_{ij}$ making up the set $\mathcal{E}_1$ in $\mathcal{M}_1$ represent the frequency of transitions between physical nodes $\nu_i$ and $\nu_j$ across the set of trajectories $\mathcal{T}$. Given $w_{ij}$, we can write the transition probability matrix $P_1$ for $\mathcal{M}_1$ as

$p_{ij} = P(i \to j) = \frac{w_{ij}}{\sum_{\ell} w_{i\ell}}$. (2)

In memory networks of higher order, where $k > 1$, the state nodes represent pathways of length $k-1$, and are no longer equivalent to the physical nodes, $\mathcal{S}_k \neq \mathcal{N}$. This representation allows us to introduce the memory dependence in $\mathcal{T}$, capturing multi-step patterns of flow via the state nodes of the network [11].

In particular, for the second-order memory network $\mathcal{M}_2$, a state node represents a directed pathway of length one, $s_j = \overrightarrow{ij}$. For two state nodes $s_j = \overrightarrow{ij}$ and $s_\ell = \overrightarrow{j\ell}$ to be connected, a path of length two, $(\nu_i \to \nu_j \to \nu_\ell)$, must occur in the set of trajectories $\mathcal{T}$. Similarly for higher-order models, edges between state nodes are weighted $w_{s_j s_\ell}$ and capture the number of times that a transition between state nodes $s_j$ and $s_\ell$ was observed in $\mathcal{T}$. Transition probabilities $P_k$ of $\mathcal{M}_k$ for any order can be derived analogously to Equation 2 by altering the state node set $\mathcal{S}$ to represent pathways of length $k-1$, so

$p_{s_i s_j} = P(s_i \to s_j) = \frac{w_{s_i s_j}}{\sum_{\ell} w_{s_i s_\ell}}$. (3)

Each state node can be mapped to a physical node (Figure 1A), using an $|\mathcal{S}_k| \times N$ indicator matrix $D$, the elements of which, $D_{s\nu} \in \{0, 1\}$, indicate the final physical node of a pathway $s$.

We first constructed a first-order memory network $\mathcal{M}_1$ that contains 96 state nodes with a one-to-one mapping to the 96 wards (physical nodes). $\mathcal{M}_1$ consists of four weakly connected components, one of which contains the majority of state nodes (87 out of 96) (Additional file 1). We next constructed a second-order memory network $\mathcal{M}_2$ that contains 384 state nodes in 18 weakly connected components. Similarly, $\mathcal{M}_2$ has a single weakly connected component that contains the majority of state nodes (329 of 384). Structurally, $\mathcal{M}_1$ has a higher connectivity, with a clustering coefficient of 0.287 and a diameter of 6, whereas $\mathcal{M}_2$ is sparser, with a clustering coefficient of 0.003 and a larger diameter of 31, resembling a series of connected line graphs (Additional file 1).

Patient trajectories break first-order dynamics
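The comparisons in this section rest on the state networks defined in Equations 1 to 3. As a hedged illustration of that construction, the sketch below builds the transition probabilities of an order-$k$ state network from a list of trajectories. It assumes that a state is stored as a tuple of $k$ consecutive wards (a pathway of length $k-1$) and is assigned to its final ward, mirroring the role of the indicator matrix $D$; the function name `state_network` and the toy pathways are hypothetical, not the authors' code.

```python
from collections import defaultdict


def state_network(trajectories, k):
    """Return (transition probabilities, state -> physical ward) for order k.

    A state is a tuple of k consecutive wards, i.e. a pathway of length k - 1.
    Pathways too short to yield two consecutive states contribute no edges.
    """
    weights = defaultdict(lambda: defaultdict(int))
    for path in trajectories:
        states = [tuple(path[i:i + k]) for i in range(len(path) - k + 1)]
        for s, s_next in zip(states, states[1:]):
            weights[s][s_next] += 1
    # Row-normalise the edge weights, as in Equations 2 and 3.
    probs = {
        s: {t: w / sum(out.values()) for t, w in out.items()}
        for s, out in weights.items()
    }
    all_states = set(weights) | {t for out in weights.values() for t in out}
    physical = {s: s[-1] for s in all_states}  # final ward of each pathway
    return probs, physical


# Hypothetical toy pathways sharing the intermediate ward C.
toy = [["A", "C", "D"], ["B", "C", "E"]]
P1, phys1 = state_network(toy, k=1)
P2, phys2 = state_network(toy, k=2)
print(P1[("C",)])      # first order: C leads to D or E
print(P2[("A", "C")])  # second order: arriving from A, C leads only to D
```

With $k = 1$ the states coincide with the wards; with $k = 2$ the shared ward C is split into two state nodes, so the two routes no longer mix.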

Using random walks to reveal and probe the structure of networks has long been a foundational tool in network science [28]. A random walk is a stochastic process which consists of a succession of random steps with no memory of its past locations; however, in a system where transitive dependence plays an important role, a purely random walk becomes inaccurate and potentially misleading. Memory networks of higher order with $k > 1$ constrain the walker to transitions between state nodes. In the absence of memory effects, a random walker would be similarly constrained when moving from a first-order network, $\mathcal{M}_1$, to a second-order memory network $\mathcal{M}_2$ [27]. However, pathways exhibiting transitive dependence will constrain a random walker comparatively more in the second-order memory network. Here, we use the entropy rate of the random walk to measure the uncertainty of moving between two state nodes [14]:

$H(S_{t+1} \mid S_t) = -\sum_{i,j} \pi(i)\, p(i \to j) \log p(i \to j)$, (4)

where $\pi$ denotes the stationary distribution across $\mathcal{M}_k$, and $p(i \to j)$ are the transition probabilities.

We constructed the memory networks $\mathcal{M}_k$ for $k = 1, 2, 3, 4$ (the size of each $\mathcal{M}_k$ is detailed in Figure 3A). Computing the entropy for each $\mathcal{M}_k$, we find increasing restriction of the random walk (reduced entropy) for larger $k$ (Figure 3B). In particular, we observe a large decrease in entropy from 2.70 to 0.57 when we move from $k = 1$ to $k = 2$. Patient pathways with little to no memory effect would not exhibit any large change in entropy when moving from $\mathcal{M}_1$ to $\mathcal{M}_2$, and thus our results suggest that there exist patient pathways which break first-order Markovian transitive assumptions and highlight the importance of capturing memory.

Now we must determine the optimal order $k$ for a given analysis. For small data sets, it is difficult to statistically validate whether memory networks of higher order are relevant, given that the parameter space and complexity increase exponentially [29]. A common workaround is to use cross-validation, a model validation technique borrowed from machine learning [18].
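Before turning to cross-validation in detail, the entropy rate of Equation 4 can be made concrete with a short sketch. It is illustrative only: the logarithm base (2 here) is an assumption, since the base is not stated in the text, the stationary distribution is supplied by hand rather than computed, and the two toy chains are hypothetical.

```python
import math


def entropy_rate(P, pi, base=2.0):
    """Entropy rate -sum_i pi[i] * sum_j P[i][j] * log(P[i][j]).

    P maps each state to a dict of transition probabilities and pi is a
    stationary distribution over states. Zero-probability terms are skipped.
    """
    h = 0.0
    for i, row in P.items():
        for p in row.values():
            if p > 0.0:
                h -= pi.get(i, 0.0) * p * math.log(p, base)
    return h


# Hypothetical first-order chain: ward C forgets whether the walker came
# from A or B, so the next step (D or E) is a coin flip.
P_first = {
    "A": {"C": 1.0}, "B": {"C": 1.0},
    "C": {"D": 0.5, "E": 0.5},
    "D": {"A": 1.0}, "E": {"B": 1.0},
}
pi_first = {"A": 1 / 6, "B": 1 / 6, "C": 1 / 3, "D": 1 / 6, "E": 1 / 6}

# Hypothetical second-order chain for the same wards: every state node has
# a single successor, so the walk is fully determined by its memory.
P_second = {
    ("A", "C"): {("C", "D"): 1.0}, ("C", "D"): {("D", "A"): 1.0},
    ("D", "A"): {("A", "C"): 1.0}, ("B", "C"): {("C", "E"): 1.0},
    ("C", "E"): {("E", "B"): 1.0}, ("E", "B"): {("B", "C"): 1.0},
}
pi_second = {s: 1 / 6 for s in P_second}

h_first = entropy_rate(P_first, pi_first)
h_second = entropy_rate(P_second, pi_second)
print(h_first, h_second)
```

In this toy case the only uncertainty in the first-order chain sits at the shared ward, and it vanishes once the previous ward is remembered, which is the qualitative pattern behind the drop in entropy between $k = 1$ and $k = 2$.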
In cross-validation, data is partitioned and performance is determined as an average across partitions to reduce over-fitting and selection bias [30, 31]. To perform cross-validation in the framework of a memory network, we compute the rank order of wards using a training set of patient pathways and then compare it with the rank order of wards generated from visitation probabilities of a withheld partition of patient pathways. Similar to Rosvall et al. [27], we used a generalised PageRank for higher-order models where the visitation probabilities of state nodes were summed for each physical node (see methodology). The rank orders between train and test sets were compared with Kendall-Tau rank correlation [32] and the results were averaged over a 5-fold cross-validation split. We found that $\mathcal{M}_2$ was more predictive of the ranking of physical nodes than $\mathcal{M}_1$ (0.60 compared to 0.49) (Figure 3B). This increased performance in $\mathcal{M}_2$ again suggests that a patient's current and previous location both affect future movement, and that accounting for this memory effect yields more accurate approximations of patient movement.

Whilst further higher-order memory effects may exist, we were unable to detect any increased predictive power beyond $k = 2$ (Figure 3B). We note that this may be due to limitations of our data; as we increase the order $k$, we must discard additional patient pathways with fewer than $k$ transitions between wards. This is evident in Figure 3A, which shows a decreasing number of pathways as we increase $k$; and whilst the number of state nodes and edges initially increases from $k = 1$ to $k = 2$, as one may expect when increasing model complexity, due to the decreasing number of pathways we instead observe a decrease in the number of state nodes for $k > 2$. Herein, to retain enough patient pathways for reliable insights, we shall thus focus on the $k = 2$ memory network.

We then compared the PageRank of physical nodes (wards) between $\mathcal{M}_1$ and $\mathcal{M}_2$ (Figure 3C). Whilst we found the PageRank of wards in $\mathcal{M}_1$ and $\mathcal{M}_2$ to be correlated (0.81, p-value < …), Renal wards ranked highly in $\mathcal{M}_2$, indicating that CPE patients frequently visit these wards. Given that patients undergoing renal therapies are particularly at an increased risk for CPE acquisition within this hospital group [33], it is no surprise that we find these wards with a high PageRank in both $\mathcal{M}_1$ and $\mathcal{M}_2$. However, the higher ranking of these Renal wards in $\mathcal{M}_2$ highlights the importance of using a constrained state node network to understand the clinical movement of these patients. Conversely, Medicine 13 was the highest ranked ward in $\mathcal{M}_1$, but was found to have a relatively lower rank in $\mathcal{M}_2$. Medicine 13 is an acute medical admissions unit, and as such acts as the entry/re-entry point for many patients to the hospital, rather than a transition ward or a ward which offers care, and whilst it plays a starting role in many patient pathways, it is seldom observed at any other point in a patient's trajectory through the hospital.

Figure (3) A comparison of the effect of memory for models of order $k$. (A) Details of network size (no. state nodes, edges) and data (no. trajectories) for $k = 1, 2, 3, 4$. (B) The entropy of each higher-order memory network and the rank correlation between train and validation sets. (C) PageRank of wards for $\mathcal{M}_1$ and $\mathcal{M}_2$, dashed line indicating equality of PageRank. [Axes: (A, B) value against $k$; (C) $\mathcal{M}_2$ PageRank against $\mathcal{M}_1$ PageRank.]

Investigating memory effects with a discrete diffusive process

One way we can study the effect of memory is through the direct observation of its influence on a diffusion process starting at various points in the network [25]. Figure 4 A&B displays the evolution of a discrete-time diffusive process for $\mathcal{M}_1$ and $\mathcal{M}_2$, each encoded by their respective transition matrix $P_k$, when injecting an impulse at a single ward (Medicine 13). At time $t = 0$, the diffusive process is entirely contained within the state node(s) corresponding to Medicine 13 (for $k > 1$, where physical nodes can have several state nodes, we share the initial probability over states based on the frequency of the pathways in $\mathcal{T}$ that each state $s_j = \overrightarrow{ij}$ represents). For times $t > 0$, the process evolves to time $t$ through powers of the transition matrix, $P_k^t$.

After a single discrete step $t = 1$ we find there is little effect of memory, with the total number of wards reachable being similar for $\mathcal{M}_1$ and $\mathcal{M}_2$ (12 wards vs 9 wards, respectively). However, as we extend the diffusive process to $t = 2$ and $t = 3$ we find that the number of reachable wards from Medicine 13 increases rapidly for $\mathcal{M}_1$ (36 wards at $t = 2$, then 71 wards at $t = 3$) whereas we do not see any change in $\mathcal{M}_2$ (9 wards at $t = 2$, and 9 wards at $t = 3$). In fact, for $\mathcal{M}_1$ a random walk initialised at Medicine 13 can reach 71 out of the 79 wards within the largest weakly connected component in $\mathcal{T}$ after just 3 steps. This level of transitivity is not present in $\mathcal{T}$, and its absence is directly observable by looking at the restriction of flow evident in $\mathcal{M}_2$ (Figure 4 A&B). This difference comes from patients not starting at Medicine 13, but passing through its neighbours, influencing the 2-step network transitivity. Interestingly, $\mathcal{M}_2$ constrained walkers such that no backtracking to Medicine 13 is possible over the first 3 discrete transitions, in contrast to $\mathcal{M}_1$, where backtracking to Medicine 13 is possible for $t > 1$. In fact, using $\mathcal{M}_1$ there is a relatively large probability to revisit Medicine 13 after 2 or 3 steps ($p_{med} = 0.18$ and $p_{med} = 0.$…). Since Medicine 13 is seldom revisited in the observed pathways, $\mathcal{M}_2$ better captures true patient flow.

Forward reachability is varyingly constrained by memory
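The reachable sets reported for Medicine 13, and computed for every starting ward in this section, follow from propagating an initial distribution through the transition probabilities and collecting the wards that end up with non-zero probability. The sketch below is a hedged illustration rather than the authors' implementation. It assumes a uniform initial distribution over the state nodes of the starting ward (the text instead weights states by pathway frequency), it counts wards holding probability after exactly `steps` steps (a cumulative variant would take the union over steps), and the function names and toy networks are hypothetical.

```python
def diffuse(P, start, steps):
    """Propagate the distribution `start` (state -> probability) for `steps` steps."""
    dist = dict(start)
    for _ in range(steps):
        nxt = {}
        for s, mass in dist.items():
            for t, p in P.get(s, {}).items():
                nxt[t] = nxt.get(t, 0.0) + mass * p
        dist = nxt
    return dist


def reachable_wards(P, physical, ward, steps):
    """Wards holding non-zero probability `steps` steps after starting at `ward`."""
    starts = [s for s, w in physical.items() if w == ward]
    if not starts:
        return set()
    # Assumption: uniform initial mass over the ward's state nodes.
    init = {s: 1.0 / len(starts) for s in starts}
    final = diffuse(P, init, steps)
    return {physical[s] for s, mass in final.items() if mass > 0.0}


# Hypothetical pathways X -> A -> C -> D and Y -> B -> C -> E.
P1 = {
    "X": {"A": 1.0}, "A": {"C": 1.0},
    "Y": {"B": 1.0}, "B": {"C": 1.0},
    "C": {"D": 0.5, "E": 0.5},
}
phys1 = {w: w for w in "XAYBCDE"}

P2 = {
    ("X", "A"): {("A", "C"): 1.0}, ("A", "C"): {("C", "D"): 1.0},
    ("Y", "B"): {("B", "C"): 1.0}, ("B", "C"): {("C", "E"): 1.0},
}
phys2 = {s: s[-1] for s in [("X", "A"), ("A", "C"), ("C", "D"),
                            ("Y", "B"), ("B", "C"), ("C", "E")]}

print(reachable_wards(P1, phys1, "A", 2))  # first order reaches both D and E
print(reachable_wards(P2, phys2, "A", 2))  # second order reaches only D
```

In the toy case both models agree after one step (ward C) and diverge after two, a small-scale analogue of the divergence described for Medicine 13.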

We expanded the above framework to examine reachability across the entire network by performing the analysis for every possible starting node. For each ward, we compute the set of reachable wards after $t$ time-steps, and in Figure 4C we display the median size of reachable sets for all wards under $\mathcal{M}_1$ and $\mathcal{M}_2$. Similar to the analysis of Medicine 13 in Figure 4 A&B, we find that the median size of reachable sets is relatively similar between $\mathcal{M}_1$ and $\mathcal{M}_2$ at $t = 1$. However, as $t$ increases we again observe divergence in the reachable set sizes due to the significantly larger set of reachable wards in the first-order model $\mathcal{M}_1$. Indeed, after 3 time-steps only 5 wards are reachable on average under $\mathcal{M}_2$ as compared to the 79 reachable wards under $\mathcal{M}_1$. Hence $\mathcal{M}_1$ is inflating transitivity between wards and distorting the set of reachable wards for a patient through inherent ignorance of prior ward visits. We also observe that the variance of the reachable set of wards for $\mathcal{M}_2$ increases for $t = 2, 3$, suggesting that the importance of memory is different depending on the ward at which the diffusive process was initialised.

Figure (4) Memory effects on network pathways. (A&B) A discrete random walk starting at Medicine 13 in the state networks of $\mathcal{M}_1$ and $\mathcal{M}_2$, modelled as a discrete-time diffusion over the transition matrices $P_1$ and $P_2$ respectively. After a single discrete time step $t = 1$ the reachable number of wards is similar between $\mathcal{M}_1$ and $\mathcal{M}_2$; however, they quickly diverge as the diffusion evolves over time and the transitive effects increase in $\mathcal{M}_1$. (C&D) Median size of the reachable set of wards for $\mathcal{M}_1$ and $\mathcal{M}_2$: (C) overall reachability after $t$ discrete steps, and (D) reachability after $t$ discrete steps broken down by speciality (for wards in the largest connected component). [Axes in C and D: reachable set size against discrete time steps.]

To study this, we next break down wards by speciality and examine the importance of memory on the median set size of reachable wards. Figure 4D summarises the size of reachable sets averaged across wards within the same speciality. We notice that specialities which are known to be well visited by CPE patients in this hospital setting (e.g., Critical Care, Renal) exhibit a comparatively larger reachability set size when compared to the aggregated view in Figure 4C. In contrast, specialities such as Neurology or Cancer, which are less commonly visited by CPE patients, exhibit a relatively lower reachability. These different reachabilities between specialities could be the consequence of two mechanisms: (1) the different roles specialities play within the network and their transitivity by CPE patient trajectories, and (2) that memory effects may vary in different areas of the network, i.e. the extent to which a previous ward determines a patient's next move. Hence, it may be optimal to construct a ‘hybrid’ of $\mathcal{M}_1$ and $\mathcal{M}_2$ which incorporates many of the desirable memory effects in $\mathcal{M}_2$, but simplifies parts of the model where greater transitivity is in fact present.

Reducing complexity using state node lumping

Given a large set of trajectories, the problem arises that state node networks $\mathcal{M}_k$ can become increasingly large and often duplicate or contain redundant information. In the case of patient trajectories, not all hospital pathways may exhibit memory effects in equal measure. Variable-length Markov models, pioneered by Rissanen [34], alleviate some of these issues by introducing a ‘lumping’ step in which ‘redundant’ states are merged, thus enabling models to capture variable lengths of memory and remove model redundancy [35, 36]. Recalling that in memory networks state nodes are assigned to physical nodes, we will often find several state nodes that connect the same physical nodes just via different edges. There is no need for this repetition and therefore here we focus on lumping state nodes within the same physical node to form so-called ‘meta-state nodes’ or ‘lumped nodes’, which also benefit from preserving the physical network structure [25]. For each lumped node, we reassemble all connections between two state nodes such that weighting and connectivity are preserved [26]. In effect, ‘lumping’ nodes retains relevant and distinct patterns of transitive dependence in the original pathways; however, for our purposes it also serves to ‘de-sparsify’ $\mathcal{M}_2$, improving its practicality and making it useful for subsequent learning tasks that assume greater connectivity.

In our approach, we lump together state nodes based on the similarity of visitation probabilities computed from a discrete diffusive process encoded in the state transition matrix $P_k$ over $t$ steps. Existing node lumping methodologies use a 1-step random walk to identify state nodes that have similar connectivity within the network [26, 37]. Here, however, we extend this approach to $t$ steps to identify similarity across a greater network locality.
Using an agglomerative clustering method on the discrete diffusive process, we can then identify state nodes with similar connectivity, and if both are members of the same physical node they can be lumped together [38] (for a detailed explanation see methods).

To what extent should we lump state nodes together? At one extreme, we have the state node network $\mathcal{M}_k$ without any lumping and at the other extreme we have the physical node network where every state node has been lumped together within its respective physical node. We want to identify an optimal lumping, comfortably between the two extremes, that retains transitive dependence but removes redundant or duplicated information. The resulting lumped network is denoted $\hat{\mathcal{M}}_k$. In order to quantitatively determine the optimal lumping, we used ‘ground-truth’ community structures such as buildings, specialities, and hospital sites and compared these annotations with the results of community detection on the lumped network $\hat{\mathcal{M}}_k$. Whilst these structures do not fully constrain patient movement and therefore cannot provide an exact ground truth, there does exist a correlation with patient movement. We hypothesised that the optimal lumping would be found at the elbow of a fitness curve generated from the ability to detect known hospital structures in community structures, thus providing a trade-off between model accuracy and simplicity. Accordingly, we found that a lumping rate of $r = 0.$…
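The idea of lumping state nodes with similar $t$-step diffusion profiles can be sketched as follows. This is a simplified stand-in, not the method used in the paper: it groups state nodes of the same ward by single-linkage clustering cut at a fixed distance (equivalently, connected components of a threshold graph), using the total variation distance between $t$-step visitation distributions. The `threshold` parameter is hypothetical and is not the lumping rate $r$; the function names and toy network are also illustrative assumptions.

```python
from itertools import combinations


def t_step_profile(P, state, steps):
    """Visitation distribution over state nodes `steps` steps after `state`."""
    dist = {state: 1.0}
    for _ in range(steps):
        nxt = {}
        for s, mass in dist.items():
            for t, p in P.get(s, {}).items():
                nxt[t] = nxt.get(t, 0.0) + mass * p
        dist = nxt
    return dist


def total_variation(a, b):
    """Total variation distance between two sparse distributions."""
    keys = set(a) | set(b)
    return 0.5 * sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)


def lump_states(P, physical, steps=2, threshold=0.1):
    """Group state nodes of the same ward whose profiles differ by <= threshold."""
    parent = {s: s for s in physical}

    def find(s):
        # Union-find root lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    profiles = {s: t_step_profile(P, s, steps) for s in physical}
    for a, b in combinations(sorted(physical), 2):
        same_ward = physical[a] == physical[b]
        if same_ward and total_variation(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for s in physical:
        groups.setdefault(find(s), set()).add(s)
    return list(groups.values())


# Hypothetical state network: arrivals at C from A and B behave identically,
# whereas arrivals from X continue to a different ward.
P2 = {
    ("A", "C"): {("C", "D"): 1.0},
    ("B", "C"): {("C", "D"): 1.0},
    ("X", "C"): {("C", "E"): 1.0},
}
phys2 = {s: s[-1] for s in [("A", "C"), ("B", "C"), ("X", "C"),
                            ("C", "D"), ("C", "E")]}
lumped = lump_states(P2, phys2, steps=1, threshold=0.1)
print(lumped)
```

Here the two redundant state nodes of ward C are merged while the distinct one is kept, so the memory that matters is retained and the duplication is removed.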

Figure (5) From first-order network to second-order network and everything between. The first-order network $\mathcal{M}_1$ (left), the lumped second-order network $\hat{\mathcal{M}}_2$ (middle), and the second-order state node network $\mathcal{M}_2$ (right), ordered by scale of model complexity.

The lumped network $\hat{\mathcal{M}}_2$ contains 171 state nodes across 7 weakly connected components. Similar to the state node networks $\mathcal{M}_1$ and $\mathcal{M}_2$, we found a large weakly connected component that contained the majority of state nodes (156 out of 171) (Figure 5). Aside from visually appearing to exist in a state between $\mathcal{M}_1$ and $\mathcal{M}_2$, both its clustering coefficient (0.054) and network diameter (11) sat comfortably between those of $\mathcal{M}_1$ and $\mathcal{M}_2$, serving to validate its balance of complexity, connectivity, and higher-order dependencies. Note that unlike $\mathcal{M}_2$, the lumped state network $\hat{\mathcal{M}}_2$ no longer resembles a series of line graphs, and hence provides a more practical structure over which to apply community detection.

Community detection reveals overlapping clusters of wards common to distinct pathways

By constraining a walker's movement within the connectivity patterns of $\mathcal{M}_k$, for $k > 1$, we can identify communities within $\mathcal{M}_k$ that conserve flow from a dynamical perspective. Given that $\mathcal{M}_k$ is composed of state nodes, the memory-dependent structure $C$ will provide network partitions that shed light on community structure. Here we use Markov Stability (MS), a quasi-hierarchical community detection algorithm that identifies regions within a network in which a diffusive process becomes transiently constrained [39]. MS exploits diffusion dynamics over an underlying graph structure to reveal multi-scale community organisation and its stability across time scales (see methods for a more detailed introduction).

The quasi-hierarchical community structure of the wards

Continuing with the lumped state network ˆ M , weapply MS and in Figure 6 we show an apparent hierar-chy of state node assignments to community partitionsacross Markov time t . We selected three points acrossMarkov time ( t , t , t ) that exhibited robust commu-nity partitions(Additional ﬁle 5). At longer time scalesMS reveals coarser community partitions which showsigniﬁcant correspondence to hospital sites (Figure 6).Speciﬁcally, at t each cluster in the 3-way partitionstrongly corresponds to one of the three hospital sites.If we extend to even longer t we identify a 2-way parti-tion where two hospitals are grouped almost exclusivelyinto a single community (Additional ﬁle 7). Notably,the hospital with wards grouped separately is the Ter-tiary site within the hospital trust which consists ofspeciality wards and appears to share fewer patientswith the other two hospitals.Moving towards shorter t within the MS analysis,which are expected to identify more granular structuresof patient ﬂow, we identity sub-structures largely con-tained within hospital sites, which overlap to a lesserextent between hospital sites. In some cases, theseconfer to buildings (we ﬁnd 10 buildings that are over-represented in clusters at t ), in other cases these conferto specialities (we ﬁnd 7 specialities over-represented inclusters at t ). Focusing initially on speciality, we ﬁndthree specialities (Haematology, Cardiology, and Re-nal) that are over-represented within separate commu-nities suggesting they are have a high degree of withinspeciality patient movement (Figure 6A). However, aswe increase t to reveal coarser partitions we see themore granular communities combine, bringing togetherpreviously distinct specialities such as Haematology orRenal into coarser partitions with other specialities,ighlighting the zooming aﬀect of MS as we changethe t at which communities are observed. 
However, it is clear that the community structure is not entirely defined by specialities: the physical constraints imposed by buildings, hospitals, and common movement patterns also play a significant role in shaping the observed communities (Figure 6B). Given that the majority of patients will move between specialities at some point during their journey through the hospital, it is expected that communities would not correspond exactly to ward specialities. A number of specialities service several different groups of patients, such as Medicine, a general class of ward that often takes admissions, or Critical Care, which can receive patients from any given ward if they deteriorate fast enough. Notably, we find that wards in both the Medicine and Critical Care specialities can be found within 10 different communities at the granular scale; additionally, Surgery, another department that services multiple other wards, can be found in 9 different communities.

Figure 6: Hierarchical breakdown of Markov Stability communities for three chosen points in Markov time (optimal partitions in Markov time chosen for their robustness after a more thorough Markov Stability analysis, see Additional file 4) and their relation to hospital sites (Sites 1 to 3) for coarse partitions, and, at granular partitions, to (A) specialities and (B) buildings.

Overlapping community assignments

Community detection generally focuses on finding disjoint communities; however, multiple community membership is a well-observed phenomenon, whereby a given node may have multiple functions that it shares with different groups of nodes [40]. Given that we are essentially clustering wards based on the movement patterns of patients, it is likely that different cohorts of CPE patients (e.g. with different comorbidities) have overlapping pathways. For instance, different cohorts of patients still require a set of common services and hence visit an overlapping set of wards (e.g. for admission, surgery, critical care, or renal dialysis). This phenomenon is well captured by memory networks: standard methods of community detection applied across the state network are able to reveal overlapping communities of nodes on the physical network. Additionally, the notion of granularity introduced by MS adds an interesting dimension to this problem, whereby the degree to which wards overlap communities can depend on the point in Markov time. We can thus identify hospital wards which persistently overlap multiple communities across both granular and coarse time scales. These wards are of particular interest when developing Infection Prevention and Control strategies, as they can play the role of network bridges and potential transmission hotspots.

At the most granular time scales, we find 48 wards with multiple community assignments (Additional file 8). With increasing Markov time the total number of overlapping wards decreases; however, several wards persistently overlap communities. We find 4 Renal wards and a single Elderly Care ward which have membership within each community of the 2-way coarse partition. Although their overlap disappears in the coarsest 2-way partition after $t > 12$, Critical Care, Medicine, and Surgery wards, as well as a single Elderly Care ward, also overlapped between communities. Since the coarsest partitions strongly corresponded to non-specialist and specialist hospital sites, it is likely that Critical Care and the Elderly Care wards still play a strong connective role between the two non-specialist hospital Sites 1 and 2.

Identifying the most central wards

In the previous section we identified nodes that were assigned to multiple communities, highlighting their critical role in the pathways of multiple cohorts of patients with differing patterns; prior to that, we examined the PageRank of wards to identify their importance in $M_1$ and $M_2$.

For a more complete examination of ward importance, and to investigate $\hat{M}$, we use Multiscale Centrality (MSC), which enables us to identify nodes that are central at different scales within the network [41]. Following the same approach used to compute centrality of the physical nodes, we compute MSC for each state node and then sum the state node centralities within each physical node to generate a value of MSC for each ward.

Figure 7 shows the results of MSC computed for $\hat{M}$. We find several wards that are central at all scales, implying that they are both highly connected locally (short time scales) and important as global connectors or bridges (long time scales). Both Medicine 13 and Medicine 14 appear as central at all time scales; both are admission and readmission points into the hospital, where patients will first be identified as positive for CPE, and where they will return if readmitted. Additionally, we find that 4 Renal wards are central at all scales. Interestingly, we also find wards which vary considerably in their importance across time scales; Elderly Care 2 seems well connected locally, but at longer time scales it becomes comparatively less important.

Figure 7: Multiscale Centrality ranking of $\hat{M}$ across time (x-axis: time scale; y-axis: MSC rank). The annotated wards are those with largely different PageRanks in the comparison of $M_1$ and $M_2$: Medicine 14, Medicine 13, Renal 2, Elderly Care 3, Renal 1, Renal 4, Renal 3, Critical Care 4, Renal 5, Medicine 1, Medicine 8, Respiratory 1, Elderly Care 2, Medicine 21, Medicine 20. At short Markov time scales, some nodes will not yet have been assigned a Multiscale Centrality value and so will not yet be ranked.

Conclusions

Analysing a large set of patient pathways, we showed that the movement of hospitalised patients colonised with CPE displays substantial memory effects. This means that ward transitions depend on previously visited wards. Memory effects were evident from the difference between the node rankings of models of different order, as well as from the statistics of a diffusion process on the resulting network models. Notably, memory increased the probabilities of visiting wards known to be commonly visited by CPE patients (e.g. Renal) and decreased the probabilities of visiting wards less common amongst CPE patients (e.g. Paediatric). Memory also greatly affected local reachability; for example, the memoryless first-order model wrongly implied that almost any ward could be reached from any other ward within three discrete time steps. Our work thus showed that not accounting for pathway 'memory' can give a misleading picture of both the importance of hospital wards and how patients move throughout healthcare networks. These insights into the constraints on the movement of CPE patients can aid infection prevention and control in preventing transmission to uncolonised patients.

Models with memory have a substantially larger parameter space. We therefore simplified the memory model by constructing a hybrid 'lumped' memory network. The latter retains the effect of distinct memory present in the patient trajectories but removes redundant or duplicate information. In this context, we extended previous work on lumping in memory networks in two ways: firstly, by defining a state node feature vector, which allowed state nodes to be compared and lumped into meta nodes based on longer random walks; secondly, by proposing that lumping could be optimised using prior knowledge of known communities which partially constrain patient pathways.

The lumped memory network then formed the basis of our subsequent investigation, using community detection to reveal communities of movement within our healthcare network. 
To this end, we used prior knowledge, including the hospital structure or specialities with noisy signals, to optimise the rates of lumping based on Markov Stability. As a result, we can highlight pathway clusters with higher-order memory, and identify wards that occur across multiple pathway communities. In particular, we found that community overlaps identified wards that are visited by virtually all CPE patients (Renal wards), or wards visited commonly by the general patient population (Medicine, Surgery, and Critical Care wards). Notably, there may be some ward selection bias here, because the medical conditions of the patients who specifically attend the renal wards give them an increased risk of CPE carriage; although this was not studied in this analysis, connections by medical diagnoses could inform future work. The communities of CPE patient movement we identified divided the hospital sites quasi-hierarchically into sub-communities of wards that share patient flow. There was some correlation between community structures and known structures, such as hospital buildings or specialities; however, communities likely result from common pathways specific to certain cohorts of CPE patients within this hospital group.

Our study highlights the role of memory in patient pathways. Most current analyses of patient pathways assume memorylessness. Here, however, we showed that ignoring memory may wrongly identify potential hubs of disease transmission. This in turn would mislead efforts to prevent infection of the general patient population. Our lumped memory networks therefore provide a framework for future patient-pathway analyses to improve containment of CPE, and may also be applied to inform infection prevention and control of other HAIs.

Methods

Higher-order PageRank

PageRank is a measure of node importance or centrality within a network based on the incoming edges [42]. To obtain higher-order PageRank we follow the derivation presented by Rosvall et al. in [27]. PageRank essentially computes the visitation probabilities of nodes over a network, determined by the connectivity and the weighting of those connections. In the context of a memory network, one can simply derive PageRank over the underlying state network for a model of arbitrary order $k$, and project the visitation probabilities back onto the physical nodes.

Firstly, we define the probability of finding a random walker on a given state node $s_j$ at time $t + 1$ as

$$P(s_j; t+1) = \sum_{s_i} P(s_i; t)\, p(s_i \to s_j), \qquad (5)$$

where, as before, a state corresponds to a pathway of length $k$ and transition probabilities are encoded by the transition matrix $P$.

Now, for any order $k$ the higher-order generalisation of PageRank is simply the stationary solution to Equation 5:

$$\pi(s_j) = \sum_{i} \pi(s_i)\, p(s_i \to s_j). \qquad (6)$$

With $\pi(s_j)$ it is then trivial to recover the physical node PageRank by summing over the states of a physical node:

$$\pi(k) = \sum_{s_j \in k} \pi(s_j), \qquad (7)$$

where the sum runs over the state nodes $s_j$ that belong to physical node $k$.

State lumping on local connectivity

Given a large set of trajectories, the problem arises that state node networks $M_k$ can become very large and often contain redundancies. Not all pathways exhibit full transitive dependence, so it is often desirable to reduce the model complexity by lumping together redundant state nodes. Redundancy of state nodes can lead to over-fitting when a physical node contains a number of similar states. Hence, we focus on lumping state nodes within the same physical node, forming so-called 'meta state nodes', which also has the benefit of preserving the physical network structure [25]. For each lump, we reassemble all connections between two state nodes such that transition probabilities and connectivity are preserved [26]. In effect, 'lumping' state nodes together reduces the model complexity whilst retaining the transitive dependence of the original pathways.

In our approach, we lump together state nodes based on the similarity of their visitation probabilities of the physical nodes. To do this we use the $S \times S$ state transition matrix $P$ over $k$ steps and then sum the probabilities over the state nodes that compose each physical node. In the construction of $P$ we add weighted self-loops equivalent to a node's total outflow weight, $w_{s_i s_i} = \sum_{s_j} w_{s_i s_j}$, to derive $P'$ with Equation 1. This self-loop conserves local flow across $P'$, emphasising local connectivity when we subsequently determine distances across $X$.

We define the state node to physical node transition matrix $X$ as the visitation probabilities of each state node to each physical node over $k$ steps, $X = P^k D$, where $P$ is the state node transition matrix and $D$ is the $S \times N$ state node to physical node indicator matrix. Each entry $x_{ij}$ corresponds to the probability of transitioning from state node $i$ to physical node $j$ and thus provides a mapping from the higher-order state node network to the physical node network. 
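As a minimal sketch of the feature construction described above, the following Python (NumPy) snippet adds self-loops equal to each state node's outflow, row-normalises the weights, and forms the state-to-physical matrix $X$. The function name, the argument layout, the toy weights, and the reading of Equation 1 as a row normalisation are assumptions made for illustration; they are not the authors' implementation. The text writes $X = P^k D$, and the use of the self-loop matrix $P'$ in its place is likewise an assumed reading.

```python
import numpy as np

def state_to_physical_features(W, state_to_phys, n_phys, k=3):
    """Return X = P'^k D for an S x S weighted state-network adjacency W.

    state_to_phys[i] is the index of the physical node (ward) that contains
    state node i. All names here are illustrative, not taken from the paper.
    """
    W = np.asarray(W, dtype=float)
    S = W.shape[0]
    # Self-loop weight equal to each state node's total outflow.
    W_loop = W + np.diag(W.sum(axis=1))
    row_sums = W_loop.sum(axis=1)
    # Assumed form of Equation 1: row-normalise weights into probabilities.
    P = np.zeros_like(W_loop)
    has_out = row_sums > 0
    P[has_out] = W_loop[has_out] / row_sums[has_out][:, None]
    # State nodes without any outflow keep the walker in place (assumption).
    stuck = np.flatnonzero(~has_out)
    P[stuck, stuck] = 1.0
    # S x N indicator matrix D mapping state nodes to physical nodes.
    D = np.zeros((S, n_phys))
    D[np.arange(S), state_to_phys] = 1.0
    return np.linalg.matrix_power(P, k) @ D

# Toy example: state nodes 0 and 1 belong to ward 0, state node 2 to ward 1.
W = [[0, 1, 1],
     [0, 0, 2],
     [1, 0, 0]]
X = state_to_physical_features(W, [0, 0, 1], n_phys=2, k=3)
print(X)
```

Each row of `X` is a probability distribution over physical nodes, so rows belonging to the same ward can be compared directly when deciding which state nodes to lump.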
Here, we set $k = 3$ to incorporate a slightly larger range of local connectivity than previous works that use $k = 1$ [26, 37]. State nodes with similar local connectivity will exhibit similar probability distributions on the physical node network; we can therefore compute a similarity matrix between state nodes by computing the Wasserstein distance [43] between the rows of $X$, which measures the cost of moving from one probability distribution to another. The similarity matrix was subsequently clustered using an agglomerative clustering method to lump state nodes within each physical node [38].

In order to control the lumping of state nodes we employed a clustering rate $r$, which sets the number of final lumped state nodes that should be constructed for each physical node after completion of the lumping procedure. For example, let us consider a scenario where we have two physical nodes, one of which is composed of 10 state nodes and the other of 20 state nodes. If we set the lumping rate $r = 0.2$, then after lumping the first physical node would have 2 lumped nodes whereas the second physical node would have 4 lumped nodes. Increasing the lumping rate to r = 0. […] ($\nu$) and its constituent state nodes (grey circles within the red dashed circle) for different values of $k$. For the case $k = 1$ (see $M'$ in the middle of Figure 8) only the nearest neighbours of each state node are considered and as such s and s will be lumped together first. The next lumping of state nodes is unclear given that both s and s have 1-step neighbour states in a different physical node. However, as we increase $k$ we explore more of the local network and at $k = 2$, in this example, it becomes clear that s is more similar to s and s. Hence for the second lumping, s is merged with lumped meta node s, instead of s (see $M''$ in the middle of Figure 8).

Dynamical community detection

Dynamical community detection with Markovian assumptions can still be used to reveal structure in a memory network, simply by applying the same community detection algorithms to the higher-order network structure. $M_k$, for $k > 1$, acts to constrain a walker's movement over the physical nodes within its state network connectivity. Hence, if we look for regions across $M_k$ that conserve flow from a dynamical perspective, projecting the resultant communities back onto the physical nodes reveals overlapping communities constrained by the transitivity of the state network.

One example of such a dynamical approach to community detection is Markov Stability (MS) [39], which is the focus of this study. MS exploits diffusion dynamics over an underlying graph structure to reveal a multi-scale community organisation and has been shown to be effective in a variety of applications in which multiple scales are expected to exist, such as protein sub-structures [44] or social behaviours [45].

Figure 8: State lumping example for a single physical node ($\nu$). Two possible lumpings $M'$ and $M''$ are visualised here over the state nodes (grey nodes) with the physical node mapped over (dotted circles surrounding state nodes). Here each lump merges the two most similar state nodes based on feature vectors capturing local visitation probabilities of $k = 2$ network steps.

Given a partition $\mathcal{P}$ of nodes into $C$ non-overlapping communities with an $N \times C$ community indicator matrix $H_{\mathcal{P}}$, the time-dependent clustered autocovariance matrix in MS is given by

$$R(t, H_{\mathcal{P}}) = H_{\mathcal{P}}^T \left[ \Pi \exp(t[M - I]) - \pi \pi^T \right] H_{\mathcal{P}}, \qquad (8)$$

where the elements of the matrix $[R(t, H_{\mathcal{P}})]$ correspond to the probability of a random walker starting at node $i$ and ending up in community $c$ at Markov time $t$, minus the probability of that happening by chance.

For an optimal partition $\mathcal{P}$, in which flow is trapped more than one would expect at random over $t$, we would expect a comparatively large Markov stability, defined as

$$r(t, H_{\mathcal{P}}) = \mathrm{trace}\, R(t, H_{\mathcal{P}}). \qquad (9)$$

We aim to maximise $r(t, H_{\mathcal{P}})$ over the space of possible partitions $\mathcal{P}$ at a given Markov time $t$,

$$\mathcal{P}_{\max}(t) = \operatorname{argmax}_{\mathcal{P}}\, r(t, H_{\mathcal{P}}). \qquad (10)$$

Whilst the optimisation of Equation 10 is NP-hard, in practice heuristic algorithms have been developed which are computationally efficient. Here we use the Louvain algorithm, which has been demonstrated to offer robust solutions at reasonable cost [46].

Identifying partitions of interest over Markov-time
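Before turning to partition selection, it may help to see how the Markov stability of Equations 8 and 9 can be evaluated for a fixed partition across several Markov times. The sketch below is illustrative only: it uses the clustered autocovariance in the form $H^T[\Pi \exp(t[M - I]) - \pi\pi^T]H$ given in [39], a small undirected toy graph invented for the example, and hypothetical function names. It does not perform the Louvain optimisation of Equation 10, and the matrix exponential is evaluated with a truncated Poisson series that is only intended for moderate values of $t$.

```python
import numpy as np

def continuous_time_transition(M, t):
    """exp(t (M - I)) via the Poisson series sum_n e^{-t} t^n / n! * M^n."""
    n_terms = int(np.ceil(t + 12.0 * np.sqrt(t) + 30.0))
    term = np.exp(-t) * np.eye(M.shape[0])  # n = 0 term
    total = term.copy()
    for n in range(1, n_terms):
        term = (term @ M) * (t / n)
        total += term
    return total

def markov_stability(A, labels, t):
    """r(t, H) = trace(H^T [Pi exp(t (M - I)) - pi pi^T] H)."""
    A = np.asarray(A, dtype=float)
    M = A / A.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    # Stationary distribution: leading left eigenvector of M
    # (assumes a strongly connected graph with no dangling nodes).
    vals, vecs = np.linalg.eig(M.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi = pi / pi.sum()
    # N x C community indicator matrix H built from integer labels.
    labels = np.asarray(labels)
    communities = np.unique(labels)
    H = (labels[:, None] == communities[None, :]).astype(float)
    P_t = continuous_time_transition(M, t)
    R = H.T @ (np.diag(pi) @ P_t - np.outer(pi, pi)) @ H
    return float(np.trace(R))

# Toy graph (assumption): two triangles joined by a single edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

good = [0, 0, 0, 1, 1, 1]   # one community per triangle
mixed = [0, 1, 0, 1, 0, 1]  # communities that cut across the triangles
for t in (0.5, 1.0, 2.0):
    print(t, markov_stability(A, good, t), markov_stability(A, mixed, t))
```

A partition that traps the diffusion, such as one community per triangle, should score higher than a partition that cuts across the triangles, which is the property the optimisation over partitions exploits at each Markov time.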

Given a set of partitions that are optimal at each Markov time, we must still define which scales are representative or robust with respect to our system. In order to identify partitions of interest over time we use two robustness measures. Firstly, we look at the consistency of partitions at single points in time, and secondly, we look for stable partitions across time.

To assess the consistency between partitions at Markov time $t$, we compute an information-theoretic distance, the variation of information, between two alternative partitions $\mathcal{P}_i(t)$ and $\mathcal{P}_j(t)$, written $\mathcal{P}$ and $\mathcal{P}'$ for brevity:

$$VI(\mathcal{P}_i(t), \mathcal{P}_j(t)) = \frac{2\Omega(\mathcal{P}, \mathcal{P}') - \Omega(\mathcal{P}) - \Omega(\mathcal{P}')}{\log(n)}, \qquad (11)$$

where $\Omega(\mathcal{P}) = -\sum_{C} p_C \log p_C$ is the Shannon entropy, $p_C$ being the relative frequency of finding a node in community $C$ in partition $\mathcal{P}$.

Then, to quantify consistency at Markov time $t$, we compute the average variation of information over all $l$ solutions:

$$\langle V(t) \rangle = \frac{1}{l(l-1)} \sum_{i \neq j} VI(\mathcal{P}_i(t), \mathcal{P}_j(t)). \qquad (12)$$

When the optimisations return near-identical partitions, $\langle V(t) \rangle$ will be small, which indicates robustness of the partition at $t$. Hence over $t$ we search for partitions with low $\langle V(t) \rangle$.

Relevant partitions should also remain consistent across regions of Markov time. Such persistence is indicated both by a plateau in the number of communities over $t$ and by a low value or plateau of the cross-time variation of information:

$$VI(t, t') = VI(\hat{\mathcal{P}}(t), \hat{\mathcal{P}}(t')). \qquad (13)$$

Multi-scale Centralities

For the identification of central nodes we use Multiscale Centrality, which enables us to identify nodes that are central at different scales within the network [41]. Multiscale Centrality leverages the presence of 'overshooting' peaks that appear in diffusion processes on graphs. For a more detailed description of overshooting peaks, see [47]. A central node is defined as a node $i$ that breaks the triangle inequality for a pair of nodes $j, k$,

$$\Delta_{ij,\tau} := t^*_{ij,\tau} + t^*_{ik,\tau} - t^*_{jk,\tau} \leq 0,$$

where $t^*_{ij,\tau}$ is the Markov time at which an overshooting peak appears at node $j$ given the diffusive process of an initial delta function at node $i$ which is allowed to diffuse up to Markov time $\tau$.

The diffusion process underlying Multiscale Centrality acts as a scaling factor that allows us to identify nodes that are central at different scales of the network structure. For example, some nodes may be locally central (with high degree) or globally central (high closeness). Thus we produce a ranking of nodes as a function of the Markov time $\tau$ of the diffusion process. For further details on this methodology, see [41].

For each state node we can compute the Multiscale Centrality. In an identical manner to higher-order PageRank (see Section Higher-order PageRank), we can then compute a physical node centrality by summing the Multiscale Centrality over the constituent state nodes.

Abbreviations

AMR: Antimicrobial resistance; HAI: Healthcare-associated infection; IPC: Infection Prevention and Control; CPE: Carbapenemase-producing Enterobacteriaceae; MS: Markov Stability; MSC: Multiscale Centrality.

Data collection and ethics

Patient pathway data were collected from the central business intelligence system and fully pseudonymised before analysis, in accordance with ethics 15-LO-0746.

Availability of data and materials

The datasets generated and analysed during the current study are not publicly available, to protect the anonymity of included hospital patients. The repository for Multiscale Centrality can be found at https://github.com/barahona-research-group/MultiscaleCentrality .

Competing interests

The authors declare that they have no competing interests.

Funding

AM was supported in part by a scholarship from the Medical Research Foundation National PhD Training Programme in Antimicrobial Resistance Research (MRF-145-0004-TPG-AVISO), as well as by the National Institute for Health Research Academy. FD is supported by the Medical Research Council Clinical Academic Research Partnership scheme. Professor AH is a National Institute for Health Research Senior Investigator. AH is also partly funded by the National Institute for Health Research Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Infections in partnership with Public Health England, in collaboration with Imperial Healthcare Partners, University of Cambridge and University of Warwick. AM, RP, and MB also acknowledge funding from EPSRC grant EP/N014529/1 to MB, supporting the EPSRC Centre for Mathematics of Precision Healthcare.

The views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the National Institute for Health Research, the Department of Health and Social Care or Public Health England.

Authors’ contributions

AM: performed empirical analysis of data, contributed to methodological developments, and to the writing of the paper; RP: performed empirical analysis (MSC), contributed to methodological development of state lumping, and to the writing and revision of the paper; AW: contributed to interpretation of results and revision of paper; FD: contributed to interpretation of data and interpretation of results; SM: provided study data and interpretation of data; AH: co-supervised the project and contributed to interpretation of results; MB: provided main project supervision and revision of paper. All authors have read and approved the final paper.

Acknowledgements

We wish to thank Eleonora Dyakova for help with accessing data. We would also like to thank Frankie Bolt and Juliet Allibone for supporting the final publication process.

References

[1] F. Prestinaci, P. Pezzotti, and A. Pantosti, “Antimicrobial resistance: a global multifaceted phenomenon,” Pathogens and global health, vol. 109, no. 7, pp. 309–318, 2015.
[2] Interagency Coordination Group on Antimicrobial Resistance, “No time to Wait: Securing the future from drug-resistant infections,” tech. rep., World Health Organisation, Apr. 2019.
[3] J. O’Neill, “Tackling drug-resistant infections globally: final report and recommendations,” tech. rep., Review on Antimicrobial Resistance, 05 2016.
[4] M. J. Struelens, “The epidemiology of antimicrobial resistance in hospital acquired infections: problems and possible solutions,” BMJ, vol. 317, no. 7159, pp. 652–654, 1998.
[5] R. Pastor-Satorras and A. Vespignani, “Epidemic dynamics and endemic states in complex networks,” Phys. Rev. E, vol. 63, p. 066117, May 2001.
[6] World Health Organization et al., “Prevention of hospital-acquired infections: a practical guide,” tech. rep., Geneva, Switzerland: World Health Organization, 2002.
[7] R. A. Bonomo, E. M. Burd, J. Conly, B. M. Limbago, L. Poirel, J. A. Segre, and L. F. Westblade, “Carbapenemase-producing organisms: a global scourge,” Clinical Infectious Diseases, vol. 66, no. 8, pp. 1290–1297, 2018.
[8] L. K. Logan and R. A. Weinstein, “The epidemiology of carbapenem-resistant enterobacteriaceae: the impact and evolution of a global menace,” The Journal of infectious diseases, vol. 215, no. suppl 1, pp. S28–S36, 2017.
[9] T. Donker, J. Wallinga, R. Slack, and H. Grundmann, “Hospital networks and the dispersal of hospital-acquired pathogens by patient transfer,” PLOS ONE, vol. 7, pp. 1–8, 04 2012.
[10] D. M. Bean, C. Stringer, N. Beeknoo, J. Teo, and R. J. B. Dobson, “Network analysis of patient flow in two uk acute care hospitals identifies key sub-networks for a&e performance,” PLOS ONE, vol. 12, pp. 1–16, 10 2017.
[11] V. Salnikov, M. T. Schaub, and R. Lambiotte, “Using higher-order markov models to reveal flow-based communities in networks,” Scientific reports, vol. 6, p. 23194, 2016.
[12] R. M. May and A. L. Lloyd, “Infection dynamics on scale-free networks,” Phys. Rev. E, vol. 64, p. 066112, Nov 2001.
[13] R. Pastor-Satorras and A. Vespignani, “Epidemic spreading in scale-free networks,” Phys. Rev. Lett., vol. 86, pp. 3200–3203, Apr 2001.
[14] C. E. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948.
[15] C. Song, Z. Qu, N. Blumm, and A.-L. Barabási, “Limits of predictability in human mobility,” Science, vol. 327, no. 5968, pp. 1018–1021, 2010.
[16] P. Kareiva and N. Shigesada, “Analyzing insect movement as a correlated random walk,” Oecologia, vol. 56, no. 2-3, pp. 234–238, 1983.
[17] F. Chierichetti, R. Kumar, P. Raghavan, and T. Sarlos, “Are web users really markovian?,” in Proceedings of the 21st International Conference on World Wide Web, WWW ’12, (New York, NY, USA), pp. 609–618, Association for Computing Machinery, 2012.
[18] P. Singer, D. Helic, B. Taraghi, and M. Strohmaier, “Detecting memory and structure in human navigation patterns using markov chain models of varying order,” PLOS ONE, vol. 9, pp. 1–21, 07 2014.
[19] M. C. Gonzalez, C. A. Hidalgo, and A.-L. Barabasi, “Understanding individual human mobility patterns,” Nature, vol. 453, no. 7196, pp. 779–782, 2008.
[20] M. F. Heath, M. C. Vernon, and C. R. Webb, “Construction of networks with intrinsic temporal structure from uk cattle movement data,” BMC Veterinary Research, vol. 4, no. 1, p. 11, 2008.
[21] D. Balcan and A. Vespignani, “Phase transitions in contagion processes mediated by recurrent mobility patterns,” Nature physics, vol. 7, no. 7, pp. 581–586, 2011.
[22] C. Poletto, M. Tizzoni, and V. Colizza, “Human mobility and time spent at destination: Impact on spatial epidemic spreading,” Journal of Theoretical Biology, vol. 338, pp. 41–58, 2013.
[23] P. J. Mucha, T. Richardson, K. Macon, M. A. Porter, and J.-P. Onnela, “Community structure in time-dependent, multiscale, and multiplex networks,” Science, vol. 328, no. 5980, pp. 876–878, 2010.
[24] G. Palla, N. Páll, A. Horváth, K. Molnár, B. Tóth, T. Kováts, G. Surján, T. Vicsek, and P. Pollner, “Complex clinical pathways of an autoimmune disease,” Journal of Complex Networks, vol. 6, pp. 206–214, 10 2017.
[25] R. Lambiotte, M. Rosvall, and I. Scholtes, “From networks to optimal higher-order models of complex systems,” Nature Physics, vol. 15, no. 4, pp. 313–320, 2019.
[26] D. Edler, L. Bohlin, et al., “Mapping higher-order network flows in memory and multilayer networks with infomap,” Algorithms, vol. 10, no. 4, p. 112, 2017.
[27] M. Rosvall, A. V. Esquivel, A. Lancichinetti, J. D. West, and R. Lambiotte, “Memory in network flows and its effects on spreading dynamics and community detection,” Nature communications, vol. 5, no. 1, pp. 1–13, 2014.
[28] N. Masuda, M. A. Porter, and R. Lambiotte, “Random walks and diffusion on networks,” Physics Reports, vol. 716-717, pp. 1–58, 2017.
[29] I. Scholtes, “When is a network a network? multi-order graphical model selection in pathways and temporal networks,” in Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1037–1046, 2017.
[30] S. Arlot, A. Celisse, et al., “A survey of cross-validation procedures for model selection,” Statistics surveys, vol. 4, pp. 40–79, 2010.
[31] G. C. Cawley and N. L. Talbot, “On over-fitting in model selection and subsequent selection bias in performance evaluation,” J. Mach. Learn. Res., vol. 11, pp. 2079–2107, Aug. 2010.
[32] M. G. Kendall, “A new measure of rank correlation,” Biometrika, vol. 30, pp. 81–93, 06 1938.
[33] J. A. Otter, S. Mookerjee, F. Davies, F. Bolt, E. Dyakova, Y. Shersing, A. Boonyasiri, A. Y. Weiße, M. Gilchrist, T. J. Galletly, E. T. Brannigan, and A. H. Holmes, “Detecting carbapenemase-producing Enterobacterales (CPE): an evaluation of an enhanced CPE infection control and screening programme in acute care,” Journal of Antimicrobial Chemotherapy, vol. 75, pp. 2670–2676, 06 2020.
[34] J. Rissanen, “A universal data compression system,” IEEE Transactions on information theory, vol. 29, no. 5, pp. 656–664, 1983.
[35] V. Jääskinen, J. Xiong, J. Corander, and T. Koski, “Sparse markov chains for sequence data,” Scandinavian Journal of Statistics, vol. 41, no. 3, pp. 639–655, 2014.
[36] P. Bühlmann, A. J. Wyner, et al., “Variable length markov chains,” The Annals of Statistics, vol. 27, no. 2, pp. 480–513, 1999.
[37] C. Persson, L. Bohlin, D. Edler, and M. Rosvall, “Maps of sparse markov chains efficiently reveal community structure in network flows with memory,” arXiv preprint arXiv:1606.08328, 2016.
[38] T. Hastie, R. Tibshirani, and J. Friedman, The elements of statistical learning: data mining, inference, and prediction. Springer Science and Business Media, 2009.
[39] J.-C. Delvenne, S. N. Yaliraki, and M. Barahona, “Stability of graph communities across time scales,” Proceedings of the National Academy of Sciences, vol. 107, no. 29, pp. 12755–12760, 2010.
[40] J. Xie, S. Kelley, and B. K. Szymanski, “Overlapping community detection in networks: The state-of-the-art and comparative study,” ACM Comput. Surv., vol. 45, Aug. 2013.
[41] A. Arnaudon, R. L. Peach, and M. Barahona, “Scale-dependent measure of network centrality from diffusion dynamics,” Phys. Rev. Research, vol. 2, p. 033104, Jul 2020.
[42] L. Page, S. Brin, R. Motwani, and T. Winograd, “The pagerank citation ranking: Bringing order to the web.,” Technical Report 1999-66, Stanford InfoLab, November 1999. Previous number = SIDL-WP-1999-0120.
[43] C. Villani, Optimal transport: old and new, vol. 338. Springer Science and Business Media, 2008.
[44] R. L. Peach, D. Saman, S. N. Yaliraki, D. R. Klug, L. Ying, K. R. Willison, and M. Barahona, “Unsupervised graph-based learning predicts mutations that alter protein dynamics,” bioRxiv, 2019.
[45] R. L. Peach, S. N. Yaliraki, D. Lefevre, and M. Barahona, “Data-driven unsupervised clustering of online learner behaviour,” NPJ science of learning, vol. 4, no. 1, pp. 1–11, 2019.
[46] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2008, p. P10008, oct 2008.
[47] R. L. Peach, A. Arnaudon, and M. Barahona, “Semi-supervised classification on graphs using explicit diffusion dynamics,” Foundations of Data Science, vol. 2, no. 1, p. 19, 2020.

Additional Files

Additional ﬁle 1 — State networks of M and M . Figure (9) State networks of M and M . Additional ﬁle 2 — PageRank diﬀerencebetween M and M over specialities andbuildings. Additional to analysing ward PageRanks between M and M , we summed up the PageRanks of wardsbelonging to specialities and buildings to arrive at theirvisitation probabilities. Figure 10 A & B show thecomparative results, and whilst their is less dispersionwhen compared to ward PageRanks, this makes sensegiven that specialities are more coarse groupings, andlikely hide the ward variations seen previously. Building 1Building 5 Building 12Building 15Building 16Building 17

M1 Pagerank M P age r an k B CardiologyEmergency MedicineHaematology MedicineOncology PaediatricPrivateRenalRespiratory Surgery

Figure (10) PageRank difference between M1 and M2 over specialities and buildings (panels A and B; axes: M1 PageRank against M2 PageRank).

Additional file 3 — Optimisation of clustering rate.

In order to select a clustering rate r for lumping, we investigated its effect on (1) the number of states in the model, and (2) how well structures of patient movement can be detected in communities from the MS framework. We refer to the latter as model fitness, which is the aggregate number of structures (hospital sites, specialities, and buildings) found significantly over-represented in MS communities for t > 0.316 (the threshold in t corresponding to the point of 20 partitions regardless of the clustering rate).

Figure 11A shows the linear relationship between the number of states and r, whereby increases in r lead to a greater number of state nodes (i.e. less lumping). In contrast, Figure 11B, the fitness curve, shows that over the same range of r the model fitness does not change linearly. In fact, we observe a local peak in fitness around r = 0.35, where the total number of state nodes has reduced substantially to 171. We hypothesise that this point retains important structure of patient movement in its communities whilst removing redundant state nodes.
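One way to compute a fitness score of this kind is sketched below. The text does not specify the statistical test, so this sketch assumes a one-sided hypergeometric test at a significance level of 0.05 with no multiple-testing correction; the function names and the toy wards are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K labelled, n drawn)."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def count_overrepresented(communities, labels, alpha=0.05):
    """Count (community, label) pairs in which the label is enriched.

    Assumption: enrichment is judged by a one-sided hypergeometric test
    at level alpha, without multiple-testing correction.
    """
    N = len(labels)
    label_totals = {}
    for lab in labels.values():
        label_totals[lab] = label_totals.get(lab, 0) + 1
    hits = 0
    for members in communities:
        n = len(members)
        for lab, K in label_totals.items():
            k = sum(1 for ward in members if labels[ward] == lab)
            if k > 0 and hypergeom_sf(k, N, K, n) < alpha:
                hits += 1
    return hits


# Hypothetical toy example: eight wards split evenly over two sites.
labels = {f"w{i}": ("Site 1" if i < 4 else "Site 2") for i in range(8)}
aligned = [["w0", "w1", "w2", "w3"], ["w4", "w5", "w6", "w7"]]
mixed = [["w0", "w1", "w4", "w5"], ["w2", "w3", "w6", "w7"]]

fitness_aligned = count_overrepresented(aligned, labels)
fitness_mixed = count_overrepresented(mixed, labels)
```

In this toy case the partition that follows the sites scores higher than the mixed one. Repeating the count for each candidate r, and over sites, specialities, and buildings, would give a curve comparable in spirit to Figure 11B.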

Figure (11) Optimisation of clustering rate r for lumping state nodes. (A) Size of state network over clustering rate: the increasing number of state nodes (more granular lumping) with increasing lumping rate r. (B) Network fitness over clustering rate: the resultant model fitness as a function of the clustering rate r.

Additional file 4 — Markov stability run statistics.

Figure 12 shows the resultant statistics from running MS over the lumped state network M̂.

Additional file 5 — Markov stability community partitions.

We find that MS produces a hierarchy of community partitions across Markov time t. When t is smaller, the resultant partitions are granular and consequently more numerous; as t increases, partitions become coarser by merging granular communities together. Figure 13 shows the full community mapping over t for M̂, with the three time points t1, t2, and t3 used for a simplified visualisation in the main section.

Additional file 6 — Variation of Information between hospital structures in community partitions.

Similar to MS, we can compute the Variation of Information (VI) to assess the distance between clusterings, except that here we assess how well the resultant partitions over t correspond to the known structures in the hospital (sites, buildings and specialities) (Figure 14). As t increases, all structures become more aligned with MS communities; however, hospital sites seem to correspond far better across t, even with an initially high VI. Furthermore, the VI for hospital sites decreases faster than that for specialities or buildings, which suggests that coarser communities correspond most closely to hospital sites. However, the comparatively smaller VI for hospital sites across more granular MS communities also suggests the presence of within-hospital structures of patient movement that are not bound solely by buildings or specialities.

Figure (12) Markov Stability Analysis. Top: the number of communities and the Markov Stability as a function of Markov time. Middle: the Variation of Information computed over the set of Louvain optimisations at each Markov time, whereby a low VI corresponds to a robust partition. Bottom: the combined Variation of Information and number of communities. The heatmap represents the Variation of Information computed between the optimal partitions at each Markov time, where the diagonal is zero, and we look for blocks of low VI that indicate robust partitions.

Figure (13) Sankey diagram showing the full MS community partitions over Markov time t, with granular partitions captured towards the left and coarse partitions captured towards the right.

Figure (14) The Variation of Information computed between each hospital structure partition (Specialty, Building, Hospital) and the community partitions found at each Markov time.
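For reference, the Variation of Information between two partitions of the same set of nodes can be computed as in the minimal sketch below. It uses the standard definition VI = H(A) + H(B) - 2 I(A; B) with natural logarithms and no normalisation; whether the analysis above normalises VI is not stated, so that choice is an assumption, as are the function name and the toy partitions.

```python
from collections import Counter
from math import log


def variation_of_information(part_a, part_b):
    """Unnormalised VI (in nats) between two labelings of the same nodes.

    part_a[i] and part_b[i] are the labels assigned to node i.
    """
    if len(part_a) != len(part_b):
        raise ValueError("partitions must label the same nodes")
    n = len(part_a)
    count_a = Counter(part_a)
    count_b = Counter(part_b)
    joint = Counter(zip(part_a, part_b))
    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        vi -= p_ab * (log(p_ab / p_a) + log(p_ab / p_b))
    return vi


# Hypothetical toy labelings of four wards.
sites = ["Site 1", "Site 1", "Site 2", "Site 2"]
communities_same = [0, 0, 1, 1]   # identical grouping to the sites
communities_cross = [0, 1, 0, 1]  # cuts across the sites

vi_same = variation_of_information(sites, communities_same)
vi_cross = variation_of_information(sites, communities_cross)
```

A VI of zero means the community partition reproduces the hospital structure exactly, and larger values mean the two groupings share less information, which is how the curves in Figure 14 are read.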

Additional file 7 — 2-way community partition to hospital site.

Figure (15) The Markov Stability community partition at Markov time t = 20 and the assignment of its communities (Community 1, Community 2) to hospital sites (Site 1, Site 2, Site 3).

Additional file 8 — Hospital wards overlapping communities across Markov stability partitions.

Figure (16) The frequency of physical wards that are members of more than one MS community as a function of Markov time t, grouped by speciality. For example, the Renal speciality has four wards that overlap between different communities for the majority of Markov time.

Additional file 9 — Multiscale Centrality model comparison.

For further examination of the importance of higher-order modelling, we compared the MSC ranking of wards in the lumped network M̂ to the original state node networks of M1 and M2. We found that whilst correlated, there were a number of distinct differences between the models (Figure 17).

Figure (17) A comparison of the median Multiscale centrality for the first-order M1, second-order M2 and lumped M̂ memory networks. Each panel plots the median scaled MSC of wards in one model against another.

For instance, we found several wards, including a critical care ward, that were central at all time-scales in M and M̂ only appeared as important at short time-scales in M. We found that the MSC node ranking for M̂ was marginally more correlated with M (Ranked Cor: 0.86 (pval < M Ranked Cor: 0.84 (pval < M is closer in size to M1 than M2