21 Million Opportunities: A 19 Facility Investigation of Factors Affecting Hand Hygiene Compliance via Linear Predictive Models

Abstract

This large-scale study, consisting of 21.3 million hand hygiene opportunities from 19 distinct facilities in 10 different states, uses linear predictive models to expose factors that may affect hand hygiene compliance. We examine the use of features such as temperature, relative humidity, influenza severity, day/night shift, federal holidays and the presence of new medical residents in predicting daily hand hygiene compliance; the investigation is undertaken using both a "global" model to glean general trends, and facility-specific models to elicit facility-specific insights. The results suggest that colder temperatures and federal holidays have an adverse effect on hand hygiene compliance rates, and that individual cultures and attitudes regarding hand hygiene exist among facilities.



Michael T. Lash MS · Jason Slater BS · Philip M. Polgreen MD MPH · Alberto M. Segre PhD

Acknowledgements:

The authors would like to thank GOJO Industries, Inc. for access to the hand-hygiene data.

Michael T. Lash
Department of Computer Science, University of Iowa, 319-335-0808, E-mail: [email protected]

Jason Slater
GOJO Industries, Inc.

Philip M. Polgreen
Department of Epidemiology, University of Iowa

Alberto M. Segre
Department of Computer Science, University of Iowa


Keywords

Hand hygiene, predictive analytics, linear regression, marginal effects modeling, feature ranking

Healthcare associated infections represent a major cause of morbidity and mortality in the United States and other countries [1]. Although many can be treated, these infections add greatly to healthcare costs [2]. Furthermore, the emergence of multidrug resistant bacteria has greatly complicated treatment of healthcare associated infections [3], making the prevention of these infections even more important. One of the most effective interventions for preventing healthcare associated infections is hand hygiene [4]. Yet, despite international programs aimed at increasing hand hygiene [4, 5, 6], rates remain low, less than 50% in most cases [4, 6, 7].

Because of the importance of hand hygiene in preventing healthcare associated infections, infection control programs are encouraged to monitor rates to drive process improvement [6, 8, 9]. In most cases, hand hygiene monitoring is done exclusively by human observers, who are still considered the gold standard for monitoring [7]. Yet human observations are subject to a number of limitations. For example, human observers incur high costs and there are difficulties in standardizing the elicited observations. Also, the timing and location of observers can greatly affect the diversity and the quantity of observations [10, 11]. Furthermore, the distance of observers to healthcare workers under observation and the relative busyness of clinical units can adversely affect the accuracy of human observers [11]. The presence of human observers may artificially increase hand hygiene rates temporarily, just as the presence of other healthcare workers can induce peer effects to increase rates [12, 13]. Finally, the number of human observations possible is quite small in comparison to the number of opportunities [7, 12].

As a consequence, several automated approaches to monitoring have been proposed [8, 14, 15, 16]. Many of these measure hand hygiene when a healthcare worker enters or leaves a patient's room: the subsequent activation of a nearby hand hygiene dispenser is recorded as a hand hygiene opportunity fulfilled whereas, if no such activation is observed, the opportunity is not satisfied. Such approaches, while not capturing all five moments of hand hygiene, do provide an easy and convenient measure of hand hygiene compliance. With automated approaches becoming more common, a more ongoing and comprehensive picture of hand hygiene adherence should emerge, providing new insights into why healthcare workers abstain from practicing hand hygiene.

In this work (an extension of [17]), we provide an in-depth exploration of factors affecting hand hygiene compliance across multiple hospital facilities using linear predictive models.

[...] door counter sensors, which increment a counter anytime an individual goes in or out of a room, and hand hygiene sensors, which increment a counter when soap or alcohol rub is dispensed. Additional supporting technology was also installed to collect and record timestamped sensor-reported counts. We provide a simple illustration of how these technologies are used in Figure 1 and a picture of an instrumented room entrance in Figure 2. In this paper, we will use the term dispenser event to designate triggering and use of an instrumented hand hygiene dispenser and door event to designate the triggering of a counter sensor located on one of the instrumented doors. Practically speaking, these sensors can be fit to any sort of patient entrance/exit area, as depicted in Figure 2.

Fig. 1: A simple illustration of the sensors and corresponding infrastructure. In (a), healthcare workers enter and exit patient rooms that are fitted with sensors, interacting with instrumented dispensers as they do; note that the sensor on the hand hygiene dispenser is internal, and not visible. In (b), these door and dispenser counts are intermittently sent to a wireless transmitter. In (c), these counts are relayed via transmitter and stored in a database, along with other information, such as the room the counts came from and the time and date at which they were sent.

Fig. 2: A nurse applying hand hygiene rub upon leaving an instrumented patient area. Note the door sensor highlighted by the red box.

A total of 19 facilities in 10 states were outfitted with sensors; because of privacy concerns, we report only the state and CDC Division for each. The facilities comprise a wide range of geographies, spanning both coasts, the midwest, and the south. A total of 1851 door sensors and 639 dispenser sensors reported 24,525,806 door events and 6,140,067 dispenser events across these 19 facilities between October 21, 2013 and July 7, 2014. Each facility contributed an average of 172.3 reporting days, making this study the largest investigation of hand hygiene compliance to date (i.e., larger than the 13.1 million opportunities reported in [18]). Assuming each door event corresponds to a hand hygiene opportunity, we estimate an average facility compliance rate of 25.03%, in line with, if not just below, the reported low-end rate found in [19].

The original data, consisting of timestamped counts reported from individual sensors over short intervals, were re-factored to support our analysis. First, data from each sensor were binned by timestamp, $t$, into 12 hour intervals, corresponding to traditional day and night shifts, as indicated by an additional variable, night, defined as follows:

$$\mathit{night} = \begin{cases} 1 & t \in [\text{7pm}, \text{7am}) \\ 0 & t \in [\text{7am}, \text{7pm}) \end{cases}$$

We then compute hand hygiene compliance, or just compliance, by dividing the number of reported dispenser events by the number of door events:

$$\mathit{compliance} = \frac{\mathit{dispenser}}{\mathit{door}}$$

Such a definition of compliance assumes that each door event corresponds to a single hand-hygiene opportunity and each dispenser event corresponds to a single hand-hygiene event whereas, in reality, a health care worker might well be expected to perform hand hygiene more than once per entry, resulting in rates that exceed one, if only slightly. This estimator also ignores the placement of doors with respect to dispensers: multiple dispensers may well be associated with a single doorway, and some dispensers may be in rooms having multiple doors. Thus, simply adding new dispensers will raise apparent compliance rates computed in this fashion, while adding new door sensors will appear to reduce compliance. Even so, when applied consistently and if system layouts are fixed, this estimator is a reasonable approximation of true hand hygiene compliance, and supports sound comparisons within a facility (but not across facilities).

Because malfunctioning sensors or dead batteries can produce outliers (i.e., very low or very high values), shifts with fewer than 10 door or dispenser events reported per day (possibly indicating an installation undergoing maintenance), zero compliance, or compliance values greater than 1 were removed prior to analysis (at the cost of possibly excluding some legitimate records). The remaining data consist of 5308 shifts from the original 5647 records, having 21,273,980 hand hygiene opportunities and 5,296,749 hand hygiene events (see Table 1).
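The binning, compliance, and filtering steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the event tuple layout, the function names `shift_key` and `shift_compliance`, and the choice to apply the 10-event minimum per shift are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def shift_key(ts):
    """Return (shift start date, night) for a timestamp; night = 1 for [7pm, 7am)."""
    if 7 <= ts.hour < 19:
        return ts.date(), 0
    # Hours before 7am belong to the night shift that began the previous evening.
    start = ts.date() if ts.hour >= 19 else (ts - timedelta(days=1)).date()
    return start, 1


def shift_compliance(events, min_events=10):
    """Aggregate (timestamp, kind, count) records into per-shift compliance rates.

    kind is "door" or "dispenser" (hypothetical labels). Shifts with too few
    events, zero compliance, or compliance above 1 are dropped.
    """
    totals = defaultdict(lambda: {"door": 0, "dispenser": 0})
    for ts, kind, count in events:
        totals[shift_key(ts)][kind] += count

    rates = {}
    for key, c in totals.items():
        if c["door"] < min_events or c["dispenser"] < min_events:
            continue  # possibly a sensor fault or maintenance period
        rate = c["dispenser"] / c["door"]
        if 0 < rate <= 1:
            rates[key] = rate
    return rates
```

Keying night shifts by the date on which they start keeps a 7pm to 7am shift in a single bin even though it spans two calendar days.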

Facility  State  CDC Div  Tot Disp  Tot Door  Days Rep
91        OH     ENC        234292    518772       252
101       OH     ENC        350901   2021665       260
105       TX     WSC        238899   1940024       260
119       MN     WNC        123877    242939       156
123       TX     WSC        325618   1112198       243
127       NM     Mnt       1306855   4546171       260
135       OH     ENC        125731    264331       258
144       CA     Pac        398961   1744642       260
145       CA     Pac        567096   2073566       260
147       CA     Pac        500979   2462900       260
149       CA     Pac        590708   2306392       260
153       CT     New E      169564    603482       208
155       NY     M-At       171275    619507       117
156       NC     S-At         4381     38200        15
157       OH     ENC         39455    313396       101
163       OH     ENC           344     10233         5
168       PA     M-At        30421     86909        20
170       IL     ENC        112604    353631        47
173       OH     ENC          4788     15122        32
Total     10     8         5296749  21273980      3274

Table 1: Descriptive statistics for all reporting facilities in terms of state, CDC division, hand hygiene events, people events, and reporting days.


Because health care workers frequently cite skin dryness and irritation as a factor in decreased compliance (particularly in cold weather months where environmental humidity is reduced), we associate daily air temperature (denoted temp) and relative humidity (denoted humid) to each timestamped record based on each facility's reported zip code. Spatially assimilated weather values (σ = 0.[...]) [...] (2.5° latitude by 2.5° longitude), the world is thus defined as a 144 by 73 grid having 10512 distinct grid elements. Weather data are available at a fine level of temporal granularity (on the order of 4 times daily for each grid unit) for the entire period of interest. The geographical assignment of weather data was obtained by first mapping each facility's numerical zip code to the zip code's centroid (2010 US Census data), and then mapping the zip code centroid (lat, lon) to the corresponding NOAA grid element. An example of this assignment can be observed in Figure 3. We associate weather information from the observation temporally closest to the start of each shift.

We conjecture that the local severity of common seasonal infectious diseases such as influenza may also affect hand hygiene compliance rates. We define influenza severity (denoted flu) as the number of influenza-related deaths relative to all deaths over a specified time interval.

Influenza severity data were obtained from the CDC's Morbidity and Mortality Weekly Report (MMWR), which also reports data at weekly temporal granularity. Rather than reporting data by CDC region, however, data are provided by reporting city (one of 122 participating cities, mostly large metropolitan areas). We map each facility in our dataset to the closest reporting city in order to associate the appropriate severity value to each record. In other words,

$$\mathit{repCity} = \operatorname{argmin}_{\mathit{city}_i} \{ \operatorname{dist}(\mathit{facility}, \mathit{city}_i) : i = 1, \ldots, 122 \}$$

where

$$\operatorname{dist}(\mathit{fac}, \mathit{city}) \triangleq \| (\mathit{fac}_{lat}, \mathit{fac}_{lon}) - (\mathit{city}_{lat}, \mathit{city}_{lon}) \|_2,$$

the Euclidean distance between two entities given in terms of (lat, lon) coordinates. Eight of 19 facilities were located in a reporting city (i.e., dist(fac, city) = 0). The remaining 11 facilities were mapped to a reporting city that was, on average, 66.2 miles away (only 3 of 19 facilities were mapped to a reporting city further than this average, with the largest distance being 142 miles).

We also conjecture that external factors associated with specific holidays or events may affect hand hygiene compliance rates. Holidays may change staffing rates or affect healthcare worker behaviors. The number of visitors (affecting door counter rates) may also be greater than during regular weekdays. Holidays such as the 4th of July are often associated with alcohol-related accidents, and may increase health care facility workloads (similar factors may also apply on weekends).

We define a new variable holiday that reflects whether a given shift occurs on one of the 10 federal holidays (New Year's Eve, Martin Luther King Day, Presidents' Day, Memorial Day, the 4th of July, Labor Day, Columbus Day, Veterans Day, Thanksgiving or Christmas) where, if any part of the shift (day/night) falls on the holiday in question, the indicator is set to 1.
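As a rough illustration of the "any part of the shift" rule just described, the sketch below flags a 12 hour shift as a holiday shift when either calendar date it touches is in a holiday set. The function name `holiday_flag`, the `holidays` set, and the one-second end adjustment are assumptions made for this example only.

```python
from datetime import date, datetime, timedelta


def holiday_flag(shift_start, holidays, shift_hours=12):
    """Return 1 if any part of the shift falls on a date in `holidays`, else 0."""
    # Subtract one second so a shift ending exactly at 7am does not count
    # as touching the following shift's start time.
    shift_end = shift_start + timedelta(hours=shift_hours) - timedelta(seconds=1)
    return int(shift_start.date() in holidays or shift_end.date() in holidays)


# Hypothetical usage: a night shift starting on July 3 runs into July 4.
example_holidays = {date(2014, 7, 4)}
flag = holiday_flag(datetime(2014, 7, 3, 19, 0), example_holidays)
```

Because a shift is at most 12 hours long, it can touch at most two calendar dates, so checking the first and last instants is sufficient.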
More formally:

$$\mathit{holiday} = \begin{cases} 0 & t \notin \{\text{holidays}\} \\ 1 & t \in \{\text{holidays}\} \end{cases}$$

Similarly, in order to ascertain the impact of weekends on compliance, we define a new variable weekday as follows:

$$\mathit{weekday} = \begin{cases} 0 & t \in \{\text{Sat}, \text{Sun}\} \\ 1 & t \in \{\text{Mon}, \text{Tues}, \text{Weds}, \text{Thurs}, \text{Fri}\} \end{cases}$$

Note here that if a shift spans the weekday into a weekend (or vice versa), it is encoded as a weekend.

A related concept is the presence of new resident physicians, who traditionally start work the first of July. We define a new variable that corresponds with this time period in order to see if the data reveal the presence of a July effect (denoted July):

$$\mathit{July} = \begin{cases} 0 & t \notin \text{July 1--7} \\ 1 & t \in \text{July 1--7} \end{cases}$$

M5 Ridge Regression for Feature Examination

With covariates defined and associated with the collected sensor data, we wish to build a linear hypothesis $h$ that (a) accurately estimates hand hygiene and (b) reports the direction and degree of effect of our defined features. In accomplishing (b) we bear in mind two things:

(1) There may be multi-collinearity among features, which may adversely affect the output.
(2) That (a) and (b) may be at odds with one another; i.e., obtaining good predictions may entail discarding some prediction-inhibiting features for which we would like to obtain effect estimates (in practice, we find that this is not actually the case).

Therefore, we propose an M5 Ridge Regression for Feature Examination method designed to accomplish (a) and (b), while bearing (1) and (2) in mind. This method is given by

$$h^* = \operatorname{argmin}_{h \in \mathcal{H}_l} \; \|\Lambda(X) h - y\|_2^2 + \lambda \|h\|_2^2 \quad \text{s.t.} \quad \rho(h_j) \le .[\ldots] \;\; \forall j \qquad (1)$$

where $X \in \mathbb{R}^{n \times p}$ is a design matrix, $h$ is the hypothesis, $y$ is the target vector consisting of compliance rates in which a particular $y_i \in [0, 1]$, $\lambda$ is a regularization term, $\|\cdot\|_2$ is the $\ell_2$-norm, and $\rho(\cdot)$ is a function that reports the p-value of a hypothesis term (this constraint is ensured via sequential backwards elimination [21]). The function $\Lambda(X)$ can be defined as

$$\Lambda(X) \triangleq \operatorname{argmin} \{ t \in T_{\mathcal{H}_l} \} \qquad (2)$$

where $t$ is a hypothesis selected from a tree of hypotheses constructed using the M5 [...] $p$ dimension, acting as a feature selection method, and having no bearing on the $n$ dimension.

There are a few benefits of the above method worth noting. First, the hypothesis class $\mathcal{H}_l$ is linear and common to both (1) and (2). Such two-stage optimization approaches, where the first objective is optimized taking into account the hypothesis class before the hypothesis itself is optimized for predictive accuracy (or some other such measure), have been shown to work well in other contexts [23]. Secondly, such a method is specifically geared toward producing a hypothesis that makes use of features that have an immediate bearing upon the problem, while eliminating interpretability-obscuring effects, such as multi-collinearity.
Moreover, these desirables are obtained while attempting to produce the most accurate hypothesis: an $h$ that elicits feature indicativeness, produces accurate results, and controls for confounding effects is the goal of this two-step optimization procedure.

Ultimately, we conduct our analysis by observing the sign and magnitude of the values in the hypothesis vector in order to determine the factors that influence hand hygiene compliance, and whether such factors affect compliance in a positive or negative manner. We also observe correlation and RMSE values to determine how well our predictive model works, and whether the corresponding results can be trusted. All results are obtained via $k$-fold cross-validation ($k = 10$).

We also use two established techniques, RReliefF feature ranking and marginal effects modeling, that serve as a point of comparison for our method and help inform the discussion of the results obtained.

Feature ranking:

First, we propose the use of the RReliefF algorithm [26], a modification of the original Relief algorithm of Kira and Rendell [27]. RReliefF finds a feature $j$'s weight by randomly selecting a seed instance $x_i$ from design matrix $X$ and then using that instance's $k$ nearest neighbors to update the attribute. This update consists of three terms: the probability of observing a different rate of hand hygiene compliance than that of the current value given the nearest neighbors, given by

$$A = p(\mathit{rate} \neq \mathit{rate}_{x_i} \mid k\mathrm{NN}(x_i)), \qquad (3)$$

the probability of observing the current attribute value given the nearest neighbors, given by

$$B = p(x_{i,j} \mid k\mathrm{NN}(x_i)), \qquad (4)$$

and the probability of observing a different hand hygiene rate than the current value given a different feature value $v$ and the nearest neighbors, given by

$$C = p(\mathit{rate} \neq \mathit{rate}_{x_i} \mid k\mathrm{NN}(x_i) \wedge j = v). \qquad (5)$$

Attribute distance weighting is used in order to place greater emphasis on instances that are closer to the seed instance when updating each term; final weights are obtained by applying Bayes' rule to the three terms maintained for each attribute, which can be expressed

$$\frac{C \cdot B}{A} - \frac{(1 - C) \cdot B}{1 - A}. \qquad (6)$$

By using this method we can then rank attributes in terms of their importance. We again report rankings using $k$-fold ($k = 10$) cross validation.

Marginal Effects Modeling:

To provide additional insight into the features that are relevant to hand hygiene we analyzed their marginal effects [28]. Marginal effects, also referred to as instantaneous rates of change, are computed by first training a hypothesis $h$; then, using the testing data, the effect of each covariate can be estimated by holding all others constant and observing the predictions. Such a method can be expressed by

$$\widehat{\mathit{rate}}_{i,j} = h^{\top} [x_{i,j}, \bar{x}_{\neq j}] \qquad (7)$$

where, with a slight abuse of notation, $x_{i,j}$, the value of instance $i$'s $j$th feature, is added to the vector $\bar{x}_{\neq j}$, which consists of the average of each non-$j$ feature, at the appropriate location (namely, the $j$th position). Here, the notation $\neq j$ is used to reinforce the fact that the vector of averages $\bar{x}$ has its $j$th element replaced by $x_{i,j}$. Other non-$j$ entries are given by $\bar{x}_k = \mu(X_k)$, for an arbitrary index position $k$.

Note that both the LASSO [24] and Elastic Net [25] would have also made appropriate supporting methods.

[...] global models, where all facility records are used, and a facility-identifying feature is included.

M5 Ridge Regression

We learned a hypothesis using all available features, including a nominalized facility identifier. Our predictive results can be observed in Table 2. We note that the RMSE is not large and the correlation is moderate, implying relatively good predictive performance.
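To make the evaluation protocol concrete, the following sketch fits a plain ridge regression and reports the correlation and RMSE of out-of-fold predictions under k-fold cross-validation. It is a simplified stand-in, not the method used in this paper: it omits the M5 feature selection step and the p-value-based backwards elimination, and the function names, the fold assignment by index, and the default `lam` value are assumptions.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ridge_fit(X, y, lam):
    """Minimise ||Zh - y||^2 + lam * ||h||^2 via the normal equations.

    Z is X with a leading constant column; the intercept is not penalised.
    """
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    A = [[sum(z[a] * z[b] for z in Z) + (lam if a == b and a > 0 else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(z[a] * yi for z, yi in zip(Z, y)) for a in range(p)]
    return solve(A, rhs)


def predict(h, row):
    return h[0] + sum(w * v for w, v in zip(h[1:], row))


def cross_validate(X, y, lam=1e-8, k=10):
    """Return (Pearson correlation, RMSE) of out-of-fold predictions."""
    n = len(y)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        h = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        for i in range(fold, n, k):  # held-out indices for this fold
            preds[i] = predict(h, X[i])
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, y)) / n)
    mp, my = sum(preds) / n, sum(y) / n
    cov = sum((p - mp) * (t - my) for p, t in zip(preds, y))
    var_p = sum((p - mp) ** 2 for p in preds)
    var_y = sum((t - my) ** 2 for t in y)
    return cov / math.sqrt(var_p * var_y), rmse
```

Scoring only held-out predictions, as here, is what makes the reported correlation and RMSE estimates of out-of-sample performance rather than of training fit.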

Measure      Value
Correlation  0.3441
RMSE         0.1702

Table 2: Correlation coefficient and RMSE of cross-validated model predictions.

$h^*$

We next examine the terms of the learned hypothesis $h^*$ (see Table 3). The model includes unique identifiers for all 19 facilities, 12 of which had positive corresponding values, indicating relatively higher rates of compliance. The remaining facilities' $h^*$ terms had relatively small negative values, indicating lower rates of compliance. Among other features, holidays are associated with lower compliance rates and influenza severity with higher compliance. Weekdays are associated with higher compliance rates, as are higher temperatures and humidity. Interestingly, the M5 [...]

Feature                                                               h_j
facility− = { [...] }                                                 h_j (j ∈ Fac−) ∈ [−[...], −[...]]
facility+ = { [...] }                                                 h_j (j ∈ Fac+) ∈ [[...], [...]]
temp                                                                  [...]
humid                                                                 [...]
weekday                                                               [...]
night                                                                 −[...]
holiday = {Indep Day, Pres. Day, Vet Day, New Year's, Christmas}      h_j (j ∈ Hol) ∈ [−[...], −[...]]
flu                                                                   [...]
July                                                                  −[...]

Table 3: Feature specific $h_j$ terms, where red highlights features with a negative association and blue highlights those with a positive association.

Using RReliefF we can rank features in terms of their importance in order to support and supplement the result obtained using M5 [...] facility was represented as a single discretely-valued feature in order to determine the importance of facility as a whole (instead of treating each facility as its own feature), as was holiday.

Attribute  Avg Val    Avg Rank
facility   [...]      [...] ± [...]
flu        [...]007   2
temp       [...]005   3.[...] ± [...]
weekday    [...]002   5
humid      [...]001   6.[...] ± [...]
July       ≈ [...]    [...] ± [...]
holiday    ≈ [...]    [...] ± [...]
night      ≈ [...]    [...] ± [...]

Table 4: RReliefF attribute weights.
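The weight expression in Eq. (6), which underlies rankings such as those in Table 4, is simple enough to state directly in code. The sketch below assumes the three probabilities A, B, and C from Eqs. (3) to (5) have already been estimated; the function names and the example values are hypothetical, and the neighbor search and distance weighting of the full RReliefF algorithm are not shown.

```python
def rrelieff_weight(A, B, C):
    """Combine the three RReliefF probability terms into a feature weight.

    A: probability of a different rate among the nearest neighbors (Eq. 3)
    B: attribute term maintained for the feature (Eq. 4)
    C: probability of a different rate given a different feature value (Eq. 5)
    Requires 0 < A < 1.
    """
    return C * B / A - (1 - C) * B / (1 - A)


def rank_features(weights):
    """Return feature names ordered from most to least important."""
    return sorted(weights, key=weights.get, reverse=True)


# Hypothetical values, for illustration only.
w = rrelieff_weight(A=0.5, B=0.4, C=0.75)
order = rank_features({"a": 0.1, "b": 0.3, "c": -0.2})
```

Note that when C equals A, that is, when a different feature value carries no information about whether the rate differs, the two terms cancel and the weight is zero.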

The results obtained from modeling the marginal effects can be observed in Figure 4. Figures 4a and 4b show the marginal effects of two randomly selected facilities; one identified as being associated with lower rates of compliance and one identified as having higher rates of compliance (from Table 3). Note that, because these are binary features (taking on values of either zero or one), the kernel density of the underlying data is not readily visible (unlike the other figures, which show results for non-binary features). As we can see, the marginal effects support the result obtained using both M5 [...] an even greater association between facilities and rates of compliance than was originally apparent (at least for these two facilities).

Fig. 4: The marginal effects of several select covariates, where blue shows the kernel density of the original data and the red lines show the estimation. Rate (y-axis) vs. feature (x-axis). Panels: (a), (b) facilities; (c) flu; (d) humid; (e) temp. Note that in 4a and 4b no kernel density estimate is provided, as these plots are for binary features.

Figure 4c shows the marginal effects of influenza severity. The flu result shows a slightly positive relationship between the severity of flu, measured in terms of mortality, and hand-hygiene compliance rates. This is further supported by the result obtained from M5 [...]

To further explore the relationship between hand-hygiene and weather effects, we conducted a simple statistical analysis. For each facility, we selected the temperature and humidity values corresponding to the bottom 10% and top 10% of hand-hygiene compliance rates. We then performed a paired t-test on each set of samples; temperature and humidity values were scaled to [0, [...]

Facility  State  temp: µ_top − µ_bot (p-val)  humid: µ_top − µ_bot (p-val)
91        OH     -0.004 (0.750)               -0.007 (0.489)
101       OH     0.001 (0.909)                0.004 (0.457)
105       TX     0.041 (< [...])              [...]
123       TX     0.017 (0.002)                0.029 (< [...])
127       NM     0.032 (< [...])              [...] (< [...])
[...]     CA     0.009 (< [...])              [...]
[...]     CA     0.011 (< [...])              [...]
153       CT     0.043 (< [...])              [...]
155       NY     0.093 (< [...])              [...]
156       NC     0.040 (0.007)                -0.041 (0.445)
157       OH     -0.132 (< [...])             [...]
163       OH     0.180 (0.010)                0.179 (0.021)
168       PA     0.012 (0.122)                0.071 (0.006)
170       IL     -0.001 (0.772)               -0.007 (0.642)
173       OH     0.037 (0.003)                -0.033 (0.440)

Table 5: The difference in means and paired t-test p-value results, obtained by comparing temperature/humidity values among the bottom 10% and top 10% of hand-hygiene compliance rates, by facility (boldened blue indicates that either temperature, humidity, or both have a positive difference in means and a p-value ≤ .[...]).

[...] $\mu_{\text{top 10}} > \mu_{\text{bottom 10}}$. Such results indicate that higher temperatures and levels of humidity (particularly temperature) are statistically associated with higher rates of hand hygiene. However, we find that some facilities co-located in the same geographic region have conflicting statistical results (e.g., Facs. 91, 173). We conjecture that such a result may be attributable to differences in sensor deployment location, but we leave such an investigation as future work.

3.2 Facility-Specific Modeling

The full M5 [...]

M5 Ridge Regression

The facility-specific M5 [...]

Fac  Correlation  RMSE
155  0.5907       0.0658
153  0.2089       0.0991
149  0.1168       .0489
123  0.6193       0.11
127  0.7133       .0313
91   0.5384       0.0939
101  0.3751       0.0442
170  0.0645       0.0607
168  0.362        0.0794

Table 6: Facility-specific M5 [...]

We now turn to examining the terms of each facility-specific hypothesis vector, which can be observed in Table 7. Note that, for the sake of simplicity in analyzing these features, we have created a single, binary holiday feature (as opposed to having a feature for each holiday, as in our global model).

facility  temp  humid  weekday  flu  holiday  night  July
[...]

Table 7: Hypothesis vector terms for each facility-specific model.

In examining Table 7, we wish to first point out that, relative to the global model result reported in Table 3, all facility-specific models had at least one term that was removed via sequential backwards elimination. Moreover, these eliminated terms differ by facility, demonstrating that local models are sensitive to different features in different ways.

In examining the hypothesis terms, some interesting findings emerge. With respect to our weather-based features, temperature and humidity, we can see that, for the most part, these factors were positively associated with higher rates of hand hygiene compliance and, for certain facilities (147, 155, 123), these features appear to be fairly important (based on the magnitude of the coefficients). Two facilities, however, have a negative association between temperature and compliance. These coefficients are relatively small and are offset by positive associations with humidity: in other words, the effects of temperature on compliance rates at these facilities appear to be somewhat negligible.

In examining weekday and holiday, we can see that in all but one facility, weekday has a positive influence on hand hygiene rates. This suggests that employees who work during weekends at these facilities may be washing their hands less; this may be attributable to a number of factors (increased work load, etc.). The holiday feature, on the other hand, tends to be indicative of lower rates of compliance among the three facilities reporting a non-zero term in their hypothesis vector (i.e., facilities 91, 101, 127).

The night and July features also tend to be negatively associated with hand hygiene compliance, with July being universally associated with lower rates of compliance (among the three facilities for which this term was not eliminated). night, by contrast, had two facilities which were found to have a positive term for this feature. These may be hospitals where there is relatively less activity at night (less busy); however, further investigation is needed to tease out the reasons individual facilities experience these differing rates.

Finally, flu appears to have a mix of positive and negative associations among facilities. In those facilities that have negative associations, a campaign focusing on flu awareness may be beneficial; however, lower rates may be attributable to increased activity during peak flu season, which may also suggest the need for higher staffing levels. Further investigation is needed to uncover the reasons behind these associations.

In this subsection we discuss the results of RReliefF feature ranking obtained for each of the 10 facilities being investigated; the results are presented in Table 8.

Fac  temp  humid  weekday  flu  holiday  night  July
147  2     5.7    3.8      1    5.2      4      6.3
155  1     2      4.8      4.8  5.2      3.2    7
153  2.2   5.8    4.5      1    4.7      3.1    6.7
149  4.3   5.1    1.2      2.1  6        6.6    2.7
123  1     4.1    5.4      3    4.3      7      3.2
127  2.8   3.5    6.5      4.4  1        3.3    6.5
91   3.9   3      5.5      2.1  1        5.5    7
101  3.1   7      5.3      4    3        4.6    1
170  1.9   4.4    3.9      1.2  4.3      6.9    5.4
168  1.5   3.8    2.9      1.5  6.3      6.6    1.8

Table 8: Facility-specific RReliefF feature rankings.

The first observation we wish to make is that there is no single feature that completely dominates the feature rankings among the different facilities. This suggests that facilities' compliance rates are affected differently by our selected features. However, we can also see that some features are often ranked as being more important, while others as less important. For instance, temp is frequently one of the top three features, while July more often appears toward the bottom of the ranking. It is important to note here, however, that while July, weekday, night, and holiday appear toward the end of the feature ranking for some facilities, they appear towards the top for others. The flu feature also frequently appears in the top three feature rankings among facilities, while humid often appears somewhere near the middle of the rankings.

The facility-specific marginal effects modeling results are presented in Figure 5. Note that we are reporting only a subset of results, which include temp, humid, weekday, and flu.

Cumulatively, these results further support what we have already discussed, with a few observational caveats. First, temperature is found to be universally indicative of higher rates of compliance, whereas the facility-specific regression coefficients suggested otherwise for facilities 91 and 101; those coefficients are likely obscured by some degree of multicollinearity with other features. The same is true of humid. weekday and flu, as in the other results, are found to be mostly indicative of higher rates of compliance, with the exception of a few facilities.

Fig. 5: Facility-specific marginal effects modeling results.
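Curves like those summarized in Figure 5 follow from Eq. (7): every feature except the one of interest is held at its column mean and the trained hypothesis is evaluated at each observed value of that feature. The sketch below is a minimal illustration under the assumption that the design matrix already contains a constant column for the intercept; the function name and the toy numbers are hypothetical.

```python
def marginal_effects(h, X, j):
    """Return [(x_ij, predicted rate)] with all features except j held at their means.

    h: list of linear coefficients, one per column of X
    X: list of rows (including a constant 1.0 column if an intercept is used)
    j: index of the feature whose marginal effect is examined
    """
    n, p = len(X), len(X[0])
    means = [sum(row[c] for row in X) / n for c in range(p)]
    curve = []
    for row in X:
        probe = means[:]   # vector of column averages
        probe[j] = row[j]  # replace the j-th entry with the observed value
        curve.append((row[j], sum(w * v for w, v in zip(h, probe))))
    return curve


# Toy example: columns are (intercept, feature 1, feature 2).
toy_h = [0.2, 0.01, -0.05]
toy_X = [[1.0, 0.0, 1.0], [1.0, 10.0, 3.0], [1.0, 20.0, 5.0]]
toy_curve = marginal_effects(toy_h, toy_X, j=1)
```

For a linear hypothesis the resulting curve is a straight line whose slope is the coefficient of feature j, so the plot mainly conveys the range of observed feature values and the predicted rate over that range.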

In this section we discuss the broader implications of our ﬁndings, as well asdirections for future work.The global results, including the full M weekday and holiday had a large bearing on hand hygiene compliance pre-dictions (i.e., these factors were important predictors of compliance). Fourth, ourconjectures that higher humidity and temperature are indicative of higher ratesof compliance were conﬁrmed by the full model, marginal eﬀects model, and sta-tistical analysis. This ﬁnding is important as health care workers often cite skinirritation or dry skin as reasons for reduced frequency of hand hygiene. Thesesame factors were also strongly suggested by our facility-speciﬁc modeling. Fifth,we found that compliance during the ﬁrst week of residents’ attendance ran con-trary to our original conjecture: the July was essentially unobservable. Howeverwe did ﬁnd that select facilities (153, 101, and 168) had this as an inﬂuencing fac-tor (particularly 101 and 168). Finally, we found that night was associated withslightly lower compliance rates. However, as our facility-speciﬁc modeling exposed,some facilities (149, 123) appear to have slightly higher rates of compliance duringthe evening; although, it is worth noting that, for these facilities, night was at thebottom of the RReﬂiefF feature ranking (indicating relatively low importance).Diﬀerent facilities have diﬀerent factors that aﬀect compliance rates diﬀerently:no two facilities are alike. While many of the facilities have factors that inﬂuencecompliance rates in similar ways – positive or negative (e.g., temperature) – theydiﬀer in degree (how much these common factors inﬂuence compliance) and com-position (the speciﬁc set of non-zero terms in the hypothesis vector h ∗ ). Cumula-tively, we can see that factors aﬀecting hand hygiene compliance among facilitiesis a complicated topic requiring further investigation.This work has several limitations. 
First, there are diﬀerences among installa-tions: not all doors and dispensers may be instrumented and, therefore, we cannottrack, for example, the use of personal alcohol dispensers (we can only assumestable practices within facilities). Thus our compliance estimates may be basedon partial information and are certainly not comparable across facilities. Second,our compliance estimates are facility wide, meaning that we do not exploit the co- location of dispensers and door event sensors, but only the temporal correlationof the individual events. Thus, our assumption that each door event correspondsto a hand-hygiene opportunity may be fundamentally ﬂawed, even as it allowsfor consistent intra-facility comparisons. Third, we acknowledge the possibility oflocation and sampling bias with regard to both the sensors and facilities. If sensorswere to be placed in only the ICU of one facility and in the emergency room ofanother, we may observe diﬀerent rates, which may be entirely reasonable andexpected in clinical practice. Additionally, though facilities are distributed acrossthe United States, they are by no means meant to be a representative sample offacility types or climatic conditions.In our future endeavors we would ﬁrst like to consider alternative deﬁnitions ofcompliance and examine compliance at ﬁner-grained temporal levels, perhaps ex-ploring time-series analyses. We intend to also explore framing the problem as oneof classiﬁcation, rather than only regression, which may help tease out additionalartifacts. 
Finally, data pertaining to compliance rates under certain interventions would open the way to exploring intervention efficacy, both in general and using prediction-based methodology, such as inverse classification, to recommend facility-specific intervention policies [29, 30].

Hand hygiene compliance is a simple yet effective method of preventing the transmission of disease, both among the population at large and within health care facilities, yet there have been few attempts to study the factors that can affect compliance. This study presents a first look at factors that underlie health care worker hand-hygiene compliance rates, including weather conditions, holidays and weekends, and infectious disease prevalence and severity, and serves as a model for future studies that will exploit the availability of temporally and spatially rich compliance data collected by the sophisticated sensor systems now being put into practice.

Philip M. Polgreen has received research funding from GOJO Industries, Inc. Author Jason Slater is an employee of GOJO Industries, Inc.
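The distinction drawn in the discussion between degree (how strongly a shared factor influences compliance) and composition (which terms of the hypothesis vector are non-zero) can be made concrete with a small sketch. The facility names, the feature names, and every coefficient value below are hypothetical and chosen only for illustration; they are not results from this study, and the helper functions `composition` and `compare` are assumptions of this sketch rather than part of the authors' method.

```python
# Hypothetical sparse coefficient vectors for two facilities.
# All numeric values are invented for illustration only.
facility_a = {"temperature": 0.12, "humidity": 0.05, "holiday": -0.30,
              "night": 0.0, "new_residents": 0.0}
facility_b = {"temperature": 0.03, "humidity": 0.0, "holiday": -0.10,
              "night": -0.02, "new_residents": 0.08}


def composition(h, tol=1e-12):
    """Return the set of features whose weight in h is non-zero."""
    return {name for name, w in h.items() if abs(w) > tol}


def compare(h1, h2):
    """Summarise how two sparse linear models agree and differ."""
    c1, c2 = composition(h1), composition(h2)
    shared = c1 & c2
    return {
        # Composition: which factors appear in each model.
        "shared": sorted(shared),
        "only_first": sorted(c1 - c2),
        "only_second": sorted(c2 - c1),
        # Shared factors that push compliance in the same direction.
        "same_sign": sorted(f for f in shared if h1[f] * h2[f] > 0),
        # Degree: ratio of absolute weights on the shared factors.
        "degree_ratio": {f: abs(h1[f]) / abs(h2[f]) for f in sorted(shared)},
    }


summary = compare(facility_a, facility_b)
for key, value in summary.items():
    print(key, value)
```

In this toy setting the two facilities share temperature and holiday with the same sign but different magnitudes (a difference in degree), while humidity, night, and new residents appear in only one model each (a difference in composition).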

References

1. R. Klevens, J. Edwards, C. Richards, and T. Horan, “Estimating health care-associated infections and deaths in U.S. hospitals,” Public Health, no. 122, pp. 160–166, 2007.
2. R. Roberts, R. Scott, B. Hota, L. Kampe, F. Abbasi, S. Schabowski, I. Ahmad, G. Ciavarella, R. Cordell, S. Solomon, R. Hagtvedt, and R. Weinstein, “Costs attributable to healthcare-acquired infection in hospitalized adults and a comparison of economic methods,” Medical Care, vol. 48, no. 11, pp. 1026–1035, November 2010.
3. R. Roberts, B. Hota, I. Ahmad, R. Scott, S. Foster, F. Abbasi, S. Schabowski, L. Kampe, G. Ciavarella, M. Supino, J. Naples, R. Cordell, S. Levy, and R. Weinstein, “Hospital and societal costs of antimicrobial-resistant infection in a Chicago teaching hospital: implications for antibiotic stewardship,” Clinical Infectious Diseases, vol. 49, no. 8, pp. 1175–1184, October 2009.
4. J. M. Boyce and D. Pittet, “Guidelines for hand hygiene in health-care settings: recommendations of the Healthcare Infection Control Practices Advisory Committee and the HICPAC/SHEA/APIC/IDSA Hand Hygiene Task Force,” Infection Control and Hospital Epidemiology, no. 23, pp. S3–S41, 2002.
5. B. Allegranzi, H. Sax, L. Bengaly, H. Richet, D. Minta, M. Chraiti, F. Sokona, A. Gayet-Ageron, P. Bonnabry, and D. Pittet, “World Health Organization ‘Point G’ Project Management Committee. Successful implementation of the World Health Organization hand hygiene improvement strategy in a referral hospital in Mali, Africa,” Infection Control and Hospital Epidemiology, vol. 31, no. 2, pp. 133–141, February 2010.
6. D. Pittet, B. Allegranzi, and J. Boyce, “World Health Organization World Alliance for Patient Safety First Global Patient Safety Challenge Core Group of Experts. The World Health Organization guidelines on hand hygiene in health care and their consensus recommendations,” Infection Control and Hospital Epidemiology, vol. 30, no. 7, pp. 611–622, July 2009.
7. J. P. Haas and E. L. Larson, “Measurement of compliance with hand hygiene,” Journal of Hospital Infection, no. 66, pp. 6–14, 2007.
8. J. Boyce, T. Cooper, and M. Dolan, “Evaluation of an electronic device for real-time measurement of alcohol-based hand rub use,” Infection Control and Hospital Epidemiology [...]
[...] Infection Control and Hospital Epidemiology, vol. 33, no. 7, pp. 689–695, Jul. 2012, [PMID: 22669230].
11. D. Sharma, G. Thomas, E. Foster, J. Iacovelli, K. Lea, J. Streit, and P. Polgreen, “The precision of human-generated hand-hygiene observations: a comparison of human observation with an automated monitoring system,” Infection Control and Hospital Epidemiology, vol. 33, no. 12, pp. 1259–1261, December 2012.
12. T. Eckmanns, J. Bessert, M. Behnke, P. Gastmeier, and H. Ruden, “Compliance with antiseptic hand rub use in intensive care units: The Hawthorne effect,” Infection Control and Hospital Epidemiology, no. 27, pp. 931–934, 2006.
13. M. Monsalve, S. Pemmaraju, G. Thomas, T. Herman, A. M. Segre, and P. Polgreen, “Do peer effects improve hand hygiene adherence among healthcare workers?” Infection Control and Hospital Epidemiology, vol. 35, no. 10, pp. 1277–1285, October 2014.
14. V. Boscart, K. McGilton, A. Levchenko, G. Hufton, P. Holliday, and G. Fernie, “Acceptability of a wearable hand hygiene device with monitoring capabilities,” Journal of Hospital Infection, vol. 70, no. 3, pp. 216–222, November 2008.
15. A. Venkatesh, M. Lankford, D. Rooney, T. Blachford, C. Watts, and G. Noskin, “Use of electronic alerts to enhance hand hygiene compliance and decrease transmission of vancomycin-resistant enterococcus in a hematology unit,” American Journal of Infection Control, vol. 36, no. 3, pp. 199–205, April 2008.
16. P. M. Polgreen, C. S. Hlady, M. A. Severson, A. M. Segre, and T. Herman, “Method for automated monitoring of hand hygiene adherence without radio-frequency identification,” Infection Control and Hospital Epidemiology, vol. 31, no. 12, pp. 1294–1297, 2010.
17. M. T. Lash, J. Slater, P. M. Polgreen, and A. M. Segre, “A large-scale exploration of factors affecting hand hygiene compliance using linear predictive models,” in Healthcare Informatics, 2017 IEEE International Conference on (ICHI), 2017, pp. 66–73. [Online]. Available: http://ieeexplore.ieee.org/document/8031133/
18. H. Dai, K. L. Milkman, D. A. Hofmann, and B. R. Staats, “The Impact of Time at Work and Time Off from Work on Rule Compliance: The Case of Hand Hygiene in Healthcare,” Journal of Applied Psychology, vol. 100, no. 3, pp. 846–862, 2014. [Online]. Available: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2423009
19. C. Jarrin Tejada and G. Bearman, “Hand Hygiene Compliance Monitoring: the State of the Art,” Current Infectious Disease Reports, vol. 17, no. 4, 2015. [Online]. Available: http://link.springer.com/10.1007/s11908-015-0470-0
20. E. Kalnay, M. Kanamitsu, R. Kistler, W. Collins, D. Deaven, L. Gandin, S. Iredell, S. Saha, G. White, Y. Zhu, A. Leetmaa, R. Reynolds, M. Chelliah, W. Ebisuzaki, W. Higgins, J. Janowiak, K. Mo, C. Ropelewski, J. Wang, R. Jenne, and D. Joseph, “The NCEP/NCAR 40-Year Reanalysis Project,” pp. 437–471, 1996. [Online]. Available:
21. N. R. Draper, H. Smith, and E. Pownell, Applied Regression Analysis. Wiley New York, 1966, vol. 3.
22. J. R. Quinlan, “Learning with continuous classes,” in , vol. 92, 1992, pp. 343–348.
23. F. D. Johansson, U. Shalit, and D. Sontag, “Learning representations for counterfactual inference,” in , 2016.
24. R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 267–288, 1996.
25. H. Zou and T. Hastie, “Regularization and variable selection via the elastic net,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 67, no. 2, pp. 301–320, 2005.
26. M. Robnik-Šikonja and I. Kononenko, “An adaptation of Relief for attribute estimation in regression,” in Machine Learning: Proceedings of the Fourteenth International Conference (ICML97), 1997, pp. 296–304.
27. K. Kira and L. A. Rendell, “A practical approach to feature selection,” in Proceedings of the ninth international workshop on Machine learning, 1992, pp. 249–256.
28. R. Williams et al., “Using the margins command to estimate and interpret adjusted predictions and marginal effects,” The Stata Journal, vol. 12, no. 2, p. 308, 2012.
29. M. T. Lash, Q. Lin, W. N. Street, J. G. Robinson, and J. Ohlmann, “Generalized inverse classification,” in Proceedings of the 2017 SIAM International Conference on Data Mining (SDM’17), 2017, pp. 162–170. [Online]. Available: https://doi.org/10.1137/1.9781611974973.19
30. M. T. Lash, Q. Lin, W. N. Street, and J. Robinson, “A budget constrained inverse classification framework for smooth classifiers,” in