The Role of Peer Influence in Churn in Wireless Networks
Qiwei Han *† [email protected]
Pedro Ferreira *‡ [email protected]
* Department of Engineering and Public Policy, Carnegie Mellon University, Pittsburgh, PA 15213, United States
‡ Heinz College, Carnegie Mellon University, Pittsburgh, PA 15213, United States
† Instituto Superior Técnico, University of Lisbon, Lisbon, 1049-001, Portugal
ABSTRACT
Subscriber churn remains a top challenge for wireless carriers. These carriers need to understand the determinants of churn to confidently apply effective retention strategies that ensure their profitability and growth. In this paper, we look at the effect of peer influence on churn and we try to disentangle it from other effects that drive simultaneous churn across friends but that do not relate to peer influence. We analyze a random sample of roughly 10 thousand subscribers from a large dataset from a major wireless carrier over a period of 10 months. We apply survival models and generalized propensity score to identify the role of peer influence. We show that the propensity to churn increases when friends do and that it increases more when many strong friends churn. Therefore, our results suggest that churn managers should consider strategies aimed at preventing group churn. We also show that survival models fail to disentangle homophily from peer influence, over-estimating the effect of peer influence.
1. INTRODUCTION
Over the past two decades, wireless telecommunications markets experienced rapid change and strong technological growth from 2G to 4G. According to [16], 6.8 billion mobile subscriptions worldwide saturate the wireless market. In parallel, deregulation opened up markets to multiple entrants, supporting both competition and technological innovation. Consequently, carriers need to invest heavily in upgrading their networks to provide quality communications and novel services as well as to ensure that they healthily profit from existing subscribers. However, subscribers have many providers to choose from and can ever more easily transfer from one provider to another as more information about products and services abounds and switching costs fall [5].

Churn rates measure the proportion of subscribers discontinuing service during a certain period of time. As reported by wireless carriers across the world, average monthly churn rates vary between 1.5% and 5% [33, 17, 8]. In other words, wireless carriers can lose 20% of their subscriber base every year, which poses significant challenges for profitability and growth. Subscriber churn may represent significant economic loss. This loss can be estimated by multiplying the average cost to acquire a new subscriber by the number of subscribers that churn. With average acquisition costs reaching $600 per subscriber, churn may cost the wireless industry billions of dollars every year [3]. On the other hand, keeping an existing subscriber is generally much cheaper and easier. [33] showed that acquiring a new subscriber can be five times harder than retaining an existing one because the latter is less sensitive to both price and sales referrals. Meanwhile, improving subscriber retention can help wireless carriers increase margins [8]. Therefore, effective subscriber churn management becomes a priority for many telecom managers who want to ensure the sustainable growth of their companies.

Wireless carriers aim at controlling churn through active subscriber retention campaigns. For this purpose, they proactively identify subscribers with high propensity to churn, evaluate the underlying reasons for churn and devise strategies to prevent it. The perplexing nature of churn, however, makes it very difficult to explain and address churn in an efficient and comprehensive manner. Subscribers may churn for many different reasons: they may be unsatisfied with the quality of service; they may be attracted by competitors that offer lower prices; they may decide to acquire a new handset or service that is not currently provided by their carrier. Thus, wireless carriers can hardly provide one single solution to prevent all potential churners from leaving. Therefore, understanding the complexity of the churn problem and disentangling the role of the several factors that can trigger it is fundamental to designing sound retention strategies.

In this paper, we look at one such complexity associated with churn. We study the effect of peer influence on churn. If churn is contagious then churn can snowball quickly, leading many subscribers to leave the carrier, especially when social networks are dense, as they locally tend to be in the case of wireless services. We apply survival analysis and generalized propensity score to separate peer influence from homophily and measure the effect of the former. We perform our empirical analysis on a massive dataset from a major European wireless carrier that shared call detail records (CDRs) with us. We show how churn increases with friends' churn, a result suggesting that churn managers should prevent group churn instead of looking at churn on an individual basis.
2. RELATED WORKS
2.1 Customer Lifetime Value (CLV)
Many researchers have proposed that looking at the value of existing subscribers is essential because investing in all subscribers will be inefficient [14]. [28] introduced the CLV model in the wireless industry to estimate the effect of retention campaigns based on CLV. CLV models help understand subscriber value and provide wireless carriers with insights to build cost-effective retention strategies targeted at subscribers with high CLV and low loyalty. [11] showed that the profit of the wireless carrier is a function of the total subscriber lifetime value. In their theoretical CLV model for the wireless telecommunication industry, the probability that a subscriber churns is treated as a parameter in the CLV function used to determine how long the subscriber will stay in the network generating future cash flows. Another generalized CLV model proposed by [9] identified churners as subscribers with decreasing CLV. Therefore, the probability of churn is central to the notion of CLV and thus understanding churn is paramount to correctly measuring CLV.

Today, wireless carriers gather a wealth of data deemed useful to perform churn analysis. Numerous data mining techniques have been applied to transform these raw data into useful knowledge. [13] described the general framework for this purpose: (i) identify discriminatory features that can differentiate a subscriber with high risk of churn from other loyal subscribers; (ii) extract data from identified features; (iii) select the appropriate data mining techniques to build descriptive or predictive models; (iv) evaluate the performance of these models according to specified criteria, e.g. lift curves. Extensive research has been done on churn prediction in wireless networks. Refer to [25] and [32] for comprehensive reviews on the application of different algorithms to churn prediction and prevention.

Three disadvantages of pure data mining techniques are worth noting though. First, although many models and algorithms seem to provide satisfactory accuracy in identifying churners, the results obtained depend not only on the method but also on the data used and the features considered by researchers. For example, both [24] and [14] used logistic regressions and neural networks to predict churn. The former found that neural networks outperformed logistic regression. However, the latter concluded otherwise. Second, many data mining algorithms are like a "black box", which lacks interpretability and precludes us from understanding the true determinants of subscriber churn. As a result, an agent in a call center might be asked to call a certain subscriber because she is likely to churn. However, very little might have been said to this agent about the underlying reasons that may lead the subscriber to churn, which clearly complicates the interaction with her. Third, a number of statistically based benchmarking measures of performance for data mining algorithms do not directly yield optimal results in terms of profit maximization from the practitioners' perspective. [31] proposed a profit-centric performance criterion focusing on the fraction of subscribers that generate the most profit and showed that this approach yielded outcomes different from the ones resting on the best approaches as evaluated by statistically based performance measures.
The effect of social influence on subscriber churn in wireless networks has received much attention in recent times. [6] tried to learn whether the propensity of a subscriber to churn depends on the number of friends that have already churned. This hypothesis is based on the premise that a few key individuals may lead to strong "word-of-mouth" effects. These individuals may influence their friends to churn, who in turn spread the message to others. So they identified likely churners as those subscribers whose friends have already churned, using a spreading activation-based technique. A set of churners iteratively diffuses the message to other subscribers. Then a subscriber churns once the accumulated level of influence reaches a certain threshold. [7] used Markov Logic Networks and propositionalization to develop a predictive model for churn. They also confirmed that "word-of-mouth" has a significant impact on subscribers' churn decisions. [26] demonstrated that integrating social factors such as influence from churners into machine learning models can enhance prediction performance. However, correlation in the behavior among people who share social ties can be explained by both peer influence and their inherent similarities, that is, homophily [22]. Work that identifies contagious churn, separating it from confounding effects such as homophily (friends tend to exhibit similar behavior), is still limited. The contribution of our paper is to analyze contagious churn while avoiding misattributing homophily to contagion, or vice versa, which typically leads to overestimating contagion.

Along these lines, [21] provides a study close to ours, which uses the same dataset to identify peer influence in the adoption of the iPhone 3G. In their case, they show that peer influence led to roughly 14% of the observed iPhone adoption during the first year after the introduction of this handset in the market. A number of other studies on the identification of peer influence in the presence of confounding effects in other networked contexts have been proposed [1, 4, 19, 20, 30]. For example, [2] used dynamic propensity score matching (PSM) to estimate the effect of contagion in the adoption of an online service by analyzing a community of instant messenger users. Their findings suggest that homophily accounted for much of the adoption previously perceived as peer influence. However, they dichotomize the treatment to use PSM and thus are unable to explore the heterogeneity in treatment levels.
3. DATA AND DESCRIPTIVE STATISTICS
We partnered with a major European wireless carrier, hereinafter called EURMO, which gave us access to its Call Detail Records (CDRs) between August 2008 and May 2009. For each call we know the caller and the callee, the duration and time of the call and the id of the cell tower used to route the call. Subscribers are identified by their anonymized phone number. For each subscriber, we know their provider and tariff plan at all times. There are roughly 4 million EURMO subscribers in our dataset.

Understanding subscriber churn with prepaid plans is quite different from working with postpaid subscribers. First, we have very limited socio-demographic information on prepaid subscribers. Second, the usage pattern of prepaid subscribers is more irregular than that of postpaid subscribers. Third, prepaid subscribers churn by ceasing usage whereas postpaid subscribers explicitly inform the carrier when they intend to do so. We use the industry standard, which is also followed by EURMO, and assume that a prepaid subscriber churns if she places no calls for three consecutive months.

                                                                   Full sample        Churner            Non-churner
                                                                   (n=8,345)          (n=1,191)          (n=7,154)
Covariate          Description                                     Mean     Std       Mean     Std       Mean     Std       t-stat
                                                                   [1]      [2]       [3]      [4]       [5]      [6]       [7]
Monthly usage
n_calls_out        Calls made                                      18.99    29.74     6.51     12.80     21.07    30.89     p < .
n_calls_in         Calls received                                  22.96    29.45     7.84     16.44     25.48    30.70     p < .
airtime_out        Airtime made                                    44.87    116.44    13.55    40.87     50.08    123.88    p < .
airtime_in         Airtime received                                55.54    116.63    17.71    50.15     61.83    123.17    p < .
expenditure        Expenditure                                     31.78    53.36     25.33    55.31     32.85    52.95     p < .
Structural properties
frd                Number of friends                               8.40     9.16      5.76     8.91      8.84     9.13      p < .
calls_out_other    Ratio of calls made to other networks           0.24     0.23      0.27     0.27      0.24     0.23      p < .
calls_in_other     Ratio of calls received from other networks     0.25     0.23      0.26     0.26      0.25     0.22      p < .
tenure             Time with EURMO                                 48.10    39.32     19.91    25.66     52.79    39.24     p < .
Churner friends
1-call_frd_churn   Number of 1-call churner friends                1.13     2.23      0.81     2.05      1.18     2.25      p < .
3-call_frd_churn   Number of 3-call churner friends                0.56     1.25      0.40     1.15      0.59     1.26      p < .
5-call_frd_churn   Number of 5-call churner friends                0.20     0.60      0.14     0.52      0.21     0.61      p < .

Table 1: List of covariates extracted from the EURMO network. Descriptive statistics are reported for our random sample (columns [1] and [2]), churners (columns [3] and [4]) and non-churners (columns [5] and [6]), respectively. Column [7] tests the hypothesis that the means of churners and non-churners are similar.
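The prepaid churn rule stated above (no calls placed for three consecutive months) can be sketched in a few lines. This is only an illustration, not the carrier's or the authors' implementation: the function name `churn_month`, the month indexing from 1 and the choice to date churn at the first silent month are all assumptions made for the example.

```python
from typing import Dict, Optional


def churn_month(calls_placed: Dict[int, int], last_month: int,
                window: int = 3) -> Optional[int]:
    """Return the first month of the first run of `window` consecutive
    months in which the subscriber placed no calls, or None if no such
    run exists within months 1..last_month.

    `calls_placed` maps a month index (1-based, hypothetical layout) to
    the number of calls placed in that month; missing months count as 0.
    Dating churn at the first silent month is an assumption.
    """
    silent_run = 0
    for month in range(1, last_month + 1):
        if calls_placed.get(month, 0) == 0:
            silent_run += 1
            if silent_run == window:
                return month - window + 1
        else:
            silent_run = 0  # any call resets the silent streak
    return None


# A subscriber who stops calling after month 2 of a 10-month panel.
stopped = {1: 4, 2: 1}
# A subscriber with a two-month gap only, which does not count as churn.
paused = {1: 4, 4: 2, 5: 1, 6: 3, 7: 1, 8: 2, 9: 1, 10: 1}
print(churn_month(stopped, last_month=10), churn_month(paused, last_month=10))
```

Because the rule needs three months of silence to confirm churn, the last months of an observation window cannot produce new churn events, which is why a 10-month panel yields fewer usable periods.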
We use a random sample of 10 thousand subscribers. Two subscribers are called friends if they exchange at least one call in each and every calendar month. We trim from our random sample subscribers with very high degree, which are likely to represent customer service and PBX machines, and subscribers with no degree (some subscribers purchase a SIM card but never use it to make calls). We are left with 8,345 subscribers in our sample. We observe network dynamics over time, namely new subscribers join EURMO and existing subscribers leave EURMO every month. Moreover, subscribers call and/or text different friends over time. We aggregate individual subscriber usage and structural properties at the monthly level in our analysis. Table 1 shows descriptive statistics for the covariates used in our study. Over the period of analysis, the subscribers in our sample placed 3.75 million calls and 1,191 of them churned, which amounts to an average monthly churn rate of about 2%. We denote by n-call_frd_churn the friends who churn that exchange at least n calls with the ego in the same calendar month. We find that 343 out of 1,191 (29%) churners and 3,197 out of 7,154 (45%) non-churners have at least one friend that churned during the period of analysis. Table 1 shows that on average subscribers see 1.13 friends churn, and 0.56 and 0.20 friends churn that exchange at least 3 and 5 calls, respectively, with the ego in the same calendar month.
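As a rough illustration of the covariate described above, the following sketch counts the churner friends of an ego who exchanged at least n calls with the ego in a given calendar month. The record layout `(caller, callee, month)`, the function name and the decision to add calls in both directions when counting "exchanged" calls are assumptions made for this example, not details taken from the EURMO data.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Hypothetical simplified CDR record: (caller, callee, calendar month index).
Call = Tuple[str, str, int]


def n_call_churner_friends(calls: Iterable[Call], churners: Set[str],
                           ego: str, month: int, n: int) -> int:
    """Count churners who exchanged at least `n` calls with `ego` in `month`.

    Calls in both directions are added together (an assumption).
    """
    exchanged: Counter = Counter()
    for caller, callee, call_month in calls:
        if call_month != month or caller == callee:
            continue
        if caller == ego:
            exchanged[callee] += 1
        elif callee == ego:
            exchanged[caller] += 1
    return sum(1 for peer, count in exchanged.items()
               if count >= n and peer in churners)


# Toy data: "b" and "c" eventually churn, "d" does not.
toy_calls = ([("a", "b", 1)] * 3 + [("c", "a", 1)]
             + [("a", "d", 1)] * 5 + [("a", "b", 2)])
toy_churners = {"b", "c"}
for threshold in (1, 3, 5):
    print(threshold, n_call_churner_friends(toy_calls, toy_churners, "a", 1, threshold))
```

Raising the threshold n keeps only the stronger ties, which is how the 1-call, 3-call and 5-call variants separate weak from strong friends.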
4. CHURN DYNAMICS
The panel structure of our data can be analyzed using survival analysis to determine the impact of time-varying covariates on churn [18]. Survival analysis also allows for controlling for some unobserved individual-level heterogeneity, i.e., some subscribers are more prone to churn for reasons that are not captured in our data (e.g. marketing campaigns by competitors). We employ a Cox Proportional Hazard (PH) model with frailty to estimate the churn hazard rate as:

$$ h(t, x_i, x_i^{(t-1)}) = \alpha_i \, h_0(t) \exp[\beta x_i + \delta x_i^{(t-1)}] $$

where $h(t, x_i, x_i^{(t-1)})$ is the churn hazard for subscriber $i$ at time $t$, $\alpha_i$ is the Gamma distributed frailty that represents the individual-level random effect, $h_0(t)$ is the non-parametric baseline hazard function, and $x_i$ and $x_i^{(t-1)}$ include, respectively, time-independent covariates and time-varying covariates, such as usage and friends' churn. We introduce a time lag on the latter covariates because, similarly to epidemiology cases, there is an induction and latency period between a friend's churn and the ego's churn. Moreover, using lagged covariates obviates simultaneity problems with our estimation. Finally, we also introduce monthly dummies to control for seasonal effects on churn, such as promotions offered by competitors over the summer or during Christmas.

We count the number of friends who churn that exchange at least n calls with the ego, as explained in the previous section. We vary n in {1, 3, 5} and use frd_churn to denote the covariate of interest. Table 2 shows the results obtained. The coefficients on frd_churn are statistically significant across all model specifications. Note that coefficients for n = 5 are larger but this can only mildly hint at the fact that churn from stronger friends is more relevant for the ego because standard errors are high. The other covariates behave as expected; in particular, higher expenditure leads to more churn. Yet, we recall that our goal in this paper is not to develop a predictive model of churn. Instead, we are interested in identifying the role of peer influence on churn. The additional covariates in our study are not necessarily used to predict churn. Instead, they are used to characterize the behavior of consumers with respect to how they use cellphone service so that we can, in the next section, compare similar users to try to isolate the effect of peer influence.

Table 2: Parameter estimates for Cox PH frailty models on frd_churn at three thresholds: 1-call (columns [1] and [4]), 3-call (columns [2] and [5]) and 5-call (columns [3] and [6]).
Column order: [1] 1-call, [2] 3-call, [3] 5-call, [4] 1-call, [5] 3-call, [6] 5-call.
Influence factor
  frd_churn: 0.375*** (0.104), *** (0.0787), *** (0.156), *** (0.0978), *** (0.120), *** (0.167)
Usage metrics
  n_calls: -0.00646*** (0.00159), -0.00621*** (0.00134), -0.00625*** (0.00148)
  n_calls: *** (0.0000021), *** (0.0000018), *** (0.0000019)
  airtime: -0.000829** (0.000382), -0.000801*** (0.000297), -0.000819* (0.000440)
  airtime:
  expenditure: *** (0.00159), *** (0.00127), *** (0.00150), *** (0.00158), *** (0.00120), *** (0.00184)
Structural metric
  frd: -0.0746*** (0.0161), -0.0659*** (0.0143), -0.0671*** (0.0151), -0.0898*** (0.0150), -0.0803*** (0.0115), -0.0827*** (0.0184)
month dummy: Yes in all six models
Observations: 51,488 in all six models
*** p < . ** p < . * p < .

These results show that the more friends churn the more likely the ego will churn. For example, the coefficient of 0.375 for the first model shown in this table indicates that if one more friend churns then the ego's hazard of churn is multiplied by exp(0.375) ≈ 1.45. Across the six Cox PH models in this table, the coefficients estimated for frd_churn imply that an additional friend churning increases the hazard of churn by at least 24%. Figure 1 shows how the relative hazard increases with the number of friends that churn.
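To make the reading of a Cox coefficient concrete, the sketch below converts a coefficient on frd_churn into a relative hazard, holding the frailty term and all other covariates fixed. Only the value 0.375 comes from the text; the function names are illustrative.

```python
import math


def relative_hazard(beta: float, extra_friends: int = 1) -> float:
    """Hazard ratio implied by a Cox PH coefficient `beta` when the
    covariate increases by `extra_friends`, everything else held fixed."""
    return math.exp(beta * extra_friends)


def percent_increase(beta: float, extra_friends: int = 1) -> float:
    """The same hazard ratio expressed as a percentage increase."""
    return 100.0 * (relative_hazard(beta, extra_friends) - 1.0)


# Coefficient reported in the text for the first model (1-call friends).
beta_model_1 = 0.375
for k in (1, 2, 3):
    print(k, round(relative_hazard(beta_model_1, k), 3))
```

Because the effect is multiplicative on the hazard, each further churning friend multiplies the hazard by the same factor, so the relative hazard grows exponentially rather than linearly in the number of churning friends.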
Figure 1: Simulated average churn hazards for models 1-3 in Table 2. Ribbons represent 95% confidence intervals. 10,000 simulations were run per value of frd_churn.
5. EFFECT OF FRIENDS' CHURN
Propensity Score Matching (PSM) is a widely applied method to evaluate the effect of treatments on outcomes of interest in observational studies [27, 23]. In particular, when the assignment of treatment is not random, selection bias arises because the characteristics of treated and untreated units can differ. The propensity score, defined as the probability of receiving treatment conditional on observed confounding covariates that correlate with both the outcome and treatment, summarizes all the relevant information available to the researcher into a single scalar value. Conditioning on this value, the distributions of the observed characteristics across treated and untreated units become more similar and, therefore, unlikely to drive differences in outcomes. Hence, PSM helps reduce bias. However, PSM still fails to provide full causal interpretations because it does not control for unobserved effects [29]. In any case, when randomized experiments are unavailable, using PSM increases our confidence when reporting effects, in particular when one controls for the most important characteristics of the units under analysis.

Extensions of PSM to cases with continuous treatments [15, 12] have been proposed in the literature under the umbrella of Generalized Propensity Score (GPS). GPS provides a dose-response function (DRF) that measures the relationship between the outcome of interest and the intensity of treatment. In our case, friends' churn (the treatment) is not binary but rather an integer. Egos can be subject to different amounts of treatment as they see more or fewer friends churn. Different treatment intensities can have different effects on the ego's churn (the outcome). Therefore, we use GPS to explore our case of contagious churn in more detail.

Details of the mechanics behind GPS can be found in [12]. For the sake of space, we only note the following modeling options: i) the distribution of friends' churn is far from normal, therefore we followed a parametric solution, as proposed in [10], and allow the intensity of treatment to be skewed towards zero, using an exponential functional form whose parameters are estimated by maximum likelihood; ii) we use a polynomial approximation of order two to regress outcomes on treatments and propensity scores, from which we compute the average conditional expectation for the effect of treatment. This polynomial approximation of degree two, with an interaction term between treatment intensity and propensity score, allows for a better understanding of the non-linear effects of treatment than linear regression.
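The two modeling options can be written out as follows. This is a sketch in the spirit of the standard GPS recipe in [12], not the exact specification estimated in the paper: the notation ($T_i$ for the number of friends of ego $i$ that churn, $X_i$ for the covariates, $Y_i$ for the ego's churn, $\hat{R}_i$ for the estimated score) and the covariate-dependent rate $\lambda(x)$ of the exponential density are assumptions introduced here for illustration.

```latex
% Assumed notation, not taken from the paper:
%   T_i = treatment intensity, X_i = covariates, Y_i = outcome,
%   lambda(x) = rate of the exponential treatment density (estimated by ML).
\begin{align}
  r(t, x) &= \lambda(x)\, e^{-\lambda(x)\, t},
  \qquad \hat{R}_i = \hat{r}(T_i, X_i), \\
  \mathbb{E}\big[ Y_i \mid T_i, \hat{R}_i \big]
    &= \alpha_0 + \alpha_1 T_i + \alpha_2 T_i^2
     + \alpha_3 \hat{R}_i + \alpha_4 \hat{R}_i^2
     + \alpha_5 T_i \hat{R}_i, \\
  \hat{\mu}(t)
    &= \frac{1}{N} \sum_{i=1}^{N} \Big[
       \hat{\alpha}_0 + \hat{\alpha}_1 t + \hat{\alpha}_2 t^2
     + \hat{\alpha}_3\, \hat{r}(t, X_i) + \hat{\alpha}_4\, \hat{r}(t, X_i)^2
     + \hat{\alpha}_5\, t\, \hat{r}(t, X_i) \Big].
\end{align}
```

The first line is the score, the second is the degree-two outcome regression with the treatment-score interaction, and the third averages the fitted surface over all subscribers at a fixed treatment level $t$ to obtain the dose-response function. The individual $\alpha$ coefficients carry no direct interpretation; only $\hat{\mu}(t)$ does.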
Our dataset spans 10 months of data. However, we need to observe subscribers for 3 months to determine whether they churn. So, in fact, we are limited to a panel with 7 time periods. As discussed earlier, contagious churn is studied after the exposure to friends' churn during an induction and latency period. Therefore, we split the period of analysis into two intervals: i) the Treatment Exposure Period (TEP), during which egos observe, and count, their friends churning, represented by frd_churn; ii) the Post-Treatment Period (PTP), during which we observe whether the ego churns. There are several options to define TEP and PTP. However, we need each of these intervals to be sufficiently large. TEP needs to be sufficiently large so that we can observe treatment and, in particular, several intensities of treatment. PTP also needs to be sufficiently large so that we can observe outcomes and, in particular, outcomes triggered by treatment. Therefore, natural choices for TEP and PTP are to include months {1, 2, 3, 4} in the former and months {5, 6, 7} in the latter. Another option is to include months {1, 2, 3} in TEP and months {4, 5, 6, 7} in PTP. Below we show results using the former definitions. Results for the latter are qualitatively similar and are available upon request.

We consider 3 levels of treatment intensity: i) T1: at most one friend churns; ii) T2: 2 or 3 friends churn; iii) T3: more than 3 friends churn. We compute a propensity score for each subscriber and for each treatment intensity. Then we test whether the conditional means of the subscribers' covariates given the propensity score are different for subscribers with each treatment intensity and subscribers with other treatment intensities.
If the latter are similar then subscribers with different treatment intensities are not different in other aspects of their behavior, which allows us to better associate changes in the outcome (the ego's churn) with changes in the treatment (friends' churn).

The key for any GPS analysis to identify believable effects is to control for the important covariates. Given the richness of our dataset, we are fortunate to observe covariates that, arguably, capture most of what is important to control for to study contagious churn. First, we control for tenure with EURMO, thus making sure that we take the life cycle of the subscriber within EURMO into account. Otherwise, it would be unreasonable to compare new consumers to old consumers as the latter typically have developed a different level of trust with EURMO, enjoy different prices and, most importantly for the issue of churn, experience different switching costs because their lock-in periods are more likely to have expired. Second, we control for the number of calls placed and for the percentage of calls to other networks. These covariates together capture well the level of cell usage. Otherwise, it would be unreasonable to compare subscribers with low usage to subscribers with high usage as their engagement with cell phone service is likely very different.

Third, we control for expenditure. While this covariate is highly correlated with usage, controlling for expenditure allows for reducing the selection bias introduced by having subscribers choose their tariff plan. Indeed, consumers with different levels of income, and different tariff plans, can pay different amounts of money for the same level of call usage. Such differences could, therefore, be attributed to usage of services other than calling, such as data.
Ensuring that both number of calls and expenditure are similar across subscribers allows us to be more confident that we are comparing subscribers that use their cell phone similarly, even for services whose usage we do not observe in our dataset. Finally, we control for the number of friends to make sure that we compare subscribers with similar social circles. Otherwise, it would be unreasonable to compare subscribers that can receive many signals (friends churning) from many friends with subscribers with very few friends, who can only obtain very limited signals.

We generate results for n-call_frd_churn for n = 1, 3, 5. Table 3 shows how the adjustment by conditioning on the propensity score balances these covariates for the case of n = 1. We can see that most covariates are different before adjustment but become statistically similar at the 5% level after adjustment. Balancing for other values of n and for (TEP, PTP) = ({1, 2, 3}, {4, 5, 6, 7}) yields qualitatively similar results, that is, biases are significantly reduced after adjustment.

Table 3: Balance in subscribers' covariates for 1-call_frd_churn introduced by conditioning on the generalized propensity score. Columns report treatment intensities [T1], [T2] and [T3] before adjustment, followed by [T1], [T2] and [T3] after adjustment; rows report n_calls, expenditure, frd, call_other and tenure. Bold values indicate covariates that are different (at the 5% level) between that treatment intensity and the other treatment intensities. Surviving entries, in reading order: call_other: -0.98, -1.26, 0.84, -0.36, -0.83; tenure: -1.15, -2.04.

We use GPS to estimate the conditional mean for the effect of treatment, friends' churn, on the outcome, the ego's churn. We do not report the regression results here, for the sake of space and because the estimated coefficients have no direct meaning [12]. Still, we notice that the coefficients from regressing outcomes on our second degree polynomial on treatment intensities and propensity scores are all statistically different from zero at least at the 10% level. We estimate the dose-response function and derive the marginal treatment effect relative to having no friends that churn. We report marginal treatment effects up to five friends churning, which covers 99.9% of the observations in our dataset.

Figure 2 shows the results obtained for n = 1, 3, 5. We observe that having more friends churn increases the likelihood of churn for any n considered in our analysis. This provides evidence of peer influence in churn in wireless networks. Furthermore, we observe that when considering the churn from 3-call and 5-call friends the marginal effect of treatment, that is, the effect of having one more of these friends churn, increases with the number of friends that churn. This provides some evidence that churn from stronger friends might be more important. In particular, with GPS, we see that for treatment intensity T3 (that is, more than 3 friends churn), the 95% confidence interval for 1-call no longer overlaps with the 95% confidence interval for 5-call, which allows us to claim that 5-call friends' churn has a larger effect than 1-call friends' churn when enough friends churn. This is a sensible result showing that enough strong friends churning makes a significant difference to the ego's probability of churn.

Finally, note that the marginal effect from the GPS analysis is not directly comparable to the relative hazard from the survival analysis. Yet, we can see that the effect of one additional friend churning is at most 10% in the GPS results. This means that homophily is likely to play a significant role in the correlation of churn across friends that the survival analysis confounds with peer influence. GPS reduces the bias in estimating the effect of peer influence by comparing across similar subscribers. This takes away the effect of all the homophily captured by the covariates that we control for in computing the propensity score.

Figure 2: Estimated average marginal treatment effect with TEP: {1, 2, 3, 4} and PTP: {5, 6, 7} relative to having no friends churning. Ribbons represent the 95% confidence intervals. Standard errors are obtained via bootstrapping (100 repetitions).
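Two bookkeeping steps in this section lend themselves to a short sketch: mapping a count of churning friends to the treatment intensities T1 to T3, and expressing a dose-response function as effects relative to having no friends that churn. The function names are hypothetical, the linear dose-response used in the example is made up purely for illustration, and reading the "marginal treatment effect relative to having no friends that churn" as the difference between the dose-response at k friends and at zero friends is an interpretation rather than something the text states explicitly.

```python
from typing import Callable, Dict


def treatment_level(friends_churned: int) -> str:
    """Map a count of churning friends to the three treatment intensities:
    T1 (at most one), T2 (two or three), T3 (more than three)."""
    if friends_churned < 0:
        raise ValueError("friends_churned must be non-negative")
    if friends_churned <= 1:
        return "T1"
    if friends_churned <= 3:
        return "T2"
    return "T3"


def effects_relative_to_zero(dose_response: Callable[[int], float],
                             max_friends: int = 5) -> Dict[int, float]:
    """Difference between the dose-response at k friends churning and at
    zero friends churning, for k = 1..max_friends (assumed definition)."""
    baseline = dose_response(0)
    return {k: dose_response(k) - baseline for k in range(1, max_friends + 1)}


def toy_dose_response(friends_churned: int) -> float:
    """Made-up dose-response curve, for illustration only."""
    return 0.10 + 0.02 * friends_churned


print([treatment_level(k) for k in range(6)])
print(effects_relative_to_zero(toy_dose_response))
```

In an actual analysis the dose-response passed in would be the estimated function obtained from the GPS regression, and the uncertainty bands would come from re-estimating it on bootstrap resamples.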
6. CONCLUSIONS
Retaining existing subscribers is of vital importance for wireless carriers to survive in today's dynamic and competitive mobile market. As such, understanding the determinants of churn becomes a priority. Carriers need to confidently identify potential churners to apply appropriate retention strategies aimed at reducing subscriber loss. However, the perplexing and evolving nature of churn still poses significant challenges to churn managers. In this paper, we look at one of such complexities. We examine the effect of peer influence on churn. We do so empirically using a real world dataset of cell phone activity, from which we extract a set of covariates measuring cell phone usage, tenure with the carrier and network structural properties such as number of friends. We analyze a random sample of roughly 10 thousand subscribers. For each subscriber in our sample we define the number of friends that churn with whom the subscriber exchanges 1, 3 or 5 calls in the same calendar month. These different definitions allow for estimating the effect of churn from strong vs. weak friends.

In a preliminary analysis, we fit survival models to our data to correlate the cumulative number of friends that churned up to the previous month and the ego's likelihood of churn in the current month. An additional friend that churns increases the hazard of churn by at least 24%. In a deeper analysis, we use Generalized Propensity Score (GPS) matching to help reduce the potential selection bias across subscribers in our sample and estimate the effect of contagion. With GPS we can compare subscribers that are similar in a number of relevant covariates and differ only in the intensity of their treatment. The latter is the number of friends that churned. Some subscribers do not have friends that churned while others see as many as 5 friends churn over a period of 7 months. We control for the subscribers' tenure with the carrier, cell phone usage, monthly expenditure and the size of the social network. These are the relevant covariates to capture the subscribers' behavior with respect to cell phone service and choice of provider. In this framework, we associate differences in the intensity of treatment with differences in the outcome of interest. In our case, the latter is the ego's propensity to churn.

We find that the ego's propensity to churn increases with the number of friends that churn and in particular with increased numbers of strong friends churning. With GPS, we find that the marginal effect of an additional friend churning is at most 10%. This means that homophily is likely to drive a significant amount of the correlation between churn and friends' churn that a simple survival analysis is unable to disentangle. With GPS we show that the role of peer influence is still significant in wireless networks: when someone churns from our carrier, other people may do so due to peer influence. This result, which may well represent a loss of 10% in revenues for the carrier due to peer influence in churn, suggests that churn managers should consider strategies aimed at preventing group churn instead of looking at subscribers on an individual basis. Finally, we stress that our paper puts together a methodology to measure peer influence in social networks that can be used in other contexts such as learning about the effect of word-of-mouth in the dissemination of new products or services.
7. ACKNOWLEDGEMENTS
This work was partially supported by the Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) through the Carnegie Mellon Portugal Program under Grant SFRH/BD/51153/2010. We are also grateful for the comments and suggestions of three anonymous referees. We thank EURMO and Pavel Krivitsky for providing and organizing data that allowed us to perform this analysis.
8. REFERENCES
[1] A. Anagnostopoulos, R. Kumar, and M. Mahdian. Influence and correlation in social networks. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’08, pages 7–15. ACM, 2008.
[2] S. Aral, L. Muchnik, and A. Sundararajan. Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proceedings of the National Academy of Sciences, 106(51):21544–21549, 2009.
[3] A. Berson, S. Smith, and K. Thearling. Building Data Mining Applications for CRM. Enterprise Computing Series. McGraw-Hill Osborne, 2000.
[4] Y. Bramoullé, H. Djebbari, and B. Fortin. Identification of peer effects through social networks. Journal of Econometrics, 150(1):41–55, 2009.
[5] C. Daegon, P. Ferreira, and R. Telang. The impact of mobile number portability on switching costs and pricing strategy. In Proceedings of the 40th Telecommunications Policy Research Conference, TPRC ’12, 2012.
[6] K. Dasgupta, R. Singh, B. Viswanathan, D. Chakraborty, S. Mukherjea, A. A. Nanavati, and A. Joshi. Social ties and their relevance to churn in mobile telecom networks. In Proceedings of the 11th International Conference on Extending Database Technology: Advances in Database Technology, EDBT ’08, pages 668–677. ACM, 2008.
[7] T. Dierkes, M. Bichler, and R. Krishnan. Estimating the effect of word of mouth on churn and cross-buying in the mobile phone market with Markov logic networks. Decision Support Systems, 51(3):361–371, 2011.
[8] FCC. Annual report and analysis of competitive market conditions with respect to commercial mobile services. WT Docket 08-27, Federal Communications Commission, 2009.
[9] N. Glady, B. Baesens, and C. Croux. Modeling churn using customer lifetime value. European Journal of Operational Research, 197(1):402–411, 2009.
[10] B. Guardabascio and M. Ventura. Estimating the dose-response function through the GLM approach. Munich Personal RePEc Archive 45013, 2013.
[11] S. Gupta, D. Lehmann, and J. Stuart. Valuing customers. Journal of Marketing Research, 41(1):7–18, 2004.
[12] K. Hirano and G. W. Imbens. The propensity score with continuous treatments. In A. Gelman and X.-L. Meng, editors, Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives, pages 73–84. John Wiley & Sons, Ltd, 2005.
[13] S.-Y. Hung, D. C. Yen, and H.-Y. Wang. Applying data mining to telecom churn management. Expert Systems with Applications, 31(3):515–524, 2006.
[14] H. Hwang, T. Jung, and S. Euiho. An LTV model and customer segmentation based on customer value: a case study on the wireless telecommunication industry. Expert Systems with Applications, 26(2):181–188, 2004.
[15] K. Imai and D. Van Dyk. Causal inference with general treatment regimes: Generalizing the propensity score. Journal of the American Statistical Association, 99(467):854–866, 2004.
[16] ITU. The world in 2013: ICT facts and figures, 2013.
[17] H.-S. Kim and C.-H. Yoon. Determinants of subscriber churn and customer loyalty in the Korean mobile telephony market. Telecommunications Policy, 28(9):751–765, 2004.
[18] P. Krivitsky, P. Ferreira, and R. Telang. Network neighbor effects on customer churn in cell phone networks. In Proceedings of the 7th Symposium on Statistical Challenges in E-Commerce Research, SCECR ’11, 2011.
[19] T. La Fond and J. Neville. Randomization tests for distinguishing social influence and homophily effects. In Proceedings of the 19th International Conference on World Wide Web, WWW ’10, pages 601–610, New York, NY, USA, 2010. ACM.
[20] K. Lewis, M. Gonzalez, and J. Kaufman. Social selection and peer influence in an online social network. Proceedings of the National Academy of Sciences, 109(1):68–72, 2012.
[21] M. Matos, P. Ferreira, and D. Krackhardt. Peer influence in the diffusion of the iPhone 3G over a large social network. MIS Quarterly, (to appear in print), 2014.
[22] M. McPherson, L. Smith-Lovin, and J. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27:415–444, 2001.
[23] S. Morgan and C. Winship. Counterfactuals and Causal Inference: Methods and Principles for Social Research (Analytical Methods for Social Research). Cambridge University Press, New York NY, 2007.
[24] M. Mozer, R. Wolniewicz, D. Grimes, E. Johnson, and H. Kaushansky. Predicting subscriber dissatisfaction and improving retention in the wireless telecommunications industry. IEEE Transactions on Neural Networks, 11(3):690–696, 2000.
[25] E. Ngai, L. Xiu, and D. Chau. Application of data mining techniques in customer relationship management: A literature review and classification. Expert Systems with Applications, 36(2, Part 2):2592–2602, 2009.
[26] C. Phadke, H. Uzunalioglu, V. Mendiratta, D. Kushnir, and D. Doran. Prediction of subscriber churn using social network analysis. Bell Labs Technical Journal, 17(4):63–75, 2013.
[27] P. Rosenbaum and R. Rubin. The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41–55, 1983.
[28] S. Rosset, E. Neumann, U. Eick, N. Vatnik, and Y. Idan. Customer lifetime value modeling and its use for customer retention planning. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’02, pages 332–340. ACM, 2002.
[29] C. Shalizi and A. Thomas. Homophily and contagion are generically confounded in observational social network studies. Sociological Methods and Research, 40(2):211–239, 2011.
[30] C. Steglich, T. Snijders, and M. Pearson. Dynamic networks and behavior: Separating selection from influence. Sociological Methodology, 40(1):329–393, 2010.
[31] W. Verbeke, K. Dejaeger, D. Martens, J. Hur, and B. Baesens. New insights into churn prediction in the telecommunication sector: A profit driven data mining approach. European Journal of Operational Research, 218(1):211–229, 2012.
[32] W. Verbeke, D. Martens, C. Mues, and B. Baesens. Building comprehensible customer churn prediction models with advanced rule induction techniques. Expert Systems with Applications, 38(3):2354–2364, 2011.
[33] C.-P. Wei and I.-T. Chiu. Turning telecommunications call details to churn prediction: a data mining approach.