Scheduling a Cascade with Opposing Influences
arXiv [cs.GT]
MohammadTaghi Hajiaghayi, Hamid Mahini, and Anshul Sawant
University of Maryland at College Park {hajiagha,hmahini,asawant}@cs.umd.edu
Abstract.
Adoption or rejection of ideas, products, and technologies in a society is often governed by the simultaneous propagation of positive and negative influences. Consider a planner trying to introduce an idea in different parts of a society at different times. How should the planner design a schedule, given that a positive reaction to the idea in early areas raises the probability of success in later areas, whereas a flopped reaction has exactly the opposite effect? We generalize a well-known economic model that was recently used by Chierichetti, Kleinberg, and Panconesi (ACM EC'12). In this model the reaction of each area is determined by its initial preference and the reactions of earlier areas. We model the society by a graph in which each node represents a group of people with the same preferences. We consider a full propagation setting, where news and influences propagate between every two areas. We generalize previous work by studying the problem when people in different areas behave differently.

We first prove that, independent of the planner's schedule, influences help (resp., hurt) the planner to propagate her idea if it is an appealing (resp., unappealing) idea. We also study the problem of designing the optimal non-adaptive spreading strategy. In a non-adaptive spreading strategy, the schedule is fixed at the beginning and is never changed, whereas in an adaptive spreading strategy the planner decides on the next move based on the current state of the cascade. We demonstrate that it is hard to propose a non-adaptive spreading strategy in general. Nevertheless, we propose an algorithm to find the best non-adaptive spreading strategy when the probabilities of different behaviors of people in various areas are drawn i.i.d. from an unknown distribution. Then, we consider the influence propagation phenomenon when the underlying influence network can be an arbitrary graph. We show it is #P-complete to compute the expected number of adopters for a given spreading strategy. However, we design a polynomial-time algorithm for computing the expected number of adopters for a given schedule in the full propagation setting. Last but not least, we give a polynomial-time algorithm for designing an optimal adaptive spreading strategy in the full propagation setting.

Keywords: Influence Maximization, Scheduling, Spreading Strategy, Algorithm.
People's opinions are usually formed by their friends' opinions. Whenever a new concept is introduced into a society, the high correlation between people's reactions initiates an influence propagation. Under this propagation, the problem of promoting a product or an opinion depends on the problem of directing the flow of influences. As a result, a planner can develop a new idea by controlling the flow of influences in a desired way. Although there have been many attempts to understand the behavior of influence propagation in a social network, the topic is still controversial due to the lack of reliable information and the complex behavior of this phenomenon. For example, one compelling approach is "seeding", which was introduced by the seminal work of Kempe, Kleinberg, and Tardos [1] and is well studied in the literature [1,2,3]. The idea is to influence a group of people in the initial investment period and spread the desired opinion in the ultimate exploitation phase. Another approach is to use time-varying and customer-specific prices to propagate the product (see, e.g., [4,5,6]). All of these papers investigate the influence propagation problem when only positive influences spread through the network. However, in many real-world applications people are affected by both positive and negative influences, e.g., when both consenting and dissenting opinions are broadcast simultaneously.

We generalize a well-known economic model introduced by Arthur [7]. This model has been recently used by Chierichetti, Kleinberg, and Panconesi [8]. Assume an organization is going to develop a new idea in a society where the people are grouped into $n$ different areas. Each area consists of people living near each other with almost the same preferences. The planner schedules the introduction of the new idea in different areas at different times. Each area may accept or reject the original idea. Since areas vary and the effects of early decisions are amplified during the diffusion, a schedule-based strategy affects the spread of influences.
This framework closely matches various applications from economics to social science to public health, where the original idea could be a new product, a new technology, or a new belief.

Consider the spread of two opposing influences simultaneously. Both positive and adverse reactions to a single idea originate different flows of influences simultaneously. In this model, each area has an initial preference of $Y$ or $N$. The initial preference of $Y$ ($N$) means the area will accept (decline) the original idea when there are no network externalities. Let $c_i$ be a non-negative number indicating how the reaction of people in area $i$ depends on that of others. We call $c_i$ the threshold of area $i$. Assume the planner introduces the idea in area $i$ at time $s$. Let $m_Y$ and $m_N$ be the numbers of areas that accepted and rejected the idea, respectively, before time $s$. If $|m_Y - m_N| \ge c_i$, the people in area $i$ decide based on the majority of previous adopters: they adopt the idea if $m_Y - m_N \ge c_i$ and drop it if $m_N - m_Y \ge c_i$. Otherwise, if $|m_Y - m_N| < c_i$, the people in area $i$ accept or reject the idea if the initial preference of area $i$ is $Y$ or $N$, respectively. The planner does not know the exact initial preferences and has only prior knowledge about them. Formally speaking, the planner knows that the initial preference of area $i$ will be $Y$ with probability $p_i$ and $N$ with probability $1 - p_i$. We call $p_i$ the initial acceptance probability of area $i$.

We consider the problem when the planner classifies different areas into various types. The classification is based on the planner's knowledge about the reaction of people living in each area. Hence, the classification is based on different features, e.g., preferences, beliefs, education, and age, such that people in areas of the same type react almost the same to the new idea. It means all areas of the same type have the same threshold $c_i$ and the same initial acceptance probability $p_i$.
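The decision rule of a single area described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text; the function name `decide` and its parameter names are hypothetical, and the sketch assumes a threshold of at least 1 so that the two threshold tests cannot both hold.

```python
import random


def decide(m_yes: int, m_no: int, threshold: int, p_accept: float,
           rng: random.Random) -> bool:
    """Return True if the area accepts the idea, False if it rejects it.

    m_yes / m_no: numbers of earlier areas that accepted / rejected.
    Assumes threshold >= 1 (illustrative assumption, see lead-in).
    """
    lead = m_yes - m_no
    if lead >= threshold:       # enough net acceptances: follow the majority
        return True
    if -lead >= threshold:      # enough net rejections: follow the majority
        return False
    # Threshold not reached: the area follows its own initial preference,
    # which is Y with probability p_accept.
    return rng.random() < p_accept
```

For instance, with three earlier acceptances, no rejections, and threshold 2, the area accepts regardless of its own preference.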
It is worth mentioning that previous works only consider the problem when all areas have the same type, i.e., all $p_i$'s and $c_i$'s are the same [7,8]. The planner wants to manage the flow of influences, and her spreading strategy is a permutation $\pi$ over the areas. Her goal is to find a spreading strategy $\pi$ that maximizes the expected number of adopters. We consider both adaptive and non-adaptive spreading strategies in this paper. In an adaptive spreading strategy, the planner can see the results of earlier areas before making further decisions. In a non-adaptive spreading strategy, on the other hand, the planner decides on the permutation in advance. We show the effect of a spreading strategy on the number of adopters with an example in Appendix A.

We are motivated by a series of well-known studies in the economics and politics literature on modeling people's behavior [7,9,10,11]. Arthur first proposed a framework to analyze people's behavior in a scenario with two competing products [7]. In this model people choose, one after another, between two competing products. He studied the problem when people are affected by all previous customers and the planner has the same prior knowledge about people's behavior, i.e., people have the same type. He demonstrated that a cascade of influences is formed when products have positive network externalities, and that early decisions determine the ultimate outcome of the market. It has been shown that the same cascade arises when people look at earlier decisions, not because of network externalities, but because they have limited information themselves or even have bounded rationality to process all available data [9,10].

Chierichetti, Kleinberg, and Panconesi argued that when relations between people form an arbitrary network, the outcome of an influence propagation depends heavily on the order in which people make their decisions [8].
In this setting, a potential spreading strategy is an ordering of decision makers. They studied the problem of finding a spreading strategy that maximizes the expected number of adopters when people have the same type, i.e., the same threshold $c$ and the same initial acceptance probability $p$. They proved that for any $n$-node graph there is an adaptive spreading strategy with at least $O(np^c)$ adopters. They also showed that for any $n$-node graph all non-adaptive spreading strategies result in at least (resp. at most) $n/2$ adopters if the initial acceptance probability is greater (resp. less) than $1/2$. They considered the problem on an arbitrary graph when nodes have the same type. While we mainly study the problem on a complete graph when nodes have various types, we improve their result in our setting and show that the expected number of adopters for all non-adaptive spreading strategies is at least (resp. at most) $np$ if the initial acceptance probability is $p \ge 1/2$ (resp. $p \le 1/2$). We also show that the problem of designing the best spreading strategy is hard on an arbitrary graph with several types of customers: we prove it is #P-complete to compute the expected number of adopters for a given spreading strategy.

The problem of designing an appropriate marketing strategy based on network externalities has been studied extensively in the computer science literature. For example, Kempe, Kleinberg, and Tardos [1] studied the following question in their seminal work: How can we influence a group of people in an investment phase in order to propagate an idea in the exploitation phase? This question was introduced by Domingos and Richardson [12]. The answer to this question leads to a marketing strategy based on seeding. There are several papers that study the same problem from an algorithmic point of view, e.g., [2,3,13]. Hartline, Mirrokni, and Sundararajan [6] also proposed another marketing strategy, based on scheduling, for selling a product.
Their marketing strategy is a permutation $\pi$ over customers and a price $p_i$ for each customer $i$. The seller offers the product at price $p_i$ to customer $i$ at time $t$, where $t = \pi^{-1}(i)$. The goal is to find a marketing strategy that maximizes the profit of the seller. This approach is followed by several works, e.g., [4,5,14]. These papers study the behavior of an influence propagation when there is only one flow of influences in the network. In this paper, we study the problem of designing a spreading strategy when both negative and positive influences propagate simultaneously.

The propagation of competitive influences has been studied in the literature (see [15] and its references). These works studied the influence propagation problem in the presence of competing influences, i.e., when two or more competing firms try to propagate their products at the same time. We, however, study the problem of influence propagation when there exist both positive and negative reactions to the same idea. There are also studies that consider the influence propagation problem in the presence of positive and negative influences [16,17]. Che et al. [16] use a variant of the independent cascade model introduced in [1]. They model negative influences by allowing each person to flip her idea with a given probability $q$. Li et al. [17] model negative influences by negative edges in the graph. Although they study the same problem, we use different models to capture the behavior of people.

We analyze an influence propagation phenomenon where two opposing flows of influences propagate through a social network. As a result, a mistake in the selection of early areas may result in the propagation of negative influences. Therefore a good understanding of influence propagation dynamics seems necessary to analyze the properties of a spreading strategy. Unlike the previous papers, which studied the problem with just one type [7,8], we consider the scheduling problem with various types.
Also, we mainly study the problem in a full propagation setting, as it matches our motivations well. In the full propagation setting, news and influences propagate between every two areas. One can imagine how the internet, media, and electronic devices broadcast news and influences from everywhere to everywhere. In the partial propagation setting, news and influences do not necessarily propagate between every two areas. There, the society can be modeled with a graph, where there is an edge from area $i$ to area $j$ if and only if influences propagate from area $i$ to area $j$.

Our main focus is to analyze the problem when the planner chooses a non-adaptive spreading strategy. Consider an arbitrary non-adaptive spreading strategy when the initial acceptance probabilities of all areas are $p$. The expected number of adopters is exactly $np$ if all areas decide independently. We demonstrate that in the presence of network influences, the expected number of adopters is greater/less than $np$ if the initial acceptance probability $p$ is greater/less than $1/2$. These results have a bold message: influence propagation is an amplifier for an appealing idea and an attenuator for an unappealing idea.
Chierichetti, Kleinberg, and Panconesi [8] studied the problem on an arbitrary graph with only one type. They proved that the number of adopters is greater/less than $n/2$ if the initial acceptance probability $p$ is greater/less than $1/2$. Theorem 1 improves their result from $n/2$ to $np$ in our setting. All missing proofs are in the full version of the paper.

Theorem 1.
Consider an arbitrary non-adaptive spreading strategy $\pi$ in the full propagation setting. Assume all initial acceptance probabilities are equal to $p$. If $p \ge 1/2$, then the expected number of adopters is at least $np$. Furthermore, if $p \le 1/2$, then the expected number of adopters is at most $np$.

Chierichetti, Kleinberg, and Panconesi [8] studied the problem of designing an optimum spreading strategy in the partial propagation setting. They design an approximation algorithm for the problem when the planner has the same prior knowledge about all areas, i.e., all areas have the same type. We study the same problem with more than one type. We first consider the problem in the full propagation setting. One approach is to consider a non-adaptive spreading strategy with a constant number of switches between different types. The planner has the same prior knowledge about areas of the same type, so such areas are identical from the planner's point of view. Thus any spreading strategy can be specified by the types of areas rather than the areas themselves. Let $\tau(i)$ be the type of area $i$ and $\tau(\pi)$ be the sequence of types for spreading strategy $\pi$. For a given spreading strategy $\pi$, a switch is a position $k$ in the sequence such that $\tau(\pi(k)) \ne \tau(\pi(k+1))$. As an example, consider a society with 4 areas. Areas 1 and 2 are of type 1. Areas 3 and 4 are of type 2. Then spreading strategy $\pi = (1,2,3,4)$ with $\tau(\pi) = (1,1,2,2)$ has a switch at position 2, and spreading strategy $\pi = (1,3,2,4)$ with $\tau(\pi) = (1,2,1,2)$ has switches at positions 1, 2, and 3.

Theorem 2. A $\sigma$-switch spreading strategy is a spreading strategy with at most $\sigma$ switches. For any constant $\sigma$, there exists a society with areas of two types such that no $\sigma$-switch spreading strategy is optimal.

We construct a society of $n$ areas of two types and demonstrate that an optimal non-adaptive spreading strategy should switch $\Omega(n)$ times.
It means no switch-based non-adaptive spreading strategy can be optimal. We prove Theorem 2 formally in Appendix B.

On the positive side, we analyze the problem when thresholds are drawn independently from an unknown distribution and initial acceptance probabilities are arbitrary numbers. We characterize the optimal non-adaptive spreading strategy in this case.

Theorem 3.
Assume that the planner's prior knowledge about all values of $c_i$'s is the same, i.e., all $c_i$'s are drawn independently from the same but unknown distribution. Let the initial acceptance probabilities be arbitrary numbers. Then, the best non-adaptive spreading strategy is to order all areas in non-increasing order of their initial acceptance probabilities.

We also study the problem of designing the optimum spreading strategy in the partial propagation setting with more than one type. We show it is hard to determine the expected number of adopters for a given spreading strategy. Formally speaking, we show it is #P-complete to compute the expected number of adopters for a given spreading strategy $\pi$ in the partial propagation setting with more than one type. This is further evidence that influence propagation is more complicated with more than one type. We prove Theorem 4 based on a reduction from a variation of the network reliability problem in Appendix C.
Theorem 4.
In the partial propagation setting, it is #P-complete to compute the expected number of adopters for a given non-adaptive spreading strategy $\pi$.

We also present a polynomial-time algorithm to compute the expected number of adopters for a given non-adaptive spreading strategy in the full propagation setting. The algorithm, which simulates the amount of propagation for a given spreading strategy, is described in Appendix D.
Theorem 5.
Consider a full propagation setting. The expected number of adopters can be computed in polynomial time for a given non-adaptive spreading strategy $\pi$.

Finally, we study the problem of designing the best adaptive spreading strategy. We overcome the hardness of the problem and design a polynomial-time algorithm to find the best adaptive spreading strategy in the following theorem. We describe the algorithm precisely in Appendix E.
Theorem 6.
A polynomial-time algorithm finds the best adaptive spreading strategy for a society with a constant number of types.
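To make the polynomial-time claim of Theorem 5 concrete, the sketch below computes the exact expected number of adopters for a fixed schedule in the full propagation setting. It relies on the observation that, on a complete graph, an area's decision depends on the past only through the current difference between acceptances and rejections, so it suffices to track the distribution of that difference. This is one natural dynamic program under the stated model, not necessarily the algorithm of Appendix D; the name `expected_adopters` and its arguments are hypothetical, and thresholds are assumed to be at least 1.

```python
def expected_adopters(order, thresholds, probs):
    """Exact expected number of adopters under full propagation.

    order:      area ids in the order they are scheduled
    thresholds: thresholds[v] >= 1 is the threshold of area v
    probs:      probs[v] is the initial acceptance probability of area v
    """
    # dist maps (acceptances - rejections so far) to its probability.
    dist = {0: 1.0}
    expected = 0.0
    for v in order:
        c, p = thresholds[v], probs[v]
        nxt = {}
        for diff, mass in dist.items():
            if diff >= c:        # threshold reached on the accepting side
                acc = 1.0
            elif diff <= -c:     # threshold reached on the rejecting side
                acc = 0.0
            else:                # threshold not reached: initial preference
                acc = p
            expected += mass * acc
            if acc > 0.0:
                nxt[diff + 1] = nxt.get(diff + 1, 0.0) + mass * acc
            if acc < 1.0:
                nxt[diff - 1] = nxt.get(diff - 1, 0.0) + mass * (1.0 - acc)
        dist = nxt
    return expected
```

The difference takes at most $2t+1$ values after $t$ areas, so the loop performs $O(n^2)$ updates for $n$ areas. With four areas of threshold 2 and $p = 0.3$, the function returns 1.032, which is below $np = 1.2$, in line with Theorem 1.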
In this section we define basic concepts and notation used throughout this paper. We first formally define the spread of influence through a network as a stochastic process and then give the intuition behind the formal notation. We are given a graph $G = (V, E)$ with thresholds $c_v \in \mathbb{Z}_{>0}$, $\forall v \in V$, and initial acceptance probabilities $p_v \in [0,1]$, $\forall v \in V$. Let $|V| = n$. Let $d_v$ be the degree of vertex $v$. Let $N(v)$ be the set of neighboring vertices of $v$. Let $\mathbf{c}$ be the vector $(c_1, \ldots, c_n)$ and $\mathbf{p}$ be the vector $(p_1, \ldots, p_n)$. Given a graph $G = (V, E)$ and a permutation $\pi : V \to V$, we define a discrete stochastic process, IS (Influence Spread), as an ordered set of random variables $(X^1, X^2, \ldots, X^n)$, where $X^t \in \Omega = \{-1, 0, 1\}^n$, $\forall t \in \{1, \ldots, n\}$. The random variable $X^t_v$ denotes the decision of area $v$ at time $t$. If it has not yet been scheduled, $X^t_v = 0$. If it accepts the idea then $X^t_v = 1$, and if it rejects the idea then $X^t_v = -1$. Note that $X^t_v = 0$ iff $t < \pi^{-1}(v)$. Let $D(v) = \sum_{u \in N(v)} X^{\pi^{-1}(v)}_u$ be the sum of the decisions of $v$'s neighbors. For simplicity of notation, we denote $X^n_v$ by $X_v$.

We now briefly explain the intuition behind the notation. The input graph models the influence network of areas on which we want to schedule a cascade, with each vertex representing an area. There is an edge between two vertices if the two corresponding areas influence each other's decisions. The influence spread process models the spread of idea acceptance and rejection for a given spreading strategy. The permutation $\pi$ maps a position in the spreading strategy to an area in $V$. For example, $\pi(1) = v$ implies that $v$ is the first area to be scheduled. Once area $v$ is given a chance to accept or reject the idea at time $\pi^{-1}(v)$, $X^{\pi^{-1}(v)}_v$ is assigned a value based on $v$'s decision, and at all times $t$ after $\pi^{-1}(v)$, $X^t_v = X^{\pi^{-1}(v)}_v$. The random variable $X_v$ denotes whether area $v$ accepted or rejected the idea.
We note that $X^t_v = X_v$, $\forall t \ge \pi^{-1}(v)$. The random variable $X^t$ is a complete snapshot of the cascade process at time $t$. The variable $D(v)$ is the decision variable for $v$. It denotes the sum of the decisions of $v$'s neighbors at the time $v$ is scheduled in the cascade, and it determines whether $v$ decides to follow the majority decision or decides based on its initial acceptance probability. The random variable $I_t$ is the sum of the decisions of all areas at time $t$. Thus, $I_n$ is the variable we are interested in, as it denotes the difference between the number of people who accept the idea and the number who reject it.

Let $v = \pi(t)$. Given $X^{t-1}$, $X^t$ is defined as follows:
– Every area decides to accept or reject the idea exactly once, when it is scheduled, and its decision remains the same at all later times. Therefore, $\forall i \ne \pi(t)$:
  • $X^t_i = X^{t-1}_i$
– The decision of area $v$ is based on the decisions of previous areas if its threshold is reached.
  • $X^t_v = 1$ if $D(v) \ge c_v$
  • $X^t_v = -1$ if $D(v) \le -c_v$
– If the threshold of area $v$ is not reached, then it decides to accept the idea with probability $p_v$, its initial acceptance probability, and decides to reject it with probability $1 - p_v$.

In the partial propagation setting, we represent such a stochastic process by the tuple $IS = (G, \mathbf{c}, \mathbf{p}, \pi)$. For the full propagation setting, the underlying graph is a complete graph and hence we can denote the process by $(\mathbf{c}, \mathbf{p}, \pi)$. When $\mathbf{c}$ and $\mathbf{p}$ are clear from context, we denote the process simply by the spreading strategy $\pi$. We define the random variable $I_t = \sum_{v \in V} X^t_v$. We denote by $q_v = 1 - p_v$ the probability that $v$ rejects the idea based on its initial preference. We denote by $\Pr(A; IS)$ the probability of event $A$ occurring under stochastic process $IS$. Similarly, we denote by $E(z; IS)$ the expected value of random variable $z$ under stochastic process $IS$.
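The process defined above can be simulated directly, which is a convenient way to estimate $E(I_n; IS)$ or the expected number of adopters on an arbitrary graph, where exact computation is hard. The following is a minimal Monte Carlo sketch; the names `simulate_cascade` and `estimate_adopters`, the adjacency-dictionary representation, and the trial count are illustrative assumptions rather than anything prescribed by the paper.

```python
import random


def simulate_cascade(neighbors, thresholds, probs, order, rng):
    """Run one sample of the influence spread process.

    neighbors:  dict mapping each area v to a list of its neighbors N(v)
    thresholds: dict mapping v to its threshold c_v (assumed >= 1)
    probs:      dict mapping v to its initial acceptance probability p_v
    order:      list of areas in schedule order (pi(1), pi(2), ...)
    Returns a dict mapping v to its final decision X_v in {-1, +1}.
    """
    decision = {v: 0 for v in neighbors}      # 0 means "not yet scheduled"
    for v in order:
        d = sum(decision[u] for u in neighbors[v])   # decision variable D(v)
        if d >= thresholds[v]:
            decision[v] = 1
        elif d <= -thresholds[v]:
            decision[v] = -1
        else:
            decision[v] = 1 if rng.random() < probs[v] else -1
    return decision


def estimate_adopters(neighbors, thresholds, probs, order, trials=1000, seed=0):
    """Average number of accepting areas over independent samples."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        outcome = simulate_cascade(neighbors, thresholds, probs, order, rng)
        total += sum(1 for x in outcome.values() if x == 1)
    return total / trials
```

On a three-area path 0-1-2 with all thresholds 1, $p_0 = 1$ and $p_1 = p_2 = 0$, scheduling in the order 0, 1, 2 makes every area accept, whereas the reverse order makes every area reject, which illustrates how strongly the outcome depends on the schedule.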
Let us call an idea unappealing if its initial acceptance probability for all areas is $p$ for some $p \le 1/2$. We prove in this section that, for such ideas, no strategy can boost the acceptance probability of any area above $p$. We note that exactly the opposite argument can be made when $p \ge 1/2$ is the initial acceptance probability of all areas, i.e., any spreading strategy guarantees that every area accepts the idea with probability at least $p$.

Theorem 1.
Consider an arbitrary non-adaptive spreading strategy $\pi$ in the full propagation setting. Assume all initial acceptance probabilities are equal to $p$. If $p \ge 1/2$, then the expected number of adopters is at least $np$. Furthermore, if $p \le 1/2$, then the expected number of adopters is at most $np$.

Proof. We prove this result for the case $p \le 1/2$. The other case ($p \ge 1/2$) follows from symmetry. To avoid confusion, we let $p_0 = p$ and use $p_0$ instead of the real number $p$ throughout this proof. If we prove that any given area accepts the idea with probability at most $p_0$, then by linearity of expectation we are done. Consider an area $v$ scheduled at time $t + 1$. The probability that the area accepts or rejects the idea is given by

\[ \Pr(X_v = 1) = p_0 \bigl(1 - \Pr(I_t \ge c_v) - \Pr(I_t \le -c_v)\bigr) + \Pr(I_t \ge c_v), \]
\[ \Pr(X_v = -1) = (1 - p_0) \bigl(1 - \Pr(I_t \ge c_v) - \Pr(I_t \le -c_v)\bigr) + \Pr(I_t \le -c_v). \]

Since $\Pr(X_v = 1) + \Pr(X_v = -1) = 1$, if we prove that $\frac{\Pr(X_v = 1)}{\Pr(X_v = -1)} \le \frac{p_0}{1 - p_0}$, then we have $\Pr(X_v = 1) \le p_0$. We have

\[ \frac{\Pr(X_v = 1)}{\Pr(X_v = -1)} = \frac{p_0 \bigl(1 - \Pr(I_t \ge c_v) - \Pr(I_t \le -c_v)\bigr) + \Pr(I_t \ge c_v)}{(1 - p_0) \bigl(1 - \Pr(I_t \ge c_v) - \Pr(I_t \le -c_v)\bigr) + \Pr(I_t \le -c_v)}. \]

We have:

\[ \frac{p_0 \bigl(1 - \Pr(I_t \ge c_v) - \Pr(I_t \le -c_v)\bigr)}{(1 - p_0) \bigl(1 - \Pr(I_t \ge c_v) - \Pr(I_t \le -c_v)\bigr)} = \frac{p_0}{1 - p_0}. \]

We know that for any $a, b, c, d, e \in \mathbb{R}_{>0}$,

\[ \text{if } \frac{a}{b} \le e \text{ and } \frac{c}{d} \le e \text{, then } \frac{a + c}{b + d} \le e, \tag{1} \]

because $a \le eb$ and $c \le ed$ give $a + c \le e(b + d)$. Therefore, if we prove that $\frac{\Pr(I_t \ge c_v)}{\Pr(I_t \le -c_v)} \le \frac{p_0}{1 - p_0}$, we are done. Thus, we can prove this theorem by proving that $\frac{\Pr(I_k \ge x)}{\Pr(I_k \le -x)} \le \frac{p_0}{1 - p_0}$ for all $x \in \{1, \ldots, k\}$, $k \in \{1, \ldots, n\}$. We prove this by induction on the number of areas. If there is just one area, then that area decides to accept with probability $p_0$ (as all initial acceptance probabilities are equal to $p_0$). Assume that if the number of areas is less than or equal to $n$, then $\frac{\Pr(I_k \ge x)}{\Pr(I_k \le -x)} \le \frac{p_0}{1 - p_0}$ for all $x \in \{1, \ldots, k\}$, $k \in \{1, \ldots, n\}$. We prove the statement when there are $n + 1$ areas. Let $\mathrm{par}(n, x) : \mathbb{N} \times \mathbb{N} \to \{0, 1\}$ be a function which is $0$ if $n$ and $x$ have the same parity, and $1$ otherwise. Let $v$ be the area scheduled at time $n + 1$. Let $\nu = \mathrm{par}(n, x)$. We now consider the following three cases.

Case 1: $1 \le x \le n - 2$. The event $I_{n+1} \ge x + 1$ is the union of the following two disjoint events:
1. $I_n \ge x + 2$, and whatever the $(n+1)$th area decides, $I_{n+1}$ is at least $x + 1$.
2. $I_n = x + \nu$ and the $(n+1)$th area decides to accept.
Similarly, the event $I_{n+1} \le -x - 1$ is the union of the event $I_n \le -x - 2$ and the event that $I_n = -x - \nu$ and the $(n+1)$th area rejects the idea. We note that we require the par function because only one of the events $I_n = x$ and $I_n = x + 1$ can occur with positive probability, depending on the parities of $n$ and $x$. Thus

\[ \Pr(I_{n+1} \ge x + 1) = \Pr(I_n \ge x + 2) + \Pr(X_v = 1 \mid I_n = x + \nu) \Pr(I_n = x + \nu), \]
\[ \Pr(I_{n+1} \le -x - 1) = \Pr(I_n \le -x - 2) + \Pr(X_v = -1 \mid I_n = -x - \nu) \Pr(I_n = -x - \nu). \]

Now, if $x + \nu \ge c_v$, then $\Pr(X_v = 1 \mid I_n = x + \nu) = \Pr(X_v = -1 \mid I_n = -x - \nu) = 1$; otherwise $\Pr(X_v = 1 \mid I_n = x + \nu) = p_0 \le 1 - p_0 = \Pr(X_v = -1 \mid I_n = -x - \nu)$. Therefore,

\[ \Pr(X_v = 1 \mid I_n = x + \nu) \le \Pr(X_v = -1 \mid I_n = -x - \nu). \]

Let $\beta = \Pr(X_v = -1 \mid I_n = -x - \nu)$. Using the above, we have

\[ \Pr(I_{n+1} \ge x + 1) \le \Pr(I_n \ge x + 2) + \beta \Pr(I_n = x + \nu), \]
\[ \Pr(I_{n+1} \le -x - 1) = \Pr(I_n \le -x - 2) + \beta \Pr(I_n = -x - \nu). \]

From above, we have

\[ f(\beta) = \frac{\Pr(I_n \ge x + 2) + \beta \Pr(I_n = x + \nu)}{\Pr(I_n \le -x - 2) + \beta \Pr(I_n = -x - \nu)} \ge \frac{\Pr(I_{n+1} \ge x + 1)}{\Pr(I_{n+1} \le -x - 1)}. \tag{2} \]

The function $f(\beta)$ is either increasing or decreasing and hence attains its extrema at the end points of its range. Because $\beta \in [0, 1]$, its maximum is at most

\[ \max \left\{ \frac{\Pr(I_n \ge x + 2)}{\Pr(I_n \le -x - 2)}, \; \frac{\Pr(I_n \ge x + 2) + \Pr(I_n = x + \nu)}{\Pr(I_n \le -x - 2) + \Pr(I_n = -x - \nu)} \right\}. \]

Since only one of $I_n = x$ and $I_n = x + 1$ has positive probability, $\Pr(I_n \ge x + 2) + \Pr(I_n = x + \nu) = \Pr(I_n \ge x + 2) + \Pr(I_n = x + 1) + \Pr(I_n = x) = \Pr(I_n \ge x)$, and similarly $\Pr(I_n \le -x - 2) + \Pr(I_n = -x - \nu) = \Pr(I_n \le -x)$. Thus

\[ f \le \max \left\{ \frac{\Pr(I_n \ge x + 2)}{\Pr(I_n \le -x - 2)}, \; \frac{\Pr(I_n \ge x)}{\Pr(I_n \le -x)} \right\} \le \frac{p_0}{1 - p_0} \]

(from the induction hypothesis). From above and (2), $\frac{\Pr(I_{n+1} \ge x + 1)}{\Pr(I_{n+1} \le -x - 1)} \le \frac{p_0}{1 - p_0}$.

Case 2: $x = 0$. If $n$ is odd then $\Pr(I_{n+1} \ge 1) = \Pr(I_{n+1} \ge 2)$ and $\Pr(I_{n+1} \le -1) = \Pr(I_{n+1} \le -2)$, and this case is the same as $x = 1$ and hence considered above. Thus, assume that $n$ is even. Thus

\[ \Pr(I_{n+1} \ge 1) = \Pr(I_n \ge 2) + \Pr(X_v = 1 \mid I_n = 0) \Pr(I_n = 0), \tag{3} \]
\[ \Pr(I_{n+1} \le -1) = \Pr(I_n \le -2) + \Pr(X_v = -1 \mid I_n = 0) \Pr(I_n = 0). \tag{4} \]

If $I_n = 0$, then area $v$ decides based on its initial acceptance probability, so $\Pr(X_v = 1 \mid I_n = 0) = p_0$ and $\Pr(X_v = -1 \mid I_n = 0) = 1 - p_0$. Using this fact and dividing (3) by (4), we have

\[ \frac{\Pr(I_{n+1} \ge 1)}{\Pr(I_{n+1} \le -1)} \le \frac{\Pr(I_n \ge 2) + p_0 \Pr(I_n = 0)}{\Pr(I_n \le -2) + (1 - p_0) \Pr(I_n = 0)}. \]

From the induction hypothesis, $\frac{\Pr(I_n \ge 2)}{\Pr(I_n \le -2)} \le \frac{p_0}{1 - p_0}$. Thus, we conclude $\frac{\Pr(I_{n+1} \ge 1)}{\Pr(I_{n+1} \le -1)} \le \frac{p_0}{1 - p_0}$ based on (1).

Case 3: $x \in \{n - 1, n\}$. In this case $\Pr(I_n \ge x + 2) = 0$, since the number of adopters can never be more than the total number of areas. Also, $I_{n+1}$ cannot be equal to $n$ because $n$ and $n + 1$ do not have the same parity. Therefore, $\Pr(I_{n+1} \ge n) = \Pr(I_{n+1} \ge n + 1)$ and $\Pr(I_{n+1} \le -n) = \Pr(I_{n+1} \le -n - 1)$. Thus, it is enough to analyze the case $x = n$. We have

\[ \Pr(I_{n+1} \ge n + 1) = \Pr(X_v = 1 \mid I_n = n) \Pr(I_n = n), \]
\[ \Pr(I_{n+1} \le -n - 1) = \Pr(X_v = -1 \mid I_n = -n) \Pr(I_n = -n). \]

Either both decisions are made based on thresholds with probability $1$, or both are made based on initial probabilities, and the initial acceptance probability is at most the initial rejection probability. Hence $\Pr(X_v = 1 \mid I_n = n) \le \Pr(X_v = -1 \mid I_n = -n)$. Therefore $\frac{\Pr(I_{n+1} \ge n + 1)}{\Pr(I_{n+1} \le -n - 1)} \le \frac{\Pr(I_n = n)}{\Pr(I_n = -n)}$. Now, since $\Pr(I_n = n) = \Pr(I_n \ge n)$ and $\Pr(I_n = -n) = \Pr(I_n \le -n)$, from the induction hypothesis we have $\frac{\Pr(I_{n+1} \ge n + 1)}{\Pr(I_{n+1} \le -n - 1)} \le \frac{p_0}{1 - p_0}$ and we are done.

We consider the problem of designing a non-adaptive spreading strategy when the thresholds are drawn independently from the same but unknown distribution. We show the best spreading strategy is to schedule areas in non-increasing order of initial acceptance probabilities. We prove the optimality of the algorithm using a coupling argument. First we state the following lemma, which will be useful in proving Theorem 3. The proof is in Appendix F.1.
Lemma 1
Let $\pi$ and $\pi'$ be two spreading strategies. If $\exists k \in \mathbb{Z}_{>0}$ such that $\pi(i) = \pi'(i)$, $\forall i \ge k$, and $\Pr(I_k \ge x; \pi) \ge \Pr(I_k \ge x; \pi')$, $\forall x \in \mathbb{Z}$, then $E(I_n; \pi) \ge E(I_n; \pi')$.

Theorem 3.
Assume that the planner's prior knowledge about all values of $c_i$'s is the same, i.e., all $c_i$'s are drawn independently from the same but unknown distribution. Let the initial acceptance probabilities be arbitrary numbers. Then, the best non-adaptive spreading strategy is to order all areas in non-increasing order of their initial acceptance probabilities.

Proof. Let $\pi'$ be a spreading strategy in which areas are not scheduled in non-increasing order of initial acceptance probability. Thus, there exists $k$ such that $p_{\pi'(k)} < p_{\pi'(k+1)}$. We prove that if a new spreading strategy $\pi$ is created by exchanging the positions of areas $\pi'(k)$ and $\pi'(k+1)$, then the expected number of people who accept the idea cannot decrease. It means the best spreading strategy is non-increasing in the initial acceptance probabilities.

To prove the theorem, we will prove that $\Pr(I_{k+1} \ge x; \pi) \ge \Pr(I_{k+1} \ge x; \pi')$; the result then follows from Lemma 1. Since the two spreading strategies are identical until time $k - 1$, and therefore the random variable $I_{k-1}$ has an identical distribution under both strategies, we can prove the above by proving that $\Pr(I_{k+1} \ge I_{k-1} + y \mid I_{k-1}; \pi) \ge \Pr(I_{k+1} \ge I_{k-1} + y \mid I_{k-1}; \pi')$ for all $y \in \mathbb{Z}$. We note that the only feasible values of $I_{k+1} - I_{k-1}$ are in $\{-2, 0, 2\}$. Hence, if $y > 2$ then both sides of the above inequality are equal to $0$ and the inequality holds. Similarly, if $y \le -2$ both sides of the inequality are equal to $1$ and the inequality holds. Thus, we only need to analyze the values $y = 0$ and $y = 2$.

Now we define some notation to help with the rest of the proof. Let $u = \pi'(k+1)$, $v = \pi'(k)$, and $q_i = 1 - p_i$. It means $p_v < p_u$. Let $\chi(i, j)$ be the event where $i$ and $j$ are indicators of the decisions of the areas scheduled at times $k$ and $k + 1$ respectively, e.g., $\chi(1, 1)$ means that the areas scheduled at times $k$ and $k + 1$ both accepted the idea, whereas $\chi(1, -1)$ implies that the area scheduled at time $k$ accepted the idea while the area scheduled at time $k + 1$ rejected it.
Let $B(y)$ be the event $I_{k+1} \ge I_{k-1} + y \mid I_{k-1} = z$ for some arbitrary $z \in \mathbb{Z}$. We consider the cases $I_{k-1} > 0$, $I_{k-1} < 0$ and $I_{k-1} = 0$ separately.

Case 1: $I_{k-1} = z$, $z > 0$. We have $B(0) = \chi(1, 1) \cup \chi(1, -1) \cup \chi(-1, 1)$, which is equal to the complement of $\chi(-1, -1)$. Since we assume $z > 0$, the thresholds $-c_u$ and $-c_v$ cannot be hit. Thus, $\chi(-1, -1)$ occurs only when both areas decide to reject the idea based on their respective initial acceptance probabilities. Thus, from the chain rule of probability, its probability is the product of the following four terms:
1. $\Pr(z < c_u)$, i.e., the threshold rule does not apply and $u$ decides based on initial acceptance probabilities.
2. $u$ rejects the idea based on the initial probability of rejection, $q_u$.
3. $\Pr(z - 1 < c_v)$. Given $u$ rejected the idea, $D(v)$, the decision variable for $v$, becomes $z - 1$, the threshold rule does not apply, and $v$ decides based on initial acceptance probabilities.
4. $v$ rejects the idea based on the initial probability of rejection, $q_v$.
Therefore, $\Pr(\chi(-1, -1)) = \Pr(z < c_u)\, q_u \Pr(z - 1 < c_v)\, q_v$. Thus, $\Pr(B(0); \pi) = 1 - \Pr(z < c_u)\, q_u \Pr(z - 1 < c_v)\, q_v$. Since $c_u$ and $c_v$ are i.i.d. random variables, we can write any probability of the form $\Pr(z \mathrel{R} c_u)$ or $\Pr(z \mathrel{R} c_v)$ as $\Pr(z \mathrel{R} x)$, where $x$ is an independent random variable with the same distribution as $c_u$ and $c_v$. Thus

\[ \Pr(B(0); \pi) = 1 - \Pr(z < x)\, q_u \Pr(z - 1 < x)\, q_v. \tag{5} \]

Now, $\Pr(\chi(1, 1)) = \Pr(X_u = 1 \mid I_{k-1} = z) \Pr(X_v = 1 \mid I_k = z + 1)$. The event $X_u = 1$ is the union of the following two non-overlapping events:
1. $z \ge c_u$; $u$ accepts the idea because of the threshold rule.
2. $z < c_u$ and $u$ accepts the idea based on its initial acceptance probability, $p_u$.
Thus, $\Pr(X_u = 1 \mid I_{k-1} = z) = \Pr(z \ge c_u) + \Pr(z < c_u)\, p_u$. Similarly, $\Pr(X_v = 1 \mid I_k = z + 1) = \Pr(z + 1 \ge c_v) + \Pr(z + 1 < c_v)\, p_v$.
Therefore
\[ \Pr(B(2); \pi) = (\Pr(z \ge x) + \Pr(z < x)\, p_u) \times (\Pr(z + 1 \ge x) + \Pr(z + 1 < x)\, p_v), \tag{6} \]
where we have replaced $c_u$ and $c_v$ by $x$ because they are i.i.d. random variables. We can obtain the corresponding probabilities for process $\pi'$ by exchanging $p_u$ and $p_v$. Thus, $\Pr(B(0); \pi) = \Pr(B(0); \pi') = 1 - \Pr(z < x)\, q_u \Pr(z - 1 < x)\, q_v$. We can write $\Pr(B(2); \pi')$ as follows.
\[ \Pr(B(2); \pi') = (\Pr(z \ge x) + \Pr(z < x)\, p_v) \times (\Pr(z + 1 \ge x) + \Pr(z + 1 < x)\, p_u). \tag{7} \]
On the other hand, $\Pr(z < x) \ge \Pr(z + 1 < x)$ and $\Pr(z + 1 \ge x) \ge \Pr(z \ge x)$. Comparing (6) and (7) along with the facts that $p_v < p_u$ and $\Pr(z < x) \Pr(z + 1 \ge x) \ge \Pr(z \ge x) \Pr(z + 1 < x)$, we get $\Pr(B(2); \pi) \ge \Pr(B(2); \pi')$.

Case 2: $I_{k-1} = -z$, $z > 0$. By a similar analysis, we have
\[ \Pr(B(2); \pi) = \Pr(z < x) \Pr(z - 1 < x)\, p_u p_v = \Pr(B(2); \pi'), \tag{8} \]
\[ \Pr(B(0); \pi) = 1 - (\Pr(z \ge x) + \Pr(z < x)\, q_u) \times (\Pr(z + 1 \ge x) + \Pr(z + 1 < x)\, q_v), \tag{9} \]
\[ \Pr(B(0); \pi') = 1 - (\Pr(z \ge x) + \Pr(z < x)\, q_v) \times (\Pr(z + 1 \ge x) + \Pr(z + 1 < x)\, q_u). \tag{10} \]
Comparing (9) and (10), we have $\Pr(B(0); \pi) \ge \Pr(B(0); \pi')$.

Case 3: $I_{k-1} = 0$. We have
\[ \Pr(B(2); \pi) = p_u (\Pr(x > 1)\, p_v + \Pr(x = 1)), \tag{11} \]
\[ \Pr(B(0); \pi) = p_u + q_u \Pr(x > 1)\, p_v, \tag{12} \]
\[ \Pr(B(2); \pi') = p_v (\Pr(x > 1)\, p_u + \Pr(x = 1)), \tag{13} \]
\[ \Pr(B(0); \pi') = p_v + q_v \Pr(x > 1)\, p_u. \tag{14} \]
By comparing (11) with (13) and (12) with (14), we see that $\Pr(B(2); \pi) \ge \Pr(B(2); \pi')$ and $\Pr(B(0); \pi) \ge \Pr(B(0); \pi')$ respectively. Thus, $\Pr(I_{k+1} \ge I_{k-1} + y \mid I_{k-1}; \pi) \ge \Pr(I_{k+1} \ge I_{k-1} + y \mid I_{k-1}; \pi')$ for all $y \in \mathbb{Z}$.

Acknowledgments
The authors would like to thank Jon Kleinberg for his useful comments about the motivation of our problem.
References
1. Kempe, D., Kleinberg, J., Tardos, É.: Maximizing the spread of influence through a social network. In: KDD. (2003) 137–146
2. Kempe, D., Kleinberg, J., Tardos, É.: Influential nodes in a diffusion model for social networks. In: ICALP. (2005) 1127–1138
3. Mossel, E., Roch, S.: On the submodularity of influence in social networks. In: STOC. (2007) 128–134
4. AhmadiPourAnari, N., Ehsani, S., Ghodsi, M., Haghpanah, N., Immorlica, N., Mahini, H., Mirrokni, V.S.: Equilibrium pricing with positive externalities. In: WINE. (2010) 424–431
5. Akhlaghpour, H., Ghodsi, M., Haghpanah, N., Mahini, H., Mirrokni, V.S., Nikzad, A.: Optimal iterative pricing over social networks. In: WINE. (2010) 415–423
6. Hartline, J., Mirrokni, V.S., Sundararajan, M.: Optimal marketing strategies over social networks. In: WWW. (2008) 189–198
7. Arthur, W.B.: Competing technologies, increasing returns, and lock-in by historical events. The Economic Journal (394) (1989) 116–131
8. Chierichetti, F., Kleinberg, J., Panconesi, A.: How to schedule a cascade in an arbitrary graph. In: EC. (2012) 355–368
9. Banerjee, A.V.: A simple model of herd behavior. The Quarterly Journal of Economics (3) (1992) 797–817
10. Bikhchandani, S., Hirshleifer, D., Welch, I.: A theory of fads, fashion, custom, and cultural change in informational cascades. Journal of Political Economy (5) (1992) 992–1026
11. Granovetter, M.: Threshold models of collective behavior. American Journal of Sociology (6) (1978) 1420–1443
12. Domingos, P., Richardson, M.: Mining the network value of customers. In: KDD. (2001) 57–66
13. Chen, W., Wang, Y., Yang, S.: Efficient influence maximization in social networks. In: KDD. (2009) 199–208
14. Arthur, D., Motwani, R., Sharma, A., Xu, Y.: Pricing strategies for viral marketing on social networks. In: WINE. (2009) 101–112
15. Goyal, S., Kearns, M.: Competitive contagion in networks. In: STOC. (2012) 759–774
16. Chen, W., Collins, A., Cummings, R., Ke, T., Liu, Z., Rincon, D., Sun, X., Wang, Y., Wei, W., Yuan, Y.: Influence maximization in social networks when negative opinions may emerge and propagate. In: ICDM. (2011) 379–390
17. Li, Y., Chen, W., Wang, Y., Zhang, Z.L.: Influence diffusion dynamics and influence maximization in social networks with friend and foe relationships. In: WSDM. (2013) 657–666
18. Provan, J.: The complexity of reliability computations in planar and acyclic graphs. SIAM Journal on Computing (3) (1986) 694–702

A Examples
Example 1.
Consider a society with areas and types. The planner prior is as follows.Initial acceptance probabilities of areas , , and are . , . , and . respectively.Thresholds of areas , , and are , , and respectively (See Figure 1). Considerspreading strategy π = (1 , , . People in area accept the idea with probability p =0 . . Threshold of area is . It means people in area decide based on initial rule andaccept the idea with probability p = 0 . . Threshold of area is . Thus, people in area decide based on initial rule as well and accept the idea with probability p = 0 . .Therefore, the expected number of adopters for spreading strategy π is p + p + p =1 . . In order to see the impact of an optimal spreading strategy consider spreadingstrategy π ′ = (3 , , . People in area accept the idea with probability p = 0 . .Threshold of area is . It means the decision of people in area is correlated to thedecision of people in area . In other word, people in area follow the decision of peoplein area . Thus, there are two possible scenarios. First, both areas and accept theidea. The probability of this scenario is p = 0 . . The second scenario is that both areas and reject the idea. The probability of the second scenario is − p = 0 . . In bothscenario the threshold of area is hit. Hence, area will accept the idea with probability p = 0 . . Therefore, the expected number of adopters for spreading schedule π ′ is p = 2 . . p = 0 . c = 22 p = 0 . c = 11 p = 0 . c = 33 Fig. 1.
A society with areas. The expected number of adopters for spreading strategy π =(1 , , is . . The expected number of adopters for spreading strategy π ′ = (3 , , is . .

Example 2.
At first glance, it seems that a greedy approach finds the best non-adaptive spreading strategy. The greedy approach is to first schedule a node with the highest probability of adopting. We give a counter-example for this greedy approach with a society with 3 areas.

Consider a society with 3 areas. Area 1 has threshold 1 and areas 2 and 3 have threshold 2. Initial acceptance probabilities are $p_1 > p_2 > p_3 = 0$ (see Figure 2). The greedy approach leads us to spreading strategy $\pi = (1, 2, 3)$. Assume the planner uses spreading strategy $\pi$. The probability that people in area 1 accept the idea is $p_1$. The threshold for area 2 is 2. Hence, they decide based on the initial rule. It means the probability that people in area 2 accept the idea is $p_2$. At last, if both areas 1 and 2 accept the idea, which happens with probability $p_1 p_2$, then people in area 3 accept the idea based on the threshold rule. Otherwise, they reject it because $p_3 = 0$, i.e., area 3 has an initial preference of N for sure. Thus, the expected number of adopters is $p_1 + p_2 + p_1 p_2$.

Now, assume the planner uses spreading strategy $\pi' = (2, 1, 3)$. Area 2 accepts the idea with probability $p_2$. The threshold of area 1 is 1. It means area 1 is a follower of area 2 under spreading strategy $\pi'$. Hence, there are two possibilities. Both areas 1 and 2 accept the idea with probability $p_2$, or both areas 1 and 2 reject the idea with probability $1 - p_2$. In both cases area 3 decides based on the threshold rule. Therefore, there are 3 adopters with probability $p_2$, or all areas reject the idea with probability $1 - p_2$. Hence, the expected number of adopters is $3 p_2$ for spreading strategy $\pi'$. One can check that spreading strategy $\pi'$ is better than $\pi$ for various probabilities $p_1$ and $p_2$, namely whenever $3 p_2 > p_1 + p_2 + p_1 p_2$, i.e., $p_1 < 2 p_2 / (1 + p_2)$.

Fig. 2. A society with 3 areas (area 1: $p_1$, $c_1 = 1$; area 2: $p_2$, $c_2 = 2$; area 3: $p_3 = 0$, $c_3 = 2$). The expected number of adopters for spreading strategy $\pi = (1, 2, 3)$ is $p_1 + p_2 + p_1 p_2$. The expected number of adopters for spreading strategy $\pi' = (2, 1, 3)$ is $3 p_2$.

Example 3.
The result of Theorem 1 leads us to the following conjecture for the partial propagation setting.

"Consider an arbitrary non-adaptive spreading strategy in the partial propagation setting. If all initial acceptance probabilities are greater/less than 1/2, then adding an edge to the graph helps/hurts promoting the new product."

This conjecture has several consequences, e.g., a complete graph is the best graph for spreading a new idea when initial acceptance probabilities are greater than 1/2, which would directly imply Theorem 1. Surprisingly, this conjecture does not hold. We present an example in which all initial acceptance probabilities are equal and less than 1/2, such that adding a relationship between two areas increases the expected number of adopters.

Consider a society with 4 areas and only one type. Initial acceptance probabilities and thresholds for all areas are $p$ and 1 respectively. Consider spreading strategy $\pi = (1, 2, 3, 4)$ and a society which is represented by graph $G$ (see Figure 3). Areas 1, 2, and 3 decide about the idea independently and accept it with probability $p$. The threshold of area 4 is 1. Hence, people in area 4 accept the idea if there are at least two adopters so far. Therefore, area 4 accepts the idea with probability $3p^2(1 - p) + p^3$ and the expected number of adopters is $3p + 3p^2(1 - p) + p^3$.

Assume influences also propagate between two of the areas 1, 2, 3 (by symmetry, say areas 1 and 2). In this case the society is represented by graph $G'$ (see Figure 3). The threshold of area 2 is 1. Hence, area 2 is a follower of area 1 under spreading strategy $\pi$. Thus, there are two possibilities when area 2 is scheduled. Both areas 1 and 2 accept the idea with probability $p$, or both reject it with probability $1 - p$. Area 3 decides independently and accepts the idea with probability $p$. The threshold of area 4 is 1. Thus, area 4 is also a follower of both areas 1 and 2. Therefore, the expected number of adopters is $4p$ in this case. One can check that $3p + 3p^2(1 - p) + p^3$ is greater than $4p$ if and only if $0.5 < p < 1$. It means that when $p < 0.5$ (resp., $p > 0.5$) the number of adopters increases (resp., decreases) by adding a relation to the society.

Fig. 3. This figure represents a partial propagation setting with 4 areas; graph $G'$ is obtained from $G$ by adding edge $e$. All thresholds are equal to 1 and all initial acceptance probabilities are $p$. The expected number of adopters for spreading strategy $\pi = (1, 2, 3, 4)$ is $3p + 3p^2(1 - p) + p^3$ for a society which is represented by graph $G$. The expected number of adopters for spreading strategy $\pi = (1, 2, 3, 4)$ is $4p$ for a society which is represented by graph $G'$. Note that $3p^2(1 - p) + p^3$ is greater than $p$ if and only if $0.5 < p < 1$.

B Type Switching Approach
Consider a society with a constant number of types. One approach that might work is an algorithm that finds an optimal spreading strategy allowing for only a constant number of switches between types in a spreading strategy. We note that areas of the same type are identical from the point of view of scheduling a cascade. Thus, any non-adaptive spreading strategy can be specified by specifying types of areas rather than the areas themselves. Let $\tau$ be the mapping between an area and its type. That is, $\tau(i)$ is the type of area $i$. Let $\lambda$ be the sequence of types for a given spreading strategy. Specifically, $\lambda$ is a vector whose $k$-th component is $\lambda(k) = \tau(\pi(k))$. A switch is any position $k$ in the sequence $\lambda$ such that $\lambda(k) \ne \lambda(k+1)$. As an example, consider a society with four areas, with two areas of type 1 and two areas of type 2. Then the type sequence $\lambda = (1, 1, 2, 2)$ has a switch at position 2, whereas $\lambda = (1, 2, 1, 2)$ has switches at positions 1, 2 and 3. We define a $\sigma$-switch spreading strategy as a non-adaptive spreading strategy that has at most $\sigma$ switches, where $\sigma$ is a constant independent of input size. We now prove that no algorithm whose output is a $\sigma$-switch spreading strategy can be optimal.

Theorem 2. A $\sigma$-switch spreading strategy is a spreading strategy with at most $\sigma$ switches. For any constant $\sigma$, there exists a society with areas of two types such that no $\sigma$-switch spreading strategy is optimal.

Proof. The proof outline is as follows. We construct an instance of the problem with $2n$ areas of two types, the number of areas of each type being $n$, for which an optimal spreading strategy alternates between these types. Let us call this instance $S$ and this strategy $\pi$.
We prove that the expected number of adopters achieved by this optimal strategy is an upper bound on the number of acceptors for any input instance with areas of these two types, whatever the number of areas of each type, given that the total number of areas is $2n$. For example, the number of areas of one type can be $n_1$ and of the other type $2n - n_1$ for any integer $n_1$ between 0 and $2n$, and no strategy for such an instance can exceed the expected number of adopters achieved by $\pi$ for the instance with $n$ areas of each type. We then show that any $\sigma$-switch strategy for instance $S$ can be improved by changing the type of one of the areas. Since the value achieved by this new strategy cannot be greater than that of strategy $\pi$ on instance $S$, no $\sigma$-switch strategy can be optimal.

Consider an instance with two types $\gamma_1 = (P, 1)$ and $\gamma_2 = (P, 2)$ where $P > 1/2$, the total number of areas is $2n$, and the number of areas of types $\gamma_1$ and $\gamma_2$ is $n$ each. Let $\pi$ be a spreading strategy for which the type sequence of areas is given by $\lambda = (\gamma_1, \gamma_2, \ldots, \gamma_1, \gamma_2)$, i.e., every area at an odd position is of type $\gamma_1$ and every area at an even position is of type $\gamma_2$. Let the expected number of areas which accept the idea for this spreading strategy be $\alpha$. Now consider an instance where the total number of areas is the same but the number of areas of type $\gamma_1$ is $n_1$ and the number of areas of type $\gamma_2$ is $2n - n_1$, for some arbitrary natural number $n_1$ such that $0 \le n_1 \le 2n$. For this instance, let the expected number of areas which accept the idea given an optimal spreading strategy be $\beta$. We now prove that $\alpha \ge \beta$. If we have no restriction on the number of areas of each type, then for any $t \equiv 0 \pmod 2$, the areas to be scheduled at times $t + 1$ and $t + 2$ can be of types $(\gamma_1, \gamma_1)$, $(\gamma_1, \gamma_2)$, $(\gamma_2, \gamma_1)$ or $(\gamma_2, \gamma_2)$. We prove that $\alpha \ge \beta$ by proving that it is better to schedule areas of type $\gamma_1$ and $\gamma_2$ at times $t + 1$ and $t + 2$ respectively.
If $|I_t| \ge 2$, then we are indifferent between all spreading strategies because in this case all the areas will decide based on the threshold rule. Thus, if we can prove that $(\gamma_1, \gamma_2)$ is a best choice for types at times $t + 1$ and $t + 2$ when $|I_t| < 2$, we are done. Since $t$ is even, the only feasible value with $|I_t| \le 1$ is $I_t = 0$. Thus, this is the only case we need to analyze. Let $\rho$ be the tuple of types of areas scheduled at times $t + 1$ and $t + 2$. Let $\chi$ be the tuple indicating decisions of areas scheduled at times $t + 1$ and $t + 2$. Now we analyze the probabilities with which the four possible values of $\chi$ are realized for each of the four possible values of $\rho$ when $I_t = 0$. Let the number of areas to be scheduled after time $t$ be $m$.

Case 1: $\rho = (\gamma_1, \gamma_1)$ or $(\gamma_2, \gamma_1)$. In this case, the first area decides based on its initial acceptance probability and the second area follows the decision of the first area.
\[ \Pr(\chi = (1, 1)) = P, \quad \Pr(\chi = (1, -1)) = 0, \quad \Pr(\chi = (-1, 1)) = 0, \quad \Pr(\chi = (-1, -1)) = 1 - P. \]
The expected number of areas which accept the idea after time $t$ in this case is $mP$, as all areas follow the decision of the area scheduled at time $t + 1$.

Case 2: $\rho = (\gamma_1, \gamma_2)$ or $(\gamma_2, \gamma_2)$. In this case, both the areas decide based on their initial acceptance probability.
\[ \Pr(\chi = (1, 1)) = P^2 \tag{15} \]
\[ \Pr(\chi = (1, -1)) = P(1 - P) \tag{16} \]
\[ \Pr(\chi = (-1, 1)) = P(1 - P) \tag{17} \]
\[ \Pr(\chi = (-1, -1)) = (1 - P)^2 \tag{18} \]
From (15), with probability $P^2$, all areas after time $t$ will accept the idea. If for any time $t'$ we are given that $I_{t'} = 0$, then we can treat the subsequent areas as the starting point of a new spreading strategy. Thus, if $I_{t+2} = 0$, then from Theorem 1 (given that $P > 1/2$), the expected number of adopters for any future spreading strategy is at least $(m - 2)P$. Hence, from (16) and (17), with probability $2P(1 - P)$ the expected number of areas that will accept after time $t$ is at least $1 + (m - 2)P$. Therefore, in this case, the expected number of areas that accept after time $t$ is at least $mP^2 + 2P(1 - P)(1 + (m - 2)P)$. Thus, we are done if we prove that $mP^2 + 2P(1 - P)(1 + (m - 2)P)$ is greater than $mP$. We have
\[ mP^2 + 2P(1 - P)(1 + (m - 2)P) - mP = P(1 - P)\bigl(-m + 2(1 + (m - 2)P)\bigr). \]
Thus, it is enough to prove that $2(1 + (m - 2)P) - m > 0$. We have
\[ 2(1 + (m - 2)P) - m = (2P - 1)(m - 2). \]
Since $P > 1/2$, $2P - 1 > 0$. Thus, for all $m > 2$, it is strictly better to schedule an area of type $\gamma_2$ at time $t + 2$. If an area of type $\gamma_2$ is scheduled at time $t + 2$, then it is equivalent to schedule an area of either type at time $t + 1$. Thus, given that there is at least one more area to follow at time $t + 3$, it is best to schedule areas of type $\gamma_1$ and $\gamma_2$ respectively at times $t + 1$ and $t + 2$ at any arbitrary time $t \equiv 0 \pmod 2$. Also, such a schedule is strictly better, all other things being the same, than the schedule where areas of type $\gamma_1$ are scheduled at times $t + 1$ and $t + 2$. This fact is important as we use it later in the proof. If there are no more areas to follow, then we are indifferent between all four options. Hence, the expected number of adopters achieved by $\pi$ is an upper bound on the number of acceptors for any input instance with areas of these two types, whatever the number of areas of each type.

The final part of this proof is by contradiction. Let the number of areas in the input instance be $2n$, with $n$ areas each of types $\gamma_1 = (P, 1)$ and $\gamma_2 = (P, 2)$. Consider a $\sigma$-switch strategy. Choose $n$ large enough relative to $\sigma$ that every $\sigma$-switch strategy has at least four consecutive areas of type $\gamma_1$; since a $\sigma$-switch strategy consists of at most $\sigma + 1$ maximal runs of equal types, any $n \ge 4(\sigma + 1)$ suffices by the pigeonhole principle. Let a $\sigma$-switch strategy, $\pi'$, be an optimal one. Then there exists a time $t$ in $\pi'$ such that $t \equiv 0 \pmod 2$, $\tau(\pi'(t + 1)) = \gamma_1$, $\tau(\pi'(t + 2)) = \gamma_1$, and at least one more area is scheduled after time $t + 2$. As explained earlier, the expected number of adopters in this case is strictly less than the expected number of adopters if we schedule an area of type $\gamma_2$ at time $t + 2$, which, as proved above, is at most the expected number of adopters for a strategy with type sequence $(\gamma_1, \gamma_2, \ldots, \gamma_1, \gamma_2)$. Therefore, strategy $\pi'$ is not optimal. This is a contradiction, and no $\sigma$-switch strategy can be optimal for the given instance.

C Hardness Result
We prove that the problem of computing the expected number of adopters for a given spreading strategy in the partial propagation setting is #P-complete. This result applies even when the input graphs are planar with bounded maximum degree and have only a constant number of different types of vertices. We prove this by reduction from a version of the network reliability problem that is known to be #P-complete ([18]). In the network reliability problem, a directed graph $G$ and a probability $0 \le p \le 1$ are given. Nodes fail independently with probability $1 - p$. Therefore, each node is present in the surviving subgraph with probability $p$. We achieve the reduction by simulating the $s$-$t$ network reliability problem with an instance of the cascade scheduling problem in which the probability of an area $v$ accepting an idea is exactly equal to the probability of a path existing in the surviving subgraph from the source to vertex $v$. Before proceeding to the details of the proof, we give some definitions below.

Definition 1. Given a directed graph $G$ with source $s$, terminal $t$, and a probability $1 - p$, $0 \le p < 1$, of nodes failing independently, the $(s, t)$-connectedness reliability of $G$, $R(G, s, t; p)$, is defined as the probability that there is at least one path from $s$ to $t$ such that none of the vertices on the path have failed.

Definition 2. AST is the problem of computing $R(G, s, t; p)$ when $G$ is an acyclic directed $(s, t)$-planar graph with each vertex having degree at most three. We denote an instance of AST on graph $G$ as $AST(G, s, t, p)$.

Definition 3
Given an influence spread process $S = (G, \mathbf{c}, \mathbf{p}, \pi)$ on $G$ with a source node $s$ and a target node $t$, IST is the problem of computing $\Pr(X_t = 1; S)$ given that $\pi(1) = s$ and $\Pr(X_s = 1) = 1$. We denote an instance of IST by $IST(G, \mathbf{c}, \mathbf{p}, \pi, s, t)$.

We will reduce an instance of AST to an instance of IST (Probability of Influence Spread to T).

Given an instance of AST, $AST(G = (V, E), s, t, p)$, we now construct an instance of IST, $IST(G' = (V', E'), \mathbf{c}, \mathbf{p}, \pi, s, t)$, for which $R(G, s, t; p) = \Pr(X_t = 1)$. Let $d^{in}_v$ be the indegree of $v \in V$ in $G$. For every vertex $v \in V - \{s\}$, we add three vertices to graph $G'$. Let us denote them by $b_v$, the blocking vertex of $v$; $f_v$, the forwarding vertex for $v$; and $v'$, which corresponds to the original vertex $v$. The rationale for this nomenclature will become apparent later. For every edge $(u, v)$ in $E$, we add an edge $\{u', b_v\}$ to $E'$. In addition, we add edges $\{b_v, v'\}$ and $\{f_v, v'\}$ to $E'$. The acceptance probabilities and thresholds are set as follows: $p_{v'} = 0$, $p_{f_v} = p$, $p_{b_v} = 1$ for all $v \in V - \{s\}$, and $p_{s'} = p$; $c_{v'} = 2$ and $c_{b_v} = d^{in}_v$ for all $v \in V - \{s\}$. Threshold $c_{s'}$ is irrelevant and can be any arbitrary value greater than 0 since $s'$ is the first vertex to be scheduled. Thresholds $c_{f_v}$ can also be any arbitrary value greater than 0 since no neighbor of $f_v$ is scheduled before $f_v$. Let $\pi'$ be any topological ordering of $V$ in which $s$ is the first node and $t$ is the last node. Then $\pi$ is constructed as follows:
\[ \pi^{-1}(s') = 1, \]
\[ \pi^{-1}(v') = 3\pi'^{-1}(v) - 2 \quad \forall v \in V - \{s\}, \]
\[ \pi^{-1}(b_v) = 3\pi'^{-1}(v) - 4 \quad \forall v \in V - \{s\}, \]
\[ \pi^{-1}(f_v) = 3\pi'^{-1}(v) - 3 \quad \forall v \in V - \{s\}. \]
The above construction of $\pi$ can be interpreted as follows. The source remains the first vertex to be scheduled. A vertex $v$ is split into three vertices: $v'$, $b_v$ and $f_v$. In place of $v$, these three vertices are consecutively scheduled in the order $b_v$, $f_v$, $v'$, e.g., if $\pi' = (s, v, t)$, then $\pi = (s', b_v, f_v, v', b_t, f_t, t')$.

Let $IS$ be the influence spread process $(G', \mathbf{c}, \mathbf{p}, \pi)$. Now, we prove the following lemmas, which relate the probability of existence of a path of operative vertices between $s$ and $v$ in $G$ and the probability that area $v$ accepts the idea in the influence spread process $IS$.

Fig. 4. Reduction from Network Reliability on a DAG to Computing Expected Number of Influenced Nodes. The diagram on the left is a part of a DAG with probability of failure of each node equal to $(1 - p)$. The diagram on the right is the corresponding part of the graph that represents an influence spread stochastic process that models the given network reliability problem, where $p_{b_v} = 1$, $c_{b_v} = d$, $p_{f_v} = p$, $p_{v'} = 0$, and $c_{v'} = 2$.

We first prove that computing the expected number of vertices in the graph to which $s$ has a path with operating vertices is #P-complete. We then use this to prove the main theorem.

Lemma 2
Consider an instance of AST, $AST(G = (V, E), s, t, p)$. Then computing the expected number of vertices in the graph to which $s$ has a path with operating vertices is #P-complete.

Proof. Let $a(G, s)$ be the expected number of vertices in the graph to which $s$ has a path with operating vertices in $G$. Let $b(G, s, t)$ be the probability that there is a path of operating vertices from $s$ to $t$ in $G$. We note that $t$ has no outgoing edges. Let us assume that $a(G, s)$ can be computed in time polynomial in $|G|$. Let $G' = G - \{t\}$. Deletion of $t$ does not change the probability of survival of any path whose destination is not $t$. Therefore $a(G', s) = \sum_{u \in V - \{t\}} b(G, s, u)$. Thus, $a(G, s) - a(G', s) = b(G, s, t)$. This is a contradiction because it implies that $b(G, s, t)$ can be computed in time polynomial in $|G|$.

The proof of the main theorem of this section is organized as follows. We first prove that the probability of an area $v'$ accepting an idea is exactly equal to the probability of a path existing from $s$ to $v$. Then, we use this fact along with Lemma 2 to prove the main result.

Theorem 4.
In the partial propagation setting, it is #P-complete to compute the expected number of adopters for a given non-adaptive spreading strategy $\pi$.

Proof. Let $AST(G = (V, E), s, t, p)$ be an instance of the AST problem. Let $S(G' = (V', E'), \mathbf{c}, \mathbf{p}, \pi)$ be an influence spread process with $G'$, $c_v$, $p_v$ and $\pi$ as defined above. Then an area $v \ne s, t$ accepts the idea with probability $p$ iff at least one of its predecessors in $G$ also accepts the idea.

Let $P(v)$ be the set of predecessors of $v$ in $G$. We note that in $IS$, by construction of $\pi$ and $G'$, the vertices in $P(v)$ are exactly the neighbors of $b_v$ that are scheduled before $b_v$. Area $b_v$ is immediately followed by $f_v$, and $f_v$ by $v$. Also, by construction of $G'$, $b_v$ and $f_v$ are neighbors of $v$, and $v$ has no other neighbors scheduled before it. Area $f_v$'s only neighbor is $v$.

If no vertex in $P(v)$ accepts the idea, then $D(b_v) = -d^{in}_v = -c_{b_v}$ and thus $\Pr(X_{b_v} = -1 \mid \text{no vertex in } P(v) \text{ accepts the idea}) = 1$, and therefore $b_v$ rejects the idea. Since the threshold of $v$ is $c_v = 2$, $v$ decides based on the threshold rule if and only if both its neighbors either accept or reject the idea. Therefore, if $b_v$ rejects the idea and $f_v$ accepts the idea, then $v$ does not accept the idea because it decides based on its initial acceptance probability $p_v = 0$. If $X_{f_v} = -1$, then $v$ also does not accept the idea because it rejects the idea based on the threshold rule, as both its neighbors rejected the idea. Thus, if none of the vertices in $P(v)$ accept the idea then $v$ does not accept the idea.

If any area in $P(v)$ accepts the idea then $-c_{b_v} = -d^{in}_v < D(b_v) \le d^{in}_v = c_{b_v}$, and $b_v$ accepts the idea, either by the threshold rule or because its initial acceptance probability is $p_{b_v} = 1$. Now, if $f_v$ accepts the idea then $v$ also accepts because $c_v = 2$, and if $f_v$ rejects the idea, then $v$ does not accept the idea because it decides to reject it on the basis of its initial acceptance probability, $p_v = 0$. Since no neighbor of $f_v$ is scheduled before $f_v$, $f_v$ accepts the idea independently at random with its initial acceptance probability $p_{f_v} = p$. Therefore, given that at least one vertex in the set $P(v)$ accepts the idea, $v$ accepts the idea with probability $p$.

Now, by the principle of deferred decisions, the process of finding a path of operating vertices from $s$ to $t$ in the network reliability problem can be simulated as follows. Let $\pi$ be any topological ordering on the vertices of $G$. Let $L(i)$ be the $i$-th layer (excluding the layer containing just the source vertex, $s$) in topologically sorted $G$. Then the probability that a path to $u \in L(1)$ exists is $p$ because we let each vertex in this layer fail independently with probability $1 - p$. For a vertex $v$ in any subsequent layer, if there exists a path to any of the vertices in $P(v)$, the set of predecessors of $v$, then we let $v$ fail independently with probability $1 - p$. If no path to any of the predecessors of $v$ exists, then no path to $v$ can exist and it is immaterial whether $v$ fails or not. Thus, we let $v$ fail with probability 1. As explained above, this is exactly the process simulated by $IS = (G', \mathbf{c}, \mathbf{p}, \pi)$. Thus, computing $\Pr(X_t = 1)$ is #P-complete.

However, we need to prove hardness of computing $\Lambda = \sum_{u \in V'} \Pr(X_u = 1)$. If we can prove that from $\Lambda$ we can compute the expected number of vertices in the graph to which $s$ has a path, say $\alpha = \sum_{v \in V} \Pr(X_{v'} = 1)$, then from Lemma 2, we are done. Since for all $v \in V$, $\Pr(X_{v'} = 1) = \Pr(X_{b_v} = 1) \cdot \Pr(X_{f_v} = 1) = \Pr(X_{b_v} = 1) \cdot p$ and $\Pr(X_{f_v} = 1) = p$, we have:
\[ \Lambda = \sum_{v \in V} \bigl(\Pr(X_{v'} = 1) + \Pr(X_{b_v} = 1) + \Pr(X_{f_v} = 1)\bigr) = \sum_{v \in V} \Bigl(\Pr(X_{v'} = 1) + \frac{\Pr(X_{v'} = 1)}{p} + p\Bigr). \]
From the above, we can easily compute $\alpha$. Hence, the claim follows.

We note that AST is #P-complete even when the degrees of vertices of the input graph are constrained to be at most 3. Thus, the indegree of a node (through which a path from $s$ to $t$ can pass) has to be 1 or 2. If $p$ is the survival probability of a vertex in the AST problem instance, then the possible types of areas in the corresponding instance of IST are in $\{(1, 1), (1, 2), (p, c), (0, 2)\}$, where the first two types correspond to blocking nodes, the forwarding nodes are of type $(p, c)$ for whichever positive threshold $c$ is chosen for them, and the vertices corresponding to original vertices are of type $(0, 2)$. Thus, IST is hard on graphs with bounded maximum degree and with the number of types constrained to 4.

D Computing Expected Number of Adopters
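The algorithm of this appendix tracks the distribution of the running sum $I_k$ of $+1$ (accept) and $-1$ (reject) decisions in the full propagation setting. The following Python sketch illustrates that idea under stated assumptions: an area $v$ accepts when $I_k \ge c_v$, rejects when $I_k \le -c_v$, and otherwise accepts with probability $p_v$; the function names and the three-area instance are hypothetical and chosen only for illustration.

```python
def sum_distribution(schedule, p, c):
    """Return {x: Pr(I_n = x)} for a fixed schedule (full propagation).

    schedule: list of area ids in scheduling order.
    p, c: dicts mapping area id -> initial acceptance probability / threshold.
    """
    dist = {0: 1.0}  # before anyone decides, the sum of decisions is 0
    for v in schedule:
        nxt = {}
        for x, mass in dist.items():
            if x >= c[v]:        # positive threshold hit: accept for sure
                accept = 1.0
            elif x <= -c[v]:     # negative threshold hit: reject for sure
                accept = 0.0
            else:                # otherwise decide by initial preference
                accept = p[v]
            nxt[x + 1] = nxt.get(x + 1, 0.0) + mass * accept
            nxt[x - 1] = nxt.get(x - 1, 0.0) + mass * (1.0 - accept)
        dist = nxt
    return dist


def expected_adopters(schedule, p, c):
    """Expected number of accepting areas, using I_n = 2 * Y_n - n."""
    dist = sum_distribution(schedule, p, c)
    expected_sum = sum(x * mass for x, mass in dist.items())
    return (expected_sum + len(schedule)) / 2.0


# Hypothetical instance: three areas with known thresholds.
p = {1: 0.6, 2: 0.5, 3: 0.0}
c = {1: 1, 2: 2, 3: 2}
value_123 = expected_adopters([1, 2, 3], p, c)
value_213 = expected_adopters([2, 1, 3], p, c)
```

A dictionary keyed by the reachable sums is used instead of an $n \times (2n + 1)$ matrix; both take time quadratic in the number of areas. On this instance, scheduling the most receptive area first is not the better of the two schedules.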
Here we give an algorithm to compute $E(I_n)$, given a spreading strategy $\pi$ with thresholds given by vector $\mathbf{c}$ and initial probabilities of acceptance given by vector $\mathbf{p}$. Let $Y_k$ be the number of accept decisions among vertices in $\{\pi(1), \pi(2), \ldots, \pi(k)\}$. We note that $I_k = 2Y_k - k$, so the expected number of adopters is $E(Y_n) = (E(I_n) + n)/2$. Since $E(I_n) = \sum_{x \in \{-n, \ldots, n\}} x \Pr(I_n = x)$, we are interested in computing $\Pr(I_n = x)$ for all $x \in \{-n, \ldots, n\}$.

Theorem 5. Consider a full propagation setting. The expected number of adopters can be computed in polynomial time for a given non-adaptive spreading strategy $\pi$.

Let $A$ be an $n \times (2n + 1)$ matrix where $A[k, x] = \Pr(I_k = x)$, $k \in \{1, \ldots, n\}$, $x \in \{-n, \ldots, n\}$. Let $v = \pi(k)$. The following recurrence might be used to arrive at a dynamic programming formulation:
\[ A[k, x] \leftarrow \Pr(X^k_v = 1)\, A[k - 1, x - 1] + \Pr(X^k_v = -1)\, A[k - 1, x + 1]. \]
However, one needs to be careful when computing $\Pr(X^k_v = 1)$ because it depends on $I_{k-1}$. Thus, in the correct recurrence we must have $\Pr(X^k_v = 1 \mid I_{k-1} = x - 1)$ and $\Pr(X^k_v = -1 \mid I_{k-1} = x + 1)$ instead of $\Pr(X^k_v = 1)$ and $\Pr(X^k_v = -1)$ respectively. Below we derive the dynamic program keeping this subtlety in mind. Let $v = \pi(k + 1)$. We have:
\[ \Pr(I_{k+1} = x + 1 \mid I_k = x) = \begin{cases} p_v & \text{if } -c_v < x < c_v, \\ 1 & \text{if } x \ge c_v, \\ 0 & \text{otherwise,} \end{cases} \]
\[ \Pr(I_{k+1} = x - 1 \mid I_k = x) = 1 - \Pr(I_{k+1} = x + 1 \mid I_k = x). \]
We have:
\[ \Pr(I_{k+1} = x) = \Pr(I_{k+1} = x \mid I_k = x - 1) \Pr(I_k = x - 1) + \Pr(I_{k+1} = x \mid I_k = x + 1) \Pr(I_k = x + 1). \]
The above relation suggests a dynamic program for computing $E(I_n)$. The matrix $A$ is initialized with $A[1, 1] = p_{\pi(1)}$, $A[1, -1] = 1 - A[1, 1]$, $A[1, 0] = 0$, $A[k, x] = 0$ for all $x > k$, and $A[k, x] = 0$ for all $x < -k$. When $|x| < n$ and $k > 1$, any $A[k, x]$ depends on $A[k - 1, x - 1]$ and $A[k - 1, x + 1]$ and we get the recurrence:
\[ A[k, x] \leftarrow \Pr(I_k = x \mid I_{k-1} = x - 1)\, A[k - 1, x - 1] + \Pr(I_k = x \mid I_{k-1} = x + 1)\, A[k - 1, x + 1]. \]
From $A$, $E(I_n)$ can be computed as follows:
\[ E(I_n) = \sum_{x \in \{-n, \ldots, n\}} x \Pr(I_n = x) = \sum_{i \in \{-n, \ldots, n\}} i\, A[n, i]. \]

E Adaptive Marketing Strategy
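The adaptive dynamic program of this section can be sketched with memoization over the state (remaining areas per type, current sum of decisions). The Python sketch below is an illustration only, not the paper's implementation: the function name `best_adaptive`, the representation of a type as a pair `(p, c)`, and the example instance are assumptions, and the acceptance rule is the full propagation threshold rule (accept if the sum is at least $c$, reject if it is at most $-c$, otherwise accept with probability $p$).

```python
from functools import lru_cache


def best_adaptive(types, counts):
    """Expected adopters under an optimal adaptive schedule.

    types: tuple of (p, c) pairs, one per type (hypothetical encoding).
    counts: number of areas of each type, in the same order.
    """

    @lru_cache(maxsize=None)
    def best(remaining, k):
        # remaining: tuple of areas left per type; k: sum of decisions so far.
        if sum(remaining) == 0:
            return 0.0
        value = 0.0
        for i, (p, c) in enumerate(types):
            if remaining[i] == 0:
                continue
            if k >= c:
                accept = 1.0      # threshold rule forces acceptance
            elif k <= -c:
                accept = 0.0      # threshold rule forces rejection
            else:
                accept = p        # decide by initial preference
            rest = remaining[:i] + (remaining[i] - 1,) + remaining[i + 1:]
            candidate = (accept * (1.0 + best(rest, k + 1))
                         + (1.0 - accept) * best(rest, k - 1))
            value = max(value, candidate)
        return value

    return best(tuple(counts), 0)


# Hypothetical instance with three types and one area of each type.
example_types = ((0.6, 1), (0.5, 2), (0.0, 2))
adaptive_value = best_adaptive(example_types, (1, 1, 1))
```

With $t$ types and $n$ areas there are $O(n^t)$ count vectors and $O(n)$ values of the sum, which matches the $O(n^{t+1})$ state count up to the factor $t$ spent per state.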
In this section we propose a dynamic program for computing the best adaptive spreading strategy and thus prove Theorem 6. Here we give the dynamic program when there are two types of areas. This can be extended to any constant number of types. Let $B(n_1, n_2, k)$ be the expected number of areas that adopt the product for a best ordering, where $n_1$ is the number of areas of type 1 and $n_2$ is the number of areas of type 2 remaining in the market, and $k$ is the sum of decisions of vertices that have been scheduled so far. We note that $k$ is equal to the difference between the number of yes decisions and no decisions. Let the thresholds and initial acceptance probabilities for vertices of type $i$ be $c_i$ and $p_i$, and assume without loss of generality that $c_1 \le c_2$. At any given time in the strategy, let $B_i$ be the best possible result if an area of type $i$ is scheduled next. Depending on the value of $k$, we have the following cases (cases 2 and 4 will not occur if $c_1 = c_2$):
1. $n_1 = 0 \lor n_2 = 0$: If all areas are of the same type, then all spreading strategies are equivalent and we can choose any arbitrary spreading strategy for the remaining areas.
2. $c_1 \le k < c_2$: In this case, areas of type 1 will accept the idea with probability 1. Areas of type 2 will accept the idea with probability $p_2$ and reject it with probability $1 - p_2$.
\[ B_1 = 1 + B(n_1 - 1, n_2, k + 1) \]
\[ B_2 = p_2 + p_2 B(n_1, n_2 - 1, k + 1) + (1 - p_2) B(n_1, n_2 - 1, k - 1) \]
\[ B(n_1, n_2, k) = \max\{B_1, B_2\} \]
3. $-c_1 < k < c_1$: In this case, both types of areas will decide to accept or reject the idea on the basis of initial acceptance probabilities. Therefore:
\[ B_1 = p_1 + p_1 B(n_1 - 1, n_2, k + 1) + (1 - p_1) B(n_1 - 1, n_2, k - 1) \]
\[ B_2 = p_2 + p_2 B(n_1, n_2 - 1, k + 1) + (1 - p_2) B(n_1, n_2 - 1, k - 1) \]
\[ B(n_1, n_2, k) = \max\{B_1, B_2\} \]
4. $-c_2 < k \le -c_1$: In this case, areas of type 1 will reject the idea with probability 1 and areas of type 2 will accept the idea with probability $p_2$.
\[ B_1 = B(n_1 - 1, n_2, k - 1) \]
\[ B_2 = p_2 + p_2 B(n_1, n_2 - 1, k + 1) + (1 - p_2) B(n_1, n_2 - 1, k - 1) \]
\[ B(n_1, n_2, k) = \max\{B_1, B_2\} \]
5. $k \le -c_2$: In this case, both types of areas will reject the idea. Therefore:
\[ B(n_1, n_2, k) = 0 \]
6. $k \ge c_2$: In this case, both types of areas will accept the idea. Therefore:
\[ B(n_1, n_2, k) = n_1 + n_2 \]
This can easily be extended to any constant number of types. The time complexity with $t$ types is $O(n^{t+1})$.

F Missing Proofs
F.1 Proof of Lemma 1
Proof.
We prove this lemma by proving that:
\[ \Pr(I_{k+t} \ge x; \pi) \ge \Pr(I_{k+t} \ge x; \pi'), \quad \forall t \in \{1, \ldots, n - k\}. \tag{19} \]
We note that the above implies $E(I_n; \pi) \ge E(I_n; \pi')$. We prove that if $\Pr(I_k \ge x; \pi) \ge \Pr(I_k \ge x; \pi')$ then $\Pr(I_{k+1} \ge x; \pi) \ge \Pr(I_{k+1} \ge x; \pi')$ for all $x \in \mathbb{Z}$. This argument can be successively applied to prove (19). Let $\pi(k + 1) = v$. $X_v$ will be 1 iff either $I_k \ge c_v$ and $v$ accepts the idea based on the threshold rule, or $-c_v < I_k < c_v$ and $v$ decides to accept the idea based on its initial acceptance probability $p_v$. Thus:
\[ \Pr(X_v = 1) = \Pr(I_k \ge c_v) + \Pr(-c_v < I_k < c_v)\, p_v. \]
Substituting $\Pr(-c_v < I_k < c_v) = \Pr(I_k \ge -c_v + 1) - \Pr(I_k \ge c_v)$, we have:
\[ \Pr(X_v = 1) = \Pr(I_k \ge c_v) + \bigl(\Pr(I_k \ge -c_v + 1) - \Pr(I_k \ge c_v)\bigr)\, p_v. \]
By rearranging the terms, we get: