Chasm in Hegemony: Explaining and Reproducing Disparities in Homophilous Networks
Yiguang Zhang, Jessy Xinyi Han, Ilica Mahajan, Priyanjana Bengani, Augustin Chaintreau
Columbia University; Massachusetts Institute of Technology
Abstract
In networks with a minority and a majority community, it is well established that minorities are under-represented at the top of the social hierarchy. However, researchers are less clear about the representation of minorities at the lower levels of the hierarchy, where other disadvantages or vulnerabilities may exist. We offer a more complete picture of social disparities at each social level with empirical evidence that minority representation exhibits two opposite phases: at the higher rungs of the social ladder, the representation of the minority community decreases; but lower in the ladder, which is more populous, the representation of the minority community improves as one ascends. We refer to this opposition between the upper and lower levels as the chasm effect. Previous models of network growth with homophily fail to detect and explain the presence of this chasm effect. We analyze the interactions among a few well-observed network-growth mechanisms with a simple model to reveal the sufficient and necessary conditions for both phases of the chasm effect to occur. By generalizing the simple model naturally, we present a complete bi-affiliation bipartite network-growth model that can capture disparities at all social levels and reproduce real social networks. Finally, we illustrate that addressing the chasm effect can create fairer systems with two applications in advertisement and fact-checking, thereby demonstrating the potential impact of the chasm effect on future research on minority-majority disparities and fair algorithms.
The "glass-ceiling" effect has multiple real-world applications; it is invoked when describing the invisible barrier that women, or any minority group, hit in their careers as they approach the upper echelons of management[1][2]. The top of the hierarchy has been well studied, whereas minority representation in the rest of the social hierarchy has received less attention. A complete characterization of social disparities at all levels of the hierarchy helps tackle questions such as the point at which a minority group starts experiencing a systemic disadvantage, and at what rung of the ladder, if any, minorities are fairly represented.

We tackle these questions leveraging real-world datasets (QQ, WhatsApp, and Instagram) in an attempt to understand the distribution of minority representation across the entire hierarchy. Our main finding is the surprising but repeated evidence that the ratio of people belonging to a minority group initially increases as one moves up in the lower layers of the hierarchy, before it reaches a plateau and drops. We refer to this effect as a “chasm” because people who observe only the lower or only the upper layer of a hierarchy might agree that a systemic bias is present but would hastily claim it runs in opposite directions. This is in striking contrast to the monotonic behavior one would expect in all previous systemic models of hegemonic biases. As we prove that previous models cannot explain our observation, we also provide the first generative model that offers a simple explanation and is general enough to apply broadly.

The question we ask in this paper addresses the causes of this chasm effect.
What are the mechanisms that interact with each other to create both the glass-ceiling effect and the chasm effect, and in particular, how do social networks play a role in creating these two effects?

Previous studies on the glass-ceiling effect have provided mechanisms that capture the glass-ceiling effect in uni-partite networks like Facebook[3]. However, the same mechanisms do not capture the chasm effect. In this paper, we primarily focus on bipartite networks, which have two types of entities with different natures, while we also show the presence of the chasm effect in uni-partite networks. We are interested in bipartite networks like WhatsApp for two reasons: (1) the nature of bipartite networks is less understood but more intriguing due to their complexity; (2) many social platforms are now group-based, where members find communities of interest within the larger network.

We analyze the interactions among a few well-observed network-growth mechanisms with a simple model to reveal the sufficient and necessary conditions for the glass-ceiling effect and the chasm effect to both be present. We further generalize the simple model naturally and present a complete bi-affiliation bipartite network-growth model. We demonstrate our proposed model’s effectiveness through both mathematical proofs and data synthesis. Our generative model is the first to capture the chasm effect in social disparities.

This study has important practical applications, especially as it puts a spotlight on structural biases in bipartite networks and hints at ways to address them. More specifically, the chasm effect we put forward provides a foundation for allocating resources differently in diverse settings to minimize bias among those who constitute a large portion of the population and are more disadvantaged and vulnerable.
We present two examples taken from different contexts: (1) (gender fairness) we hope to provide recruiters with a better job placement strategy if they want to diversify their pool of candidates; (2) (political fairness) in politics-related group chats where conversations are not accessible outside the immediate community, we aim to show how fake news can have more of an adverse impact on the minority population in a constrained environment.

In summary, our main contributions are:
• We establish the existence of the chasm effect with empirical evidence from real-world datasets, and characterize the phenomenon in depth to provide a more complete picture of social disparities. That is, we show that the ratio of the minority community does not decrease monotonically as we move up the hierarchy. (Section 3)
• We analyze the interactions among network-growth mechanisms and derive the mechanisms necessary for both the chasm effect and the glass-ceiling effect to be present in bipartite networks. (Section 4)
• We propose a complete bi-affiliation bipartite network-growth model that generalizes the necessary mechanisms discussed in Section 4. The generalized model is capable of reproducing real-world social networks. Under the generalized model, we provide proofs that both types of entities in the generated networks have power-law degree distributions, and specify mathematically the sufficient and necessary conditions for both the glass-ceiling effect and the chasm effect to be present. (Section 5)
• Finally, we provide two real-world applications of our findings, job advertisement and fact-checking, where the chasm effect could impact the direction of bias, thereby motivating the importance of considering the chasm effect. (Section 6)

These results together suggest that the chasm effect can be observed, at least frequently in online networks which may exhibit simple selective homophily dynamics, and has consequences.
We urge some caution, however, as our results do not prove that the chasm is unavoidable: some social networks (and, under some conditions, our general model) can exhibit a systemic monotonic bias against minority groups at all levels of the hierarchy.

2 Related Work
Social disparities and the hegemony of the majority community have been widely studied in uni-partite social networks, and it has been well observed that disadvantages are exerted on the minority community, for example, in the case of the gender gap[1][2] or rural-urban inequality[4]. It has also been shown, through homophilous preferential attachment, that structural bias in uni-partite social networks can create such disadvantages[3] at the top of the hierarchy, and that the effects can be reinforced when recommendation algorithms are applied[5]. However, no existing model analyzes the structural biases that may exist beyond the top of the hierarchy.

Further, studying hegemony is no longer straightforward in bipartite networks. Often, bipartite networks are composed of different types of entities, and it is only meaningful to study homophily within a single entity type.
Projection can convert bipartite networks back to uni-partite networks, but this loses important network information[6]. Therefore, a model that studies hegemony directly on bipartite networks is imperative. Unfortunately, there are not many bipartite network models and even fewer studies on social disparities. Previous analytical works[7][6] provide notations for studying bipartite networks and extend several common notations from uni-partite networks to bipartite networks, but they do not consider hegemony. Random graph models like the Stochastic Block Model can be used to model homophily, but do not reproduce the large range of degrees[8] that is well observed in social networks. Configuration models like exponential random graph models[9] can be modified to study homophily but are restricted by nature to static graphs with no internal reinforcing dynamics. We hereby introduce the first generative model that can be used to analyze hegemony in bipartite networks.

One important application of bipartite networks is fairness in fake news detection in group-member networks. In the last decade, researchers have expended tremendous effort attempting to automatically detect fake news by analyzing texts[10][11], images[12], propagation models[13], and more[14][15]. Most auto-detection methods apply only to public social media where platforms have access to all content. However, on private platforms (such as WhatsApp, which is end-to-end encrypted), platforms are unable to proactively auto-detect misinformation due to the lack of visibility into the content. Instead, one of the ways in which they detect potentially inaccurate political news is through user reports. Due to the diversity of information and the massive volume of queries received, stories reported as fake by a large number of users are often prioritized by fact checkers[16].
When there is more than one political party in the network, such detection methodologies may create unfairness, as the party with more members could be subject to more misinformation.

To the best of our knowledge, our results are the first to tackle factors affecting the fairness of fake news detection in encrypted social media.

Here we define a hegemonic subset as one that is systematically over-represented among the tail of most popular nodes. It was shown that a majority affiliation among the nodes can become hegemonic under simple rich-get-richer and homophily dynamics[3]. We find, among three large networks with affiliations, including a bipartite graph, that such hegemonic subsets always exhibit a remarkable paradox: it appears, starting at small degrees, that members of the hegemonic subset become scarcer as degree increases, while the fraction of members from the other subset initially increases. This creates a chasm since, if one concentrates on a partial, local observation of the degree distribution, one may hastily conclude that network growth either disproportionately favors or disfavors those in the hegemonic subset.
QQ dataset[17]
One of the most popular instant messengers for group chats in China is Tencent QQ, which has over 700 million active users. Users can create new groups or join existing ones. Depending on the account level of the group creator, QQ group sizes are capped at 100, 200, 500, or 1000 members. This dataset contains 274,335,183 users and 58,523,079 groups, of which 273,204,518 users have gender information and 48,676,355 groups have complete information about the group identifier, member list, and group creation date. Females make up 42.5% of the users in this dataset; hence, we label groups whose members are less than 42.5% female as male-dominated and those with more than 42.5% females as female-dominated.

We observe that the group size distribution of this dataset becomes discontinuous at 100, 200, and 500 members due to the imposed group-size caps. To avoid the impact of these discontinuities, we focus our analysis on groups of size no larger than 100, which account for 99.2% of all groups in the dataset and have an average female membership ratio of 40.9%.
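The labeling rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the input format (a mapping from a group identifier to the list of its members' gender labels, 'F' or 'M'), the function name `label_groups`, and the handling of a group that sits exactly at the threshold (assigned to male-dominated here, since the text leaves that case unspecified) are all hypothetical.

```python
from typing import Dict, List

FEMALE_SHARE = 0.425  # share of female users reported for the QQ dataset
MAX_SIZE = 100        # the analysis keeps groups of at most 100 members


def label_groups(groups: Dict[str, List[str]],
                 threshold: float = FEMALE_SHARE,
                 max_size: int = MAX_SIZE) -> Dict[str, str]:
    """Label each retained group as female- or male-dominated.

    `groups` maps a (hypothetical) group id to its members' genders.
    Empty groups and groups larger than `max_size` are skipped.
    """
    labels = {}
    for group_id, genders in groups.items():
        if not genders or len(genders) > max_size:
            continue
        female_ratio = genders.count("F") / len(genders)
        # Assumption: a ratio exactly equal to the threshold counts as male-dominated.
        if female_ratio > threshold:
            labels[group_id] = "female-dominated"
        else:
            labels[group_id] = "male-dominated"
    return labels
```

Comparing against the population share (42.5%) rather than 50% means a group is called female-dominated when women are over-represented relative to the user base, which is the convention the dataset description uses.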
WhatsApp dataset[18][19]
WhatsApp is one of the most widely used messaging apps around the world. The WhatsApp data we use was collected over a period of 9 months from October 2018 to June 2019. It includes 2,092 groups around political conversations and 205,880 unique users. The party affiliation of each group is labeled according to the group title and some of its content by the authors in[19]. Based on the ideology and relevant reports of the group’s party affiliation, we characterize each group’s political leaning as pro-BJP or anti-BJP, where BJP stands for Bharatiya Janata Party, the current ruling party in India. To obtain sound and rigorous results, we only consider groups whose political leanings are evident. Once we identify the political leanings of groups, we label each user as pro-BJP if the ratio of pro-BJP groups the user joined exceeds the overall pro-BJP group ratio in the dataset, and vice versa. Overall, we get 1,198 pro-BJP groups and 465 anti-BJP groups, with 62,920 pro-BJP members and 21,625 anti-BJP members sharing 897 images manually labelled as misinformation.

The data are very sparse for groups with more than 165 members in this dataset, so we restrict our analysis to groups of size less than 165. Furthermore, since WhatsApp is an end-to-end encrypted application where members have a reasonable expectation of privacy, we drop all groups with fewer than 52 members ( % of the maximum group size).

Instagram dataset[5]
Instagram is a photo- and video-sharing platform where people like and comment on content. The Instagram dataset was collected between 2014 and 2015 and has a total of 553,628 different users whose genders were inferred from their names. Females make up 54.4% of all users in this dataset. Even though females make up more than half of the data, they are still considered the disadvantaged group for two reasons: (1) other features of this network, like the degree distribution, suggest a bias against female users; (2) this keeps our treatment consistent with prior work.
An unequal proportion of two affiliations arises in different identity contexts like gender (social identity) and political leaning (political identity). In the QQ dataset, which illustrates the social identity aspect, female members make up 42.5% of the population and 41.0% of the groups are female-dominated; thus females are considered the minority and males the majority. In our WhatsApp dataset, which exhibits the political identity aspect, 25.6% of all members and 28.0% of all groups are anti-BJP; thus anti-BJP is denoted as the minority and pro-BJP the majority. Despite the completely different nature of the majority-minority groups in these two datasets, our later findings will show that they share some similar properties, which is worth further study.
The degree distribution of a network reflects how resources and power are distributed in society. Previous studies on one-mode social networks demonstrate a “rich-get-richer” mechanism[20][8], suggesting that those with more connections have an advantage in building even more connections. In bi-affiliation bipartite networks, we study each affiliation and each type of entity separately. We take the number of members within a group as the degree of a group entity and the number of groups a member joins as the degree of a member entity. We find a smooth, slow decay at small degrees and a fast decay at large degrees for both the group size distributions and the member degree distributions, exhibiting a similar “rich-get-richer” result as in one-mode networks. Specifically, in the QQ dataset, female-dominated groups follow a power law with exponent -4.00 and their male counterparts -3.51; QQ female members follow a power law with exponent -3.82 and their male counterparts the same; in the WhatsApp dataset, anti-BJP groups follow a power law with exponent -2.67 and their pro-BJP counterparts -2.48; WhatsApp anti-BJP members and their pro-BJP counterparts follow power laws with almost identical exponents, -2.29 and -2.23 respectively. This “rich-get-richer” result on bi-affiliation bipartite networks illustrates a few basic ideas on member-group interactions: (1) members are more likely to join large groups, likely due to large groups’ popularity or their potential to offer more resources; (2) this higher tendency of members to join large groups is more pronounced when joining majority groups; (3) members who are active in joining groups are more likely to join new groups than those who are less active.

Figure 1: Homophily mechanism: we observe that in both the QQ dataset, where the minority-majority imbalance arises in the context of gender disparity, and the WhatsApp dataset, where the imbalance arises in the context of political parties, the network exhibits homophily. That is, people have the tendency to connect with those of their own affiliation.
Homophily is a well-observed phenomenon in which people tend to connect with those who are similar to them[21]. To test for homophily, we count the number of minority-majority member pairs. Specifically, two members form a member pair if they are both in the same group, and they form multiple pairs if they share multiple common groups. We count the number of member pairs in the network such that one end of the pair is a member from the minority affiliation and the other end is a member from the majority. Note that when there is no homophily in the network, the ratio of minority-majority member pairs over all member pairs is r(1 − r), where r is the fraction of minority (red) members in the network. An actual number of minority-majority member pairs that is smaller than the expected number is therefore an indication of homophily.

Both the QQ dataset and the WhatsApp dataset show a strong indication of homophily, as the actual number of minority-majority member pairs (orange line) is significantly smaller than the value expected under no homophily (yellow line). Therefore, we conclude that homophily exists in bipartite networks.

In conclusion, our analysis of the real-world data illustrates the following three mechanisms in bi-affiliated group-member networks:
1. minority-majority affiliation: the two affiliations have non-negligible size differences.
2. rich-get-richer: new members are more likely to join large groups; members who are active in joining groups are more likely to join new groups than those who are less active.
3. homophily: members are more likely to join groups of their own affiliation.

Figure 2: Chasm effect on group ratio: we observe that in both datasets, the ratio of minority groups (vs. majority groups) is not monotone. As expected from the glass-ceiling effect, the ratio decreases for large group sizes; however, it increases for small groups, which constitute a larger part of all groups. In this plot, the radii of the light-red circles are proportional to group counts.

Figure 3: Chasm effect on average member ratio: we again observe that the average ratio of minorities within groups of fixed size first increases, and then decreases. The radii of the light-red circles are proportional to the sum of group sizes.

While the glass-ceiling effect depicts the under-representation of minorities at the higher rungs, we zoom out to study minority representation across the social hierarchy. We find that at the lowest level, minorities are also under-represented; this under-representation eases as one moves up the social ladder but deteriorates again closer to the top. This matches the glass-ceiling effect at the higher levels. As the minority representation exhibits opposite trends when we move up in the lower rungs and in the upper rungs, we refer to this phenomenon as the “chasm effect” between the lower level and the upper level.

In our bi-affiliated bipartite networks, we observe this chasm effect for both the group mode and the member mode. As shown in Fig. 2, we calculate the ratio of minority-dominated groups in each group size bucket and find that the minority group ratio does not monotonically decrease. More specifically, in the QQ dataset, we observe that the ratio of female-dominated groups increases among groups of size 1-55 and decreases afterwards. In the WhatsApp dataset, the ratio of anti-BJP groups increases for groups of size 52-85 and decreases thereafter. In both plots, we see that the very small and very large minority groups are under-represented and that the representation improves in medium-sized minority groups. Similarly, in Fig. 3, we calculate the average ratio of minority members at each level of group size and find a similar non-monotonic trend.
In the QQ dataset, the average ratio of female members in a group first increases among groups of size smaller than 55 and decreases afterward; similarly, the average proportion of anti-BJP members in the WhatsApp dataset increases among groups of size less than 82, and decreases thereafter. In both plots, we see that minority members are under-represented in the very small groups, and that the representation improves in middle-sized minority groups.

The above observations have not been studied in the existing literature on social networks, but they are non-negligible. First, smaller groups constitute a significant portion of all groups in the networks: 40.9% of groups have sizes smaller than 55 in the QQ dataset, and 41.7% of groups have sizes smaller than 82 in the WhatsApp dataset. Furthermore, this observation is not unique to bipartite networks, as we find a similar non-monotonic result in both projected networks and one-mode networks. We first project the original QQ group-member network to construct a QQ membership network, where members constitute the node set and two nodes are joined by an undirected edge if and only if they are in the same group. If two members share multiple groups, they are connected by multiple edges. We define “member degree” as the number of connections a member has and calculate the proportion of minority connections at each level of member connections. In Fig. 4, we find that the average ratio of female connections rises among members with fewer than 95 connections (55.3% of the population) and decreases thereafter. A similar trend can be found in the one-mode Instagram dataset: we see that the average ratio of female connections increases among users with fewer than 75 connections (99.7% of the population) and decreases afterward.

Figure 4: Chasm effect on unipartite networks: the chasm effect is not unique to bipartite networks. We observe in the projected QQ membership network, as well as in the Instagram network, that the ratio of female connections a member has first increases, then decreases. This common pattern shared by networks of different types, as well as networks focusing on different contexts, indicates that there may be simple structural patterns that are not explained in the existing literature.

This more complete picture of minority representation at every level of a social hierarchy is especially significant, as it can provide insights into minorities at the lower rungs who are far more disadvantaged and vulnerable than those at the higher levels. Previous models of network growth with only the three mechanisms discussed in Section 3.2 are unable to capture or explain this chasm effect (proved in Section 4). This motivates us to propose a new bi-affiliation bipartite network model in the next sections that could reveal the complex interaction among several driving mechanisms of social disparities.
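The group-ratio measurement behind the chasm effect can be illustrated with a short sketch. It assumes the data are available as (group size, is-minority-group) pairs and uses a fixed bucket width; the function names, the bucket width, and the simple rise-then-fall check are illustrative assumptions rather than the procedure used to produce the figures.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def minority_ratio_by_size(groups: Iterable[Tuple[int, bool]],
                           bucket: int = 10) -> Dict[int, float]:
    """Return {bucket start: share of minority groups among groups in that bucket}."""
    total = defaultdict(int)
    minority = defaultdict(int)
    for size, is_minority in groups:
        start = (size // bucket) * bucket
        total[start] += 1
        minority[start] += int(is_minority)
    return {start: minority[start] / total[start] for start in sorted(total)}


def peak_bucket(ratios: Dict[int, float]) -> int:
    """Bucket at which the minority ratio is highest."""
    return max(ratios, key=ratios.get)


def rises_then_falls(ratios: Dict[int, float]) -> bool:
    """True if the ratio is non-decreasing up to an interior peak and non-increasing after it."""
    values = [ratios[start] for start in sorted(ratios)]
    peak = values.index(max(values))
    rising = all(a <= b for a, b in zip(values[:peak], values[1:peak + 1]))
    falling = all(a >= b for a, b in zip(values[peak:], values[peak + 1:]))
    return 0 < peak < len(values) - 1 and rising and falling
```

On real data the curve is noisy, so a strict monotonicity check like `rises_then_falls` would likely need smoothing or wider buckets; the sketch is meant only to make the measured quantity concrete.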
We now examine the roles played by the observed mechanisms, and the way they interact with each other, as well as with another well-observed social network mechanism, to create the glass-ceiling and the chasm effects. To better characterize the interactions, we use a simple model to show that the two effects can naturally arise under a specific combination of the network mechanisms. Moreover, the mechanisms that constitute this combination are necessary conditions for the two effects to occur at the same time.
Formally, we consider a bi-affiliated bipartite network, with one subset of nodes representing members, M, and the other groups, G. We assume two affiliations in the network and denote them as red and blue, where the red affiliation represents the minority and the blue affiliation represents the majority. Every member m ∈ M belongs to exactly one of the two affiliations. Similarly, every group g ∈ G belongs to one affiliation. We use N(M ∪ G, t, Θ) to denote a network generated by a model at time step t, where Θ is the set of parameters used to generate networks.

Figure 5: SHM and GSHM: in SHM, at each time t, exactly one connection is built between the set of members and the set of groups. A chosen member can either create a group or join an existing group. If joining an existing group, the member selects a group based on either (1) the rich-get-richer mechanism, or (2) the equal-chance mechanism. If homophily is applied to the chosen mechanism, then the member may reject the connection and choose a new group until successfully joining a group. The GSHM follows the same steps, except that in SHM the homophily level is the same for different mechanisms and for members of different affiliations, while GSHM has differentiated homophily levels.

We assume the following well-observed mechanisms:
1. rich-get-richer: currently active members are likely to join more groups than currently inactive members; large groups are likely to have a higher growth rate than small groups.
2. homophily: members tend to join groups of their own affiliation.
3. equal-chance: members may join groups uniformly at random.

Applying the homophily mechanism to the other two gives rise to three possible homophilous mechanisms, namely selective homophily on rich-get-richer, selective homophily on equal-chance, and general homophily.
We now test each of them in a simple network growth model.

Formally, we have Θ = (α, η, r, ξ, ρ), where α and η capture the arrival rates of members and groups, respectively, r ≤ 1/2 represents the likelihood of a newly arriving member being red, 0 ≤ ξ ≤ 1 captures the level of the rich-get-richer mechanism for groups, and ρ represents the level of homophily in the network.

We now describe this selective homophily model (SHM) in more detail, and illustrate it in Figure 5 and Figure 6.

At time t = 2, we initialize the bipartite network with one red member connecting to a red group, and one blue member connecting to a blue group. At time t, the network grows as follows:
• Member Growth:
  – (minority-majority) with probability α (0 < α < 1), a new member m* joins the network, and it is colored red with probability r (0 < r ≤ 1/2);
  – (rich-get-richer) otherwise, with probability 1 − α, we randomly pick an existing member m* with probability proportional to deg(m*).
• Group Growth: with probability η (0 < η < 1), the member m* creates a group of color c(m*).
• Connection Growth: with probability 1 − η, the member m* joins an existing group, according to the following two steps:
  – (rich-get-richer) with probability ξ, m* picks a group g* with probability proportional to deg(g*).
    ∗ under selective homophily on the rich-get-richer mechanism or the general mechanism: if c(m*) = c(g*), m* joins g* directly; otherwise, m* accepts the connection with probability ρ. If m* does not accept the connection, m* restarts from the beginning of the Connection Growth step until a new connection is built.
    ∗ under selective homophily on equal-chance: m* joins g* directly.
  – (equal-chance) with probability 1 − ξ, m* picks a group g* uniformly at random.
    ∗ under selective homophily on the rich-get-richer mechanism: m* joins g* directly.
    ∗ under selective homophily on the equal-chance mechanism or the general mechanism: if c(m*) = c(g*), m* joins g* directly; otherwise, m* accepts the connection with probability ρ. If m* does not accept the connection, m* restarts from the beginning of the Connection Growth step until a new connection is built.
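The growth steps above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `simulate_shm`, the `mode` argument (`"rgr"`, `"equal"`, or `"general"` for the three homophilous variants), and the `seed` argument are hypothetical; a member is allowed to join the same group more than once, since the description does not rule it out; and ρ is read literally as the probability of accepting a cross-affiliation connection, so smaller values mean stronger homophily.

```python
import random

RED, BLUE = "red", "blue"


def simulate_shm(steps, alpha, eta, r, xi, rho, mode="rgr", seed=0):
    """Grow a bipartite network for `steps` steps; return (group_colors, group_sizes)."""
    if mode not in ("rgr", "equal", "general"):
        raise ValueError("mode must be 'rgr', 'equal', or 'general'")
    rng = random.Random(seed)
    # Initial network: a red member in a red group and a blue member in a blue group.
    member_color = [RED, BLUE]
    group_color = [RED, BLUE]
    group_size = [1, 1]
    # One entry per edge endpoint, so a uniform draw is proportional to degree.
    member_ends = [0, 1]
    group_ends = [0, 1]

    for _ in range(steps):
        # Member growth: a new member arrives, or an existing one is picked by degree.
        if rng.random() < alpha:
            member_color.append(RED if rng.random() < r else BLUE)
            m = len(member_color) - 1
        else:
            m = rng.choice(member_ends)

        if rng.random() < eta:
            # Group growth: the member creates a group of its own color.
            group_color.append(member_color[m])
            group_size.append(0)
            g = len(group_color) - 1
        else:
            # Connection growth: redraw from the start until a group is accepted.
            while True:
                use_rgr = rng.random() < xi
                if use_rgr:
                    g = rng.choice(group_ends)            # proportional to group size
                else:
                    g = rng.randrange(len(group_color))   # uniform over groups
                filtered = (mode == "general"
                            or (mode == "rgr" and use_rgr)
                            or (mode == "equal" and not use_rgr))
                same_color = group_color[g] == member_color[m]
                if not filtered or same_color or rng.random() < rho:
                    break

        # Exactly one member-group connection is added per step.
        group_size[g] += 1
        member_ends.append(m)
        group_ends.append(g)

    return group_color, group_size


if __name__ == "__main__":
    colors, sizes = simulate_shm(steps=20000, alpha=0.3, eta=0.1,
                                 r=0.3, xi=0.7, rho=0.2, mode="rgr")
    print(len(sizes), "groups;", sum(c == RED for c in colors), "of them red")
```

Storing one list entry per edge endpoint keeps degree-proportional sampling to a single uniform draw, at the cost of memory linear in the number of edges. The parameter values in the usage example are arbitrary and are not taken from the paper.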
We now provide the formal definitions of the glass-ceiling effect and the chasm effect. First note that the two subsets of nodes in bipartite networks often represent different entities, and therefore should be analyzed separately. For the purpose of this paper, we focus our analysis on the group set, and refer to both the tail glass-ceiling effect and the chasm effect as the effects on groups.

The tail glass-ceiling effect in bipartite networks captures the phenomenon that groups of one affiliation are under-represented among the largest groups. Let $\mathrm{top}^{(G)}_{k}(R)$ (respectively $\mathrm{top}^{(G)}_{k}(B)$) be the number of red (blue) groups that have size at least k, as t goes to infinity.

Definition 4.1. (tail glass-ceiling) A network sequence {N(M ∪ G, t, Θ)} exhibits a tail glass-ceiling effect (or glass-ceiling effect for short) against red if there exists an increasing function k(t) such that

$$\lim_{t \to \infty} \mathrm{top}^{(G)}_{k(t)}(B) = \infty, \quad \text{and} \quad \lim_{t \to \infty} \frac{\mathrm{top}^{(G)}_{k(t)}(R)}{\mathrm{top}^{(G)}_{k(t)}(B)} = 0. \tag{1}$$

The chasm effect captures the phenomenon that the representation of groups of the minority affiliation first increases and then decreases as the group size goes up.

Definition 4.2. (chasm) A network sequence {N(M ∪ G, t, Θ)} exhibits a chasm effect against red if there exists K < ∞ such that the ratio of red groups $r^{(G)}_{k}$, as a function of k, increases for k < K and decreases for k > K.

We first note that selective homophily on the rich-get-richer mechanism can lead to both the tail glass-ceiling effect and the chasm effect. As we will establish all of the following lemmas later in a generalized model, we defer the proofs to the corollaries in Section 5.

Lemma 4.1.
Under some conditions on Θ, a network sequence {N(M ∪ G, t, Θ)} generated by SHM with selective homophily on the rich-get-richer mechanism exhibits both the tail glass-ceiling effect and the chasm effect as t goes to infinity.

Previous works on uni-partite networks imply that selective homophily on the equal-chance mechanism cannot lead to the glass-ceiling effect[3]. Indeed, this is also true for bipartite networks.
Lemma 4.2.
A network sequence {N(M ∪ G, t, Θ)} generated by SHM with selective homophily on the equal-chance mechanism does not exhibit the tail glass-ceiling effect.

Lemma 4.3.
A network sequence {N(M ∪ G, t, Θ)} generated by SHM with the general mechanism does not exhibit the chasm effect.

So far, we have shown that selective homophily is necessary for both the tail glass-ceiling effect and the chasm effect. We have also seen that selective homophily on the equal-chance mechanism does not help create the glass-ceiling effect either. Moreover, having the same level of homophily on the rich-get-richer mechanism and the equal-chance mechanism would eliminate the chasm effect. It may therefore seem that the equal-chance mechanism is not useful in creating the glass-ceiling and the chasm effects (Figure 6-(d)). However, this is not true. The following lemma shows that although homophily on the equal-chance mechanism is not necessary for either effect to emerge, the equal-chance mechanism itself is needed for the chasm effect.
Lemma 4.4.
A sequence of networks {N(M ∪ G, t, Θ)} generated by SHM without the equal-chance mechanism does not exhibit the chasm effect.

Therefore, the equal-chance mechanism is also a necessary mechanism for creating both effects. We summarize the above findings in Table 1.
Theorem 4.1.
The selective homophily on rich-get-richer mechanism and the equal-chance mechanismare both necessary mechanisms for networks generated through the SHM to exhibit both the tail glass-ceiling effect and the chasm effect.
Intuitively, the equal-chance mechanism gives small groups chances to grow, and having homophily on the rich-get-richer mechanism allows majorities to grow large groups. Under the selective homophily on rich-get-richer mechanism, because there are more majority groups, minorities are less likely to join groups through the rich-get-richer mechanism. Instead, they grow smaller groups. In the long run, there are more minority groups of middle sizes. The remaining cases fail for the following reasons: when there is no equal-chance mechanism, small groups do not have the chance to grow, and therefore the network does not have the chasm effect; under the selective homophily on the equal-chance mechanism, because there is no homophily on the rich-get-richer mechanism, majorities do not have the chance to grow large groups, and therefore there is no glass-ceiling effect; under the general homophily mechanism, small blue groups grow no slower than small red groups, and thus the network does not exhibit the chasm effect. If we allow different homophily levels for the majority and the minority, it is possible for small red groups to grow faster than blue groups. We will see more in the next section. The interaction among the mechanisms in the real world is undoubtedly more complex, but we hope the above intuition offers a deeper understanding of the driving mechanisms of social disparities.
We now extend the SHM with selective homophily on the rich-get-richer mechanism to a new model that serves two purposes: first, it still captures both the glass-ceiling effect and the chasm effect; second, it allows more degrees of freedom, and can therefore reproduce real social networks. In this section, we introduce a generalized model, prove the necessary and sufficient conditions for the two effects to occur, and reproduce real datasets with the generalized model. For clarity, we list all notation used in our theoretical presentation in Table 2.
The previous analysis of SHM implies that the level of homophily plays an important role in making large blue groups and small red groups grow faster than groups of the other affiliation. We therefore introduce a new generalized selective homophily model (GSHM) with two sets of new parameters: $\rho^{(u)}_R$ ($\rho^{(u)}_B$) captures the level of red (blue) selective homophily on the equal-chance mechanism; $\rho^{(p)}_R$ ($\rho^{(p)}_B$) captures the level of red (blue) selective homophily on the rich-get-richer mechanism.

We now present the generalized model in detail. At time $t = 2$, we initialize the bipartite network with one red member connected to a red group, and one blue member connected to a blue group. At time $t$, the network grows as follows:
• Member Growth:
  – (minority-majority) with probability $\alpha$ ($0 < \alpha < 1$), a new member $m^*$ joins the network, and it is colored red with probability $r$ ($0 < r \le 1/2$);
  – (rich-get-richer) otherwise, with probability $1 - \alpha$, we randomly pick an existing member $m^*$ with probability proportional to $\deg(m^*)$.
• Group Growth: with probability $\eta$ ($0 < \eta < 1$), the member creates a group of color $c(m^*)$.
• Connection Growth: with probability $1 - \eta$, the member $m^*$ joins an existing group through one of the following two mechanisms:
  – (rich-get-richer) with probability $\xi$, $m^*$ picks a group $g^*$ with probability proportional to $\deg(g^*)$. If $c(m^*) = c(g^*)$, $m^*$ joins $g^*$ directly; otherwise, $m^*$ accepts the connection with probability $\rho^{(p)}_{c(m^*)}$. If $m^*$ does not accept the connection, $m^*$ restarts from the beginning of the Connection Growth until a new connection is built.
  – (equal-chance) with probability $1 - \xi$, $m^*$ picks a group $g^*$ uniformly at random. If $c(m^*) = c(g^*)$, $m^*$ joins $g^*$ directly; otherwise, $m^*$ accepts the connection with probability $\rho^{(u)}_{c(m^*)}$. If $m^*$ does not accept the connection, $m^*$ restarts from the beginning of the Connection Growth until a new connection is built.

Under GSHM, when a user decides whether to join a selected group, the probability of accepting depends on both the user's affiliation and the mechanism that the user uses to pick the group. We illustrate this probability specification in Figure 5-(b). Note that all three homophilous mechanisms are special cases of the GSHM.

Table 2: Notation

General notations:
  $c(x)$: color of node $x \in M \cup G$.
  $\deg(x)$: degree of node $x \in M \cup G$.
Group notations:
  $G_t(C)$: number of groups of color $C$ at time $t$.
  $G_{k,t}(C)$: number of groups of color $C$ with size $k$ at time $t$; $G_k(C) := \lim_{t \to \infty} E(G_{k,t}(C))/t$.
  $r^{(G)}_t(C)$: group growth rate of color $C$ at time $t$; that is, $r^{(G)}_t(C) := G_t(C)/t$.
Member notations:
  $M_t(C)$: number of $C$ members at time $t$.
  $M_{k,t}(C)$: number of members of color $C$ with degree $k$ at time $t$; $M_k(C) := \lim_{t \to \infty} E(M_{k,t}(C))/t$.
  $M^{(k)}_t(C)$: number of members of color $C$ that are contained in groups of size $k$.
  $r^{(M,G)}_{k,t}(C)$: ratio of the expected number of members of color $C$ that are contained in groups of size $k$ at time $t$.
  $r^{(M,G)}_{k,t}(C_1, C_2)$: ratio of expected members of color $C_1$ in groups of size $k$ with color $C_2$; $r^{(M,G)}_k(C_1, C_2) = \lim_{t \to \infty} r^{(M,G)}_{k,t}(C_1, C_2)$.
Edge notations:
  $E^{(G)}_t(C)$: sum of group sizes of color $C$ at time $t$; $r^{(E,G)}_t(R) := E^{(G)}_t(R)/t$.
  $E^{(M)}_t(C)$: sum of member degrees of color $C$ at time $t$; $r^{(E,M)}_t(R) := E^{(M)}_t(R)/t$.

We now mathematically characterize the degree distributions of the two types of nodes in GSHM and provide the necessary and sufficient conditions for the glass-ceiling effect and the chasm effect to occur.
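Before turning to the analysis, the growth rules of the GSHM described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `simulate_gshm` and `red_group_ratio_by_size`, the dictionaries `rho_p` and `rho_u` (cross-color acceptance probabilities for the rich-get-richer and equal-chance mechanisms), and the use of "stub" lists for degree-proportional sampling are all choices made here for illustration. The sketch assumes every acceptance probability is positive, so the retry loop terminates.

```python
import random
from collections import Counter


def simulate_gshm(steps, alpha, r, eta, xi, rho_p, rho_u, seed=0):
    """Grow a bipartite member-group network for `steps` time steps.

    rho_p / rho_u map a member color ("R" or "B") to the probability of
    accepting a group of the other color under the rich-get-richer /
    equal-chance mechanism (hypothetical parameter layout).
    Returns (group_color, group_size) as parallel lists.
    """
    rng = random.Random(seed)
    member_color = ["R", "B"]      # initial red and blue member
    group_color = ["R", "B"]       # initial red and blue group
    group_size = [1, 1]
    # One entry per edge endpoint: uniform sampling from these lists is
    # sampling proportional to degree.
    member_stubs = [0, 1]
    group_stubs = [0, 1]

    for _ in range(steps):
        # Member growth: new member, or an existing one picked by degree.
        if rng.random() < alpha:
            member_color.append("R" if rng.random() < r else "B")
            m = len(member_color) - 1
        else:
            m = rng.choice(member_stubs)
        c = member_color[m]

        if rng.random() < eta:
            # Group growth: the member creates a group of its own color.
            group_color.append(c)
            group_size.append(1)
            g = len(group_color) - 1
        else:
            # Connection growth: retry until a connection is accepted.
            while True:
                if rng.random() < xi:          # rich-get-richer
                    g = rng.choice(group_stubs)
                    accept = rho_p[c]
                else:                          # equal-chance
                    g = rng.randrange(len(group_color))
                    accept = rho_u[c]
                if group_color[g] == c or rng.random() < accept:
                    break
            group_size[g] += 1

        member_stubs.append(m)
        group_stubs.append(g)

    return group_color, group_size


def red_group_ratio_by_size(group_color, group_size):
    """Fraction of red groups among all groups of each observed size."""
    total = Counter(group_size)
    red = Counter(s for c, s in zip(group_color, group_size) if c == "R")
    return {k: red[k] / total[k] for k in sorted(total)}
```

Calling `simulate_gshm` with homophily only on the rich-get-richer mechanism (for instance `rho_p = {"R": 0.5, "B": 0.5}` and `rho_u = {"R": 1.0, "B": 1.0}`, values chosen arbitrarily) and plotting the output of `red_group_ratio_by_size` against group size is one way to look for the two phases by eye. Many more steps and several repetitions would be needed before reading anything into the tail.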
We first investigate the size distributions of the red and blue groups in a bipartite network generated by the GSHM, and show that the number of red groups of size $k$, $G_k(R)$, and the number of blue groups of size $k$, $G_k(B)$, follow power laws under the GSHM.

Theorem 5.1. Let $\{\mathcal{N}(M \cup G, t, \Theta)\}$ be a sequence of networks produced by the GSHM. Assume that $\rho^{(p)}_R, \rho^{(p)}_B > 0$. The red group-size distribution $G_k(R)$ and the blue group-size distribution $G_k(B)$ asymptotically follow power-law distributions; specifically, as $t$ goes to infinity,
$$G_k(R) \propto k^{-\beta(R)}, \quad G_k(B) \propto k^{-\beta(B)}, \tag{2}$$
with $\beta(R) = 1 + \frac{1}{C_{R,1}}$ and $\beta(B) = 1 + \frac{1}{C_{B,1}}$, where
$$C_{R,1} := \frac{r(1-\eta)\xi}{1 - (1-\rho^{(p)}_R)\xi(1-\alpha^*) - (1-\rho^{(u)}_R)(1-\xi)(1-r)} + \frac{(1-r)(1-\eta)\rho^{(p)}_B\xi}{1 - (1-\rho^{(p)}_B)\xi\alpha^* - (1-\rho^{(u)}_B)(1-\xi)r}, \tag{3}$$
$$C_{B,1} := \frac{(1-r)(1-\eta)\xi}{1 - (1-\rho^{(p)}_B)\xi\alpha^* - (1-\rho^{(u)}_B)(1-\xi)r} + \frac{r(1-\eta)\rho^{(p)}_R\xi}{1 - (1-\rho^{(p)}_R)\xi(1-\alpha^*) - (1-\rho^{(u)}_R)(1-\xi)(1-r)}, \tag{4}$$
and $\alpha^*$ is the unique number in $(0, 1)$ satisfying
$$\alpha^* = r\eta + \frac{r(1-\eta)(\xi\alpha^* + (1-\xi)r)}{1 - (1-\rho^{(p)}_R)\xi(1-\alpha^*) - (1-\rho^{(u)}_R)(1-\xi)(1-r)} + \frac{(1-r)(1-\eta)(\rho^{(p)}_B\xi\alpha^* + \rho^{(u)}_B(1-\xi)r)}{1 - (1-\rho^{(p)}_B)\xi\alpha^* - (1-\rho^{(u)}_B)(1-\xi)r}. \tag{5}$$

Proof.
The proof employs recursive techniques similar to those of Theorem 4.12 in [3]. One challenging step in our setting is to show that $r^{(E,G)}_t(R) \to \alpha^*$ almost surely as $t \to \infty$. In [3] this is proved by constructing a Doob martingale $\Phi_t := E[T\, r^{(E,G)}_T(R) \mid \mathcal{F}_t]$ for $t \le T$, and showing that $|\Phi_t - \Phi_{t-1}|$ admits a suitable bound, so that Azuma's inequality for martingales yields a concentration inequality for $\alpha_t$. This method does not work in our case, since our model is more complicated and we do not have a direct bound on $|\Phi_t - \Phi_{t-1}|$. Instead, we prove that $Z_t := (r^{(E,G)}_t(R) - \alpha^*)^2$ is an almost supermartingale (introduced in [22]), which converges to a limit random variable $Z_\infty$ almost surely. We then prove that $\lim_{t \to \infty} E[Z_t] = 0$, which implies that $Z_\infty$ must equal 0 almost surely, and gives our desired result. We defer the detailed proof to the appendix.

We can use similar strategies to show that the member degrees also follow power laws with the same exponent.
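The quantities in Theorem 5.1 can be evaluated numerically; the sketch below is one way to do so and is not taken from the paper. It assumes a particular reading of Equations (3)-(5): each term is a fraction whose denominator is one minus the rejection probability, and the exponents are $\beta = 1 + 1/C$, the reading that agrees with Corollary 5.1. The function names, the dictionary layout of `rho_p` and `rho_u`, and the choice of bisection are illustrative. Bisection relies on the right-hand side of (5) exceeding $\alpha^*$ at 0 and falling below it at 1, which holds when $0 < r, \eta < 1$, $\xi > 0$ and all $\rho$ are positive.

```python
def _denominators(a, r, xi, rho_p, rho_u):
    """Acceptance-normalising denominators for red and blue members."""
    d_red = 1 - (1 - rho_p["R"]) * xi * (1 - a) - (1 - rho_u["R"]) * (1 - xi) * (1 - r)
    d_blue = 1 - (1 - rho_p["B"]) * xi * a - (1 - rho_u["B"]) * (1 - xi) * r
    return d_red, d_blue


def alpha_star(r, eta, xi, rho_p, rho_u, tol=1e-12):
    """Solve the fixed-point equation (5) for alpha* by bisection on (0, 1)."""

    def gap(a):
        d_red, d_blue = _denominators(a, r, xi, rho_p, rho_u)
        rhs = (
            r * eta
            + r * (1 - eta) * (xi * a + (1 - xi) * r) / d_red
            + (1 - r) * (1 - eta)
            * (rho_p["B"] * xi * a + rho_u["B"] * (1 - xi) * r) / d_blue
        )
        return rhs - a

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def group_exponents(r, eta, xi, rho_p, rho_u):
    """Return alpha*, C_{R,1}, C_{B,1} and the assumed exponents 1 + 1/C."""
    a = alpha_star(r, eta, xi, rho_p, rho_u)
    d_red, d_blue = _denominators(a, r, xi, rho_p, rho_u)
    c_r1 = r * (1 - eta) * xi / d_red + (1 - r) * (1 - eta) * rho_p["B"] * xi / d_blue
    c_b1 = (1 - r) * (1 - eta) * xi / d_blue + r * (1 - eta) * rho_p["R"] * xi / d_red
    return {
        "alpha_star": a,
        "C_R1": c_r1,
        "C_B1": c_b1,
        "beta_R": 1 + 1 / c_r1,
        "beta_B": 1 + 1 / c_b1,
    }
```

As a sanity check on this reading, setting every $\rho$ to 1 (no homophily) collapses (5) to a linear equation whose solution is $\alpha^* = r$, and gives $C_{R,1} = C_{B,1} = (1-\eta)\xi$, so neither color has a heavier tail.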
Theorem 5.2. (proof in appendix) Let $\{\mathcal{N}(M \cup G, t, \Theta)\}$ be a sequence of networks produced by GSHM. The red member-degree distribution $M_k(R)$ and the blue member-degree distribution $M_k(B)$ asymptotically follow power-law distributions with the same exponent; specifically, as $t$ goes to infinity, M k ( R ) ∝ k − ( − α ) , M k ( B ) ∝ k − ( − α ) . (6)

The existence of the tail glass-ceiling effect follows directly from Theorem 5.1.
Corollary 5.1. Let $\{\mathcal{N}(M \cup G, t, \Theta)\}$ be a sequence of networks produced by GSHM. Let $\beta(R), \beta(B)$ be as defined in Theorem 5.1. Then
• when $\beta(R) < \beta(B)$, $\{\mathcal{N}(M \cup G, t, \Theta)\}$ exhibits the tail glass-ceiling effect against the blue groups;
• when $\beta(R) > \beta(B)$, $\{\mathcal{N}(M \cup G, t, \Theta)\}$ exhibits the tail glass-ceiling effect against the red groups;
• when $\beta(R) = \beta(B)$ or $\xi = 0$, $\{\mathcal{N}(M \cup G, t, \Theta)\}$ does not exhibit the tail glass-ceiling effect.

Proof. Assume $\beta(R) < \beta(B)$. Let $k(n) := n^{1/\beta(B)}$. Then
$$E[\mathrm{top}^{(G)}_k(B)] = n \sum_{k' \ge k} G_{k'}(B) = O\left(n \cdot n^{-\beta(B)/\beta(B)}\right) = O(1); \tag{7}$$
and, for some $\epsilon > 0$,
$$E[\mathrm{top}^{(G)}_k(R)] = n \sum_{k' \ge k} G_{k'}(R) = \Omega\left(n \cdot n^{-\beta(R)/\beta(B)}\right) = \Omega\left(n^{1 - \beta(R)/\beta(B)}\right) = \Omega(n^{\epsilon}). \tag{8}$$

Corollary 5.2.
A network sequence $\{\mathcal{N}(M \cup G, t, \Theta)\}$ generated by SHM with selective homophily on equal-chance leads to no tail glass-ceiling effect for groups.

Proof. SHM with selective homophily on equal-chance implies $\rho^{(u)}_R = \rho^{(u)}_B =: \rho$ and $\rho^{(p)}_R = \rho^{(p)}_B = 1$, which yields
$$C_{R,1} = C_{B,1} = \frac{r(1-\eta)\xi}{1 - (1-\rho)(1-\xi)(1-r)} + \frac{(1-r)(1-\eta)\xi}{1 - (1-\rho)(1-\xi)r}. \tag{9}$$

5.3.2 Chasm

We are now ready to prove the first result on the monotonicity of the minority ratio in homophilous networks, from a novel analysis of the distribution. Suppose a network produced by the GSHM has a tail glass-ceiling effect against red groups; the following theorem provides the necessary and sufficient condition for the chasm effect to occur.
Theorem 5.3. With the same notation as in Theorem 5.1, assume $C_{R,1} < C_{B,1}$. Then the group ratio sequence $\{G_k(R)/G_k(B), k \ge 1\}$ has the chasm effect against red if and only if $k^* > 1$, where
$$k^* := \frac{(1 + C_{R,1})(1 + C_{B,2}) - (1 + C_{R,2})(1 + C_{B,1})}{C_{R,1} - C_{B,1}}, \tag{10}$$
and
$$C_{R,2} := \frac{r(1-\eta)(1-\xi)/\eta}{1 - (1-\rho^{(p)}_R)\xi(1-\alpha^*) - (1-\rho^{(u)}_R)(1-\xi)(1-r)} + \frac{(1-r)(1-\eta)\rho^{(u)}_B(1-\xi)/\eta}{1 - (1-\rho^{(p)}_B)\xi\alpha^* - (1-\rho^{(u)}_B)(1-\xi)r}; \tag{11}$$
$$C_{B,2} := \frac{(1-r)(1-\eta)(1-\xi)/\eta}{1 - (1-\rho^{(p)}_B)\xi\alpha^* - (1-\rho^{(u)}_B)(1-\xi)r} + \frac{r(1-\eta)\rho^{(u)}_R(1-\xi)/\eta}{1 - (1-\rho^{(p)}_R)\xi(1-\alpha^*) - (1-\rho^{(u)}_R)(1-\xi)(1-r)}. \tag{12}$$
Moreover, when $k^* > 1$, the monotonicity of $\{G_k(R)/G_k(B), k \ge 1\}$ changes at $[k^*]$, which is the largest integer smaller than $k^*$.

Proof. We first define
$$g_{\mathrm{ratio}}(k) := \frac{G_k(R)/G_k(B)}{G_{k-1}(R)/G_{k-1}(B)} = \frac{1 - \dfrac{1 + C_{R,1}}{1 + kC_{R,1} + C_{R,2}}}{1 - \dfrac{1 + C_{B,1}}{1 + kC_{B,1} + C_{B,2}}}. \tag{13}$$
To see the monotonicity of $\{G_k(R)/G_k(B)\}$, it suffices to compare $g_{\mathrm{ratio}}(k)$ with 1. Note that
$$g_{\mathrm{ratio}}(k) > 1 \iff \frac{1 + C_{R,1}}{1 + kC_{R,1} + C_{R,2}} < \frac{1 + C_{B,1}}{1 + kC_{B,1} + C_{B,2}}. \tag{14}$$
Putting the two fractions over a common denominator and collecting the terms in $k$, we have
$$\frac{1 + C_{R,1}}{1 + kC_{R,1} + C_{R,2}} - \frac{1 + C_{B,1}}{1 + kC_{B,1} + C_{B,2}} = \frac{(k - k^*)(C_{B,1} - C_{R,1})}{(1 + kC_{R,1} + C_{R,2})(1 + kC_{B,1} + C_{B,2})}. \tag{15}$$
Since the denominator is positive and $C_{B,1} - C_{R,1} > 0$, we have $g_{\mathrm{ratio}}(k) > 1$ for $k < k^*$, and $g_{\mathrm{ratio}}(k) < 1$ for $k > k^*$. When $k^* < 1$, $g_{\mathrm{ratio}}(k) < 1$ for all $k > 1$, and the sequence is therefore monotonically decreasing.

Corollary 5.3.
Following the notation defined in Theorem 5.1 and Theorem 5.3, a network sequence $\{\mathcal{N}(M \cup G, t, \Theta)\}$ produced by GSHM has a group chasm effect against the red groups if and only if $C_{R,1} < C_{B,1}$ and $k^* > 1$.

Corollary 5.4. A network sequence $\{\mathcal{N}(M \cup G, t, \Theta)\}$ generated by SHM with the general homophily mechanism leads to no chasm effect.

Proof. The general selective homophily is equivalent to setting $\rho^{(u)}_R = \rho^{(p)}_R = \rho^{(u)}_B = \rho^{(p)}_B$ in the GSHM. It is easy to see that, for some positive constant $\gamma > 0$, we have $C_{R,2} = \gamma C_{R,1}$ and $C_{B,2} = \gamma C_{B,1}$. Substituting this relation into the expression for $k^*$, we have
$$k^* = \frac{(1 + C_{R,1})(1 + \gamma C_{B,1}) - (1 + C_{B,1})(1 + \gamma C_{R,1})}{C_{R,1} - C_{B,1}} = 1 - \gamma < 1. \tag{16}$$

Corollary 5.5. A network sequence $\{\mathcal{N}(M \cup G, t, \Theta)\}$ generated by SHM with no equal-chance mechanism in the model leads to no chasm effect.

Proof. Removing the equal-chance (opportunity) mechanism from SHM is equivalent to setting $\xi = 1$ in the GSHM. It is easy to check that $C_{R,2} = C_{B,2} = 0$, and thus $k^* = 1$.

So far, our analysis of bipartite networks has focused mainly on groups. We observed in Section 3.3 that the average member ratio in groups of a fixed size is also non-monotone. The following lemma calculates the average red member ratio among groups of size 1, and among groups whose size goes to infinity. When both values are below $r$, we can say that the member ratio is non-monotone.

Lemma 5.1.
For the red member ratios within groups of size 1, and within groups whose size goes to infinity, we have:
• For groups of size 1,
$$\lim_{t \to \infty} r^{(M,G)}_{1,t}(R) = \frac{G_1(R)}{G_1(R) + G_1(B)}$$
= 1 + C B, + C B, C R, + C R, + C B, + C B, . (17)
• For groups whose size goes to infinity, assume $C_{R,1} < C_{B,1}$; then
$$\lim_{k \to \infty} \lim_{t \to \infty} r^{(M,G)}_{k,t}(R) = r^{(M,G)}(R), \tag{19}$$
where $r^{(M,G)}(R)$ is defined as
$$r^{(M,G)}(R) = \frac{q_{RB}}{q_{RB} + q_{BB}}, \tag{20}$$
with
$$q_{RB} = r\rho^{(p)}_R \left(1 - (1-\rho^{(p)}_B)\xi\alpha^* - (1-\rho^{(u)}_B)(1-\xi)r\right), \tag{21}$$
$$q_{BB} = (1-r)\left(1 - (1-\rho^{(p)}_R)\xi(1-\alpha^*) - (1-\rho^{(u)}_R)(1-\xi)(1-r)\right). \tag{22}$$

Proof.
Due to the space limit, we provide a proof sketch here; the detailed proof is in Appendix F. For groups of size 1, since there is exactly one red (blue) member in every red (blue) group of size 1, the ratio $r^{(M,G)}_{1,t}(R)$ is the same as the ratio of red groups among all groups of size 1, which goes to $G_1(R)/(G_1(R) + G_1(B))$. For groups whose size goes to infinity, we first prove (in Lemma F.1) that the portion of red members in red (blue) groups of size $k$ goes to $p_{RR,k}$ ($p_{RB,k}$), whose expressions are provided in Lemma F.1. Under the condition $C_{R,1} < C_{B,1}$, blue groups dominate the large groups, and thus $r^{(M,G)}(R)$ should be the limit of $p_{RB,k}$ as $k$ goes to infinity, which we analyze in Appendix F.

In the previous sections, we have noticed that all the real-data observations we present in Sections 3.2 and 3.3 may be present in networks generated by GSHM. A natural question is then: can GSHM simulate the chasm effect and the glass-ceiling effect of real social networks? We evaluate its ability to reproduce both effects from real social networks.

To do so, we first need to infer parameters from the real dataset. The minority-majority ratio $r$, the member growth rate $\alpha$, and the group growth rate $\gamma$ can be directly calculated from the dataset. Note that $\alpha^*$, defined in (62), and $r^{(M,G)}$, defined in (116), can also be inferred from the dataset. We then can solve a system of equations for $\xi, \rho^{(p)}_R, \rho^{(p)}_B, \rho^{(u)}_R, \rho^{(u)}_B$ from the five equations in (45), (46), and (119).

Figure 7: Model fits: our simulated data captures both the glass-ceiling effect and the chasm. It also captures the distribution of group/member counts for different group sizes (radius of yellow circles).

We test this inference procedure with the QQ dataset, by simulating 10 networks of size 50,000 using the inferred parameters, and taking the mean of group ratios and member ratios, as well as the average counts for groups of a certain size. We present our result in Figure 7. For the female-dominated group ratio, we see that it demonstrates both the glass-ceiling effect and the chasm effect. Moreover, our simulation locates the group size where the monotonicity of the ratio changes. This re-confirms our calculation in Theorem 5.3. For the average female member ratio, we see that it again exhibits both the glass-ceiling effect and the chasm effect. However, it does not locate where the monotonicity changes. We are not surprised by this inaccuracy, as our generalized model only extends SHM by allowing different homophily levels, and we expect real social networks to be more complicated. For better performance, a more complex model may be needed.

The presence of hegemony in networks has already been linked to important consequences for the fairness of many graph algorithms [23]. We now present two examples where our identified chasm effect, which contrasts with the tail effect, sheds light on the fairness of targeted advertisement and content moderation.
The nature of classified ads went through a seismic shift with the advent of Craigslist. Employers posted job opportunities online, providing an additional advantage to people with access to computers and good internet connections. In the last two decades, recruitment strategies evolved further; prospective employees are targeted on LinkedIn or Facebook based on self-uploaded profiles. But these approaches can be exclusionary or discriminatory, perhaps inadvertently, and expensive. Nowadays, recruiters often post job openings in large interest groups on social networks as a way to organically reach a larger, more diverse audience without paying a premium for targeted advertising or job boards. However, since the make-up of groups is not uniform, the strategy adopted can impact the diversity of candidates applying to open positions. To cast a wide net, the focus may be on larger groups, which may increase the gender imbalance. However, the chasm effect shows that there is a group-size threshold that, if adopted, can help ensure a more diverse net is cast, with the job posting reaching more women. Acknowledging the existence of this threshold and attempting to determine the optimal threshold could go a long way in reducing the implicit biases in the hiring process.

In detail, consider the advertising strategy that places ads in groups of size greater than or equal to $k_A$. Let $r^{(A)}(k_A)$ be the ratio of red members among all the members seeing the ads, in the limit $t \to \infty$. We have the following theorem, whose proof is deferred to Appendix C.

Theorem 6.1. Assume the red member ratios for very small and very large groups are smaller than the average red member ratio $r$ in the network. There exist $0 < k^{\mathrm{lower}}_A \le k^{\mathrm{upper}}_A$, such that
• For $k_A > k^{\mathrm{upper}}_A$, $r^{(A)}(k_A) < r$;
• For $k_A < k^{\mathrm{lower}}_A$, $r^{(A)}(k_A) > r$.

We examine this result empirically on the QQ dataset, and we see (in Figure 8) that we can choose $k^{\mathrm{lower}}_A = k^{\mathrm{upper}}_A = 63$. That is, if the group-size threshold is larger than 63, the advertising strategy favors males; on the other hand, if it is less than or equal to 63, the advertising strategy favors females.

Figure 8: Advertisement simulations on QQ: consider the advertising strategy that places ads in groups larger than a threshold. We see that, as the threshold gets lower, the advertising strategy changes from favoring males to favoring females in terms of member exposures.

The conditions of the information landscape have deteriorated significantly over the course of the last decade. Conspiracy theories, false rumors about people and events, and hateful content have all been amplified. Group chats and end-to-end encrypted chats have not escaped this fate. They are rife with the same malcontent and have the added problem that many conversations are not subject to scrutiny. One of the many approaches to address this ecosystem is to rely on fact-checkers who identify pieces of information to verify and provide in-depth analysis of their veracity. Fact-checking organizations scour different parts of the open web and social platforms to build up their database, and many have also set up additional tip-lines as one force to counteract the widespread misinformation.

Different fact-checking organizations have different strategies for prioritizing what to fact-check. Typically, it is based on a combination of importance (e.g. elections), relevance (e.g. breaking news events), the number of times an individual piece of content has been flagged, and the number of platforms on which it has been flagged.

In a highly simplified scenario where people have an equal tendency to report fake news when they see it, and the fact-checkers always prioritize news with more reports, one could ask whether prioritizing based on the number of reports is fair.
As news from larger groups is more likely to be checked, the glass-ceiling effect implies that, relative to the majority, fake news that originates or spreads among minority members might be less likely to be detected and removed; however, the chasm effect shows that this is not necessarily true.

Assume that the probability of fake news being detected in a group depends on the group size and the likelihood of all pieces of malcontent being detected. For simplicity, let $\theta \in [0, 1]$ be the strength of the detector, with $\theta = 1$ indicating that all fake news will be detected, and $\theta = 0$ indicating that nothing will be flagged for a fact-check. For each group of size $k$, denote by $h(k, \theta)$ the probability that fake news in the group is detected. Equivalently, $h(k, \theta)$ is the expected ratio of detected fake news over all fake news in the group. We assume the function $h(\cdot, \cdot)$ satisfies:
1. $h(\cdot, \cdot)$ is monotone increasing in group size: $h(k, \theta) < h(k + 1, \theta)$.
2. $h(\cdot, \cdot)$ is monotone increasing in detecting strength: $h(k, \theta_1) < h(k, \theta_2)$ for $0 \le \theta_1 < \theta_2 \le 1$.
3. $h(k, 0) = 0$, $h(k, 1) = 1$, and
$$\lim_{\theta \to 0} \frac{h(k, \theta)}{h(k + 1, \theta)} = 0, \quad \lim_{\theta \to 1} \frac{1 - h(k, \theta)}{1 - h(k + 1, \theta)} = \infty. \tag{23}$$

We make assumption (1) since fake news in larger groups is likely to be reported more times, and therefore has a higher probability of being detected. Assumption (2) makes sense since $\theta$ measures the detector's strength. Assumption (3) is a technical assumption, which means that larger groups dominate smaller groups in the following sense: first, as the strength of the detector goes to 0, $h(k, \theta)$ goes to 0 faster than $h(k + 1, \theta)$; second, as the strength of the detector goes to 1, $h(k, \theta)$ goes to 1 more slowly than $h(k + 1, \theta)$.

Regard $h(k, \theta)$ as the protection score of a group of size $k$, and let $r^{(D)}(\theta)$ be the ratio of red groups' scores over total scores, that is,
$$r^{(D)}(\theta) = \frac{\sum_{k \ge 1} G_k(R)\, h(k, \theta)}{\sum_{k \ge 1} (G_k(R) + G_k(B))\, h(k, \theta)}. \tag{24}$$
We then have the following theorem, whose proof is deferred to Appendix D.

Theorem 6.2.
Assume the red group ratio $G_k(R)/(G_k(R) + G_k(B))$ is less than the overall red group ratio $r$ for very small and very large groups in the network as $t \to \infty$. Then there exist $0 < \theta_{\mathrm{lower}} \le \theta_{\mathrm{upper}} < 1$, such that
• For $\theta > \theta_{\mathrm{upper}}$, $r^{(D)}(\theta) > r$;
• For $\theta < \theta_{\mathrm{lower}}$, $r^{(D)}(\theta) < r$.

Proof. We present only a proof sketch here; the detailed proof is deferred to Appendix D. As the detection strength goes to 0, by our assumption (3) on $h(k, \theta)$, only large groups matter for $r^{(D)}(\theta)$, where the ratio of red groups is less than $r$. Hence $\lim_{\theta \to 0} r^{(D)}(\theta) < r$. Similarly, we can show that $\lim_{\theta \to 1} r^{(D)}(\theta) > r$.

We examine this result empirically on WhatsApp, an end-to-end encrypted chat application, with a simulated fact-checking system. Assume that for fake news to be detected, it first needs to be reported to a fact-checking organization, which then prioritizes the fact-check. We assume that the number of reports received in a group of size $k$ follows a Poisson distribution with parameter $p \cdot k$, where $p$ captures the tendency to report fake news in the network. Without further assumptions, we set p = 0 . . The fact-checker ranks all the reported pieces of content by volume; if two items have the same number of reports, the fact-checker ranks the one from the larger group higher. Finally, the fact-checker sets a percentage threshold $P$ and checks the items ranked within the top $P\%$. We repeat this simulation 100 times, and report our findings in Figure 9.

Note that the percentage threshold corresponds to the likelihood of all pieces of malcontent being detected. We see in Figure 9 (a) that, as more fake news is detected, the protection ratio crosses the average anti-BJP ratio. That is, if the fact-checking organizations focus purely on the volume of reports, the process favors the majority. If there is an opportunity, however, to apply more resources to the fact-checking initiatives, the majority is no longer favored.
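The simulated fact-checking procedure described above can be sketched in a few lines. This is an illustrative sketch under assumptions that go beyond the text: each group is taken to hold exactly one piece of fake news, items with zero reports never reach the fact-checker, the reporting tendency `p` is supplied by the caller, and the names `sample_poisson` and `red_share_of_checked` are hypothetical. The Poisson sampler uses Knuth's multiplication method, which is adequate for the moderate rates that arise here.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= threshold:
            return count
        count += 1


def red_share_of_checked(groups, p, top_percent, seed=0):
    """Share of red groups among the items a fact-checker reviews.

    groups: list of (color, size) pairs with color "R" or "B".
    p: reporting tendency; reports in a group of size k ~ Poisson(p * k).
    top_percent: the fact-checker reviews the top `top_percent`% of
        reported items, ranked by report count, ties broken by group size.
    Returns None when nothing was reported.
    """
    rng = random.Random(seed)
    reported = []
    for color, size in groups:
        n_reports = sample_poisson(rng, p * size)
        if n_reports > 0:
            reported.append((n_reports, size, color))
    if not reported:
        return None
    # Highest report count first; larger group first on ties.
    reported.sort(key=lambda item: (item[0], item[1]), reverse=True)
    n_checked = max(1, math.ceil(len(reported) * top_percent / 100))
    checked = reported[:n_checked]
    return sum(1 for _, _, color in checked if color == "R") / len(checked)
```

Sweeping `top_percent` from small to large values and comparing the returned share with the overall share of red groups is a simple way to look for the crossing described in the text. Averaging over many seeds mirrors the repeated runs mentioned above.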
Similar trends are also found for the ratio of the number of times red groups (vs. blue groups) are checked, the ratio of the total number of people protected in red (vs. blue) groups, and the ratio of the total number of red members (vs. blue members) getting protected.

Figure 9: Fact-checking simulations on WhatsApp: the glass-ceiling effect suggests that fact-checkers always protect the majority more; however, we see that, as the detection strength gets larger, fact-checkers start protecting the minority more than the majority.

The graphs formed among us, and the structures of groups and communities connecting every individual, govern how today's information propagates and gets selectively curated. Bias is quick to emerge and interact with the simplest network primitives as well as more complex algorithmic rules, and this bias contributes to unequal opportunity among genders or disproportionate effects along political lines. Our results confirm that homophilous and rich-get-richer dynamics in the graph itself play a critical role in shaping the bias observed across multiple domains, paving the way for finding common ground to counteract observed disparities. As our theoretical results suggest and empirical results confirm, the bias inside the tail or within the bulk of a popularity distribution can vary widely in orientation. We refer to this as a chasm between seemingly opposing views, but explain that its causes are not always in disparate treatment and may be simple systemic effects of selective homophily. This observation is critical, as previous predictions of algorithmic bias on the tail are sometimes diametrically opposed to the case when a similar metric is examined at the lower end, including when selecting items for fact-checking or choosing groups for targeted advertisement.

To keep our model generally applicable, we focused on the most commonly found dynamics (opportunity and rich-get-richer), which span a range where popularity either plays no role or is entirely responsible for growth. This allowed us to identify the necessary and sufficient conditions for the observed chasm to emerge, but it remains a crude unifying model that leaves many domain-specific effects aside. We hope that our results encourage a renewed interest in a holistic view of either equitable representation or fairness guarantees for online content moderation. While each of those applications is beyond the scope of this paper, the empirical presence of a chasm and our simulations already suggest that, in order to achieve this goal, a new analysis beyond a narrow focus on tail effects is critical.

Acknowledgement

We would like to express our appreciation to Dr. Kiran Garimella and Prof. Dean Eckles from Massachusetts Institute of Technology for their generous help with the collection of the WhatsApp data. We would also like to thank Archis Chowdhury from BOOM for sharing his fact-checking experience with us.
References

[1] David A Cotter, Joan M Hermsen, Seth Ovadia, and Reeve Vanneman. The glass ceiling effect. Social Forces, 80(2):655–681, 2001.
[2] Laurie A Morgan. Glass-ceiling effect or cohort effect? A longitudinal study of the gender earnings gap for engineers, 1982 to 1989. American Sociological Review, pages 479–493, 1998.
[3] Chen Avin, Barbara Keller, Zvi Lotker, Claire Mathieu, David Peleg, and Yvonne-Anne Pignolet. Homophily and the glass ceiling effect in social networks. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, pages 41–50, 2015.
[4] Zhaopeng Qu and Zhong Zhao. Glass ceiling effect in urban China: Wage inequality of rural-urban migrants during 2002–2007. China Economic Review, 42:118–144, 2017.
[5] Ana-Andreea Stoica, Christopher Riederer, and Augustin Chaintreau. Algorithmic glass ceiling in social networks: The effects of social recommendations on network diversity. In Proceedings of the 2018 World Wide Web Conference, pages 923–932, 2018.
[6] Matthieu Latapy, Clémence Magnien, and Nathalie Del Vecchio. Basic notions for the analysis of large two-mode networks. Social Networks, 30(1):31–48, 2008.
[7] Stephen P Borgatti and Martin G Everett. Network analysis of 2-mode data. Social Networks, 19(3):243–270, 1997.
[8] Albert-Laszlo Barabási, Hawoong Jeong, Zoltan Néda, Erzsebet Ravasz, Andras Schubert, and Tamas Vicsek. Evolution of the social network of scientific collaborations. Physica A: Statistical Mechanics and its Applications, 311(3-4):590–614, 2002.
[9] Rashmi Pankajai Bomiriya. Topics in exponential random graph modeling. 2014.
[10] Carlos Castillo, Marcelo Mendoza, and Barbara Poblete. Information credibility on Twitter. In Proceedings of the 20th International Conference on World Wide Web, pages 675–684, 2011.
[11] Vahed Qazvinian, Emily Rosengren, Dragomir Radev, and Qiaozhu Mei. Rumor has it: Identifying misinformation in microblogs. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 1589–1599, 2011.
[12] Minyoung Huh, Andrew Liu, Andrew Owens, and Alexei A Efros. Fighting fake news: Image splice detection via learned self-consistency. In Proceedings of the European Conference on Computer Vision (ECCV), pages 101–117, 2018.
[13] Yang Liu and Yi-Fang Brook Wu. Early detection of fake news on social media through propagation path classification with recurrent and convolutional networks. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[14] Kai Shu, Amy Sliva, Suhang Wang, Jiliang Tang, and Huan Liu. Fake news detection on social media: A data mining perspective. ACM SIGKDD Explorations Newsletter, 19(1):22–36, 2017.
[15] Vanessa Wei Feng and Graeme Hirst. Detecting deceptive opinions with profile compatibility. In Proceedings of the Sixth International Joint Conference on Natural Language Processing, pages 338–346, 2013.
[16] Mahmoudreza Babaei, Abhijnan Chakraborty, Juhi Kulshrestha, Elissa M Redmiles, Meeyoung Cha, and Krishna P Gummadi. Analyzing biases in perception of truth in news stories and their implications for fact checking. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pages 139–139, 2019.
[17] Zhi-Qiang You, Xiao-Pu Han, Linyuan Lü, and Chi Ho Yeung. Empirical studies on the network of social groups: the case of Tencent QQ. PLoS One, 10(7):e0130538, 2015.
[18] Kiran Garimella and Dean Eckles. Images and misinformation in political groups: Evidence from WhatsApp in India. arXiv preprint arXiv:2005.09784, 2020.
[19] Kiran Garimella and Gareth Tyson. WhatsApp, doc? A first look at WhatsApp public group data. arXiv preprint arXiv:1804.01473, 2018.
[20] Lada A Adamic, Bernardo A Huberman, AL Barabási, R Albert, H Jeong, and G Bianconi. Power-law distribution of the world wide web. Science, 287(5461):2115–2115, 2000.
[21] Miller McPherson, Lynn Smith-Lovin, and James M Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1):415–444, 2001.
[22] Herbert Robbins and David Siegmund. A convergence theorem for non negative almost supermartingales and some applications. In Optimizing Methods in Statistics, pages 233–257. Elsevier, 1971.
[23] Ana-Andreea Stoica, Jessy Xinyi Han, and Augustin Chaintreau. Seeding network influence in biased networks and the benefits of diversity. In Proceedings of The Web Conference 2020, pages 2089–2098, 2020.
[24] Fan Chung and Linyuan Lu. Complex Graphs and Networks. Number 107. American Mathematical Soc., 2006.
[25] Noga Alon and Joel H Spencer. The Probabilistic Method. John Wiley & Sons, 2004.
[26] Stéphane Boucheron, Gábor Lugosi, and Pascal Massart. Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013.
A Proof of Theorem 5.1
Theorem 5.1.
Let $\{N(M \cup G, t, \Theta)\}$ be a sequence of networks produced by the BGMG model. Assume that $\rho^{(p)}_R, \rho^{(p)}_B > 0$. The red group-size distribution $G_k(R)$ and the blue group-size distribution $G_k(B)$ asymptotically follow power-law distributions; specifically, as $t$ goes to infinity,
\[ G_k(R) \propto k^{-\beta(R)}, \qquad G_k(B) \propto k^{-\beta(B)}, \tag{25} \]
with $\beta(R) = 1 + 1/C_{R,1}$ and $\beta(B) = 1 + 1/C_{B,1}$, where
\[ C_{R,1} := \frac{r(1-\eta)\xi}{1-(1-\rho^{(p)}_R)\xi(1-\alpha^*)-(1-\rho^{(u)}_R)(1-\xi)(1-r)} + \frac{(1-r)(1-\eta)\rho^{(p)}_B\xi}{1-(1-\rho^{(p)}_B)\xi\alpha^*-(1-\rho^{(u)}_B)(1-\xi)r}, \tag{26} \]
\[ C_{B,1} := \frac{(1-r)(1-\eta)\xi}{1-(1-\rho^{(p)}_B)\xi\alpha^*-(1-\rho^{(u)}_B)(1-\xi)r} + \frac{r(1-\eta)\rho^{(p)}_R\xi}{1-(1-\rho^{(p)}_R)\xi(1-\alpha^*)-(1-\rho^{(u)}_R)(1-\xi)(1-r)}, \tag{27} \]
and $\alpha^*$ is the unique number in $(0,1)$ satisfying
\[ \alpha^* = r\eta + \frac{r(1-\eta)(\xi\alpha^*+(1-\xi)r)}{1-(1-\rho^{(p)}_R)\xi(1-\alpha^*)-(1-\rho^{(u)}_R)(1-\xi)(1-r)} + \frac{(1-r)(1-\eta)(\rho^{(p)}_B\xi\alpha^*+\rho^{(u)}_B(1-\xi)r)}{1-(1-\rho^{(p)}_B)\xi\alpha^*-(1-\rho^{(u)}_B)(1-\xi)r}. \tag{28} \]

Proof.
We develop a recurrence for $E(G_{k,t}(R))$. First, define
\[ p^{RR}_t(k) := P(\text{a red member joins a red group with size } k \text{ at time } t), \tag{29} \]
\[ p^{BR}_t(k) := P(\text{a blue member joins a red group with size } k \text{ at time } t). \tag{30} \]
By our construction of the model, it is easy to check that
\[ p^{RR}_t(k) = \frac{\big(\alpha r+(1-\alpha)r^{(E,M)}_t(R)\big)(1-\eta)\Big(\xi\frac{k}{t}+(1-\xi)\frac{1}{G_t(R)+G_t(B)}\Big)}{1-(1-\rho^{(p)}_R)\xi\, r^{(E,G)}_t(B)-(1-\rho^{(u)}_R)(1-\xi)\frac{G_t(B)}{G_t(R)+G_t(B)}}, \tag{31} \]
\[ p^{BR}_t(k) = \frac{\big(\alpha(1-r)+(1-\alpha)(1-r^{(E,M)}_t(R))\big)(1-\eta)\Big(\rho^{(p)}_B\xi\frac{k}{t}+\rho^{(u)}_B(1-\xi)\frac{1}{G_t(R)+G_t(B)}\Big)}{1-(1-\rho^{(p)}_B)\xi\, r^{(E,G)}_t(R)-(1-\rho^{(u)}_B)(1-\xi)\frac{G_t(R)}{G_t(R)+G_t(B)}}. \tag{32} \]
Note that a red group of degree $k$ at time $t+1$ could have arisen from three scenarios:
1. at time $t$, it was a red group of size $k$, and no new member joins at time $t+1$;
2. at time $t$, it was a red group of size $k-1$, and a new member joins at time $t+1$;
3. in the special case of $k = 1$, a red group that did not exist at time $t$ can appear if a red person creates it.
Therefore,
\begin{align}
E(G_{k,t+1}(R) \mid \mathcal{F}_t) ={}& G_{k,t}(R)\big(1-p^{RR}_t(k)-p^{BR}_t(k)\big) \tag{33} \\
&+ G_{k-1,t}(R)\big(p^{RR}_t(k-1)+p^{BR}_t(k-1)\big), \tag{34}
\end{align}
where $\mathcal{F}_t$ is the $\sigma$-field containing the information of the graph until time $t$. Note that
\[ p^{RR}_t(k)+p^{BR}_t(k) = \frac{A_t(R)\,k+B_t(R)}{t}, \tag{35} \]
where
\begin{align}
A_t(R) :={}& \frac{\big(\alpha r+(1-\alpha)r^{(E,M)}_t(R)\big)(1-\eta)\xi}{1-(1-\rho^{(p)}_R)\xi\, r^{(E,G)}_t(B)-(1-\rho^{(u)}_R)(1-\xi)\frac{G_t(B)}{G_t(R)+G_t(B)}} \tag{36} \\
&+ \frac{\big(\alpha(1-r)+(1-\alpha)(1-r^{(E,M)}_t(R))\big)(1-\eta)\rho^{(p)}_B\xi}{1-(1-\rho^{(p)}_B)\xi\, r^{(E,G)}_t(R)-(1-\rho^{(u)}_B)(1-\xi)\frac{G_t(R)}{G_t(R)+G_t(B)}}, \tag{37}
\end{align}
\begin{align}
B_t(R) :={}& \frac{\big(\alpha r+(1-\alpha)r^{(E,M)}_t(R)\big)(1-\eta)(1-\xi)\frac{t}{G_t(R)+G_t(B)}}{1-(1-\rho^{(p)}_R)\xi\, r^{(E,G)}_t(B)-(1-\rho^{(u)}_R)(1-\xi)\frac{G_t(B)}{G_t(R)+G_t(B)}} \tag{38} \\
&+ \frac{\big(\alpha(1-r)+(1-\alpha)(1-r^{(E,M)}_t(R))\big)(1-\eta)\rho^{(u)}_B(1-\xi)\frac{t}{G_t(R)+G_t(B)}}{1-(1-\rho^{(p)}_B)\xi\, r^{(E,G)}_t(R)-(1-\rho^{(u)}_B)(1-\xi)\frac{G_t(R)}{G_t(R)+G_t(B)}}. \tag{39}
\end{align}
We then have
\[ E(G_{k,t+1}(R) \mid \mathcal{F}_t) = G_{k,t}(R)\Big(1-\frac{A_t(R)\,k+B_t(R)}{t}\Big) + G_{k-1,t}(R)\,\frac{A_t(R)(k-1)+B_t(R)}{t}. \tag{40} \]
When $k = 1$, taking the probability of a red group being created into consideration, we have
\[ E(G_{1,t+1}(R) \mid \mathcal{F}_t) = G_{1,t}(R)\Big(1-\frac{A_t(R)+B_t(R)}{t}\Big) + \alpha \cdot r \cdot \eta + (1-\alpha)\cdot r^{(E,M)}_t(R)\cdot \eta. \tag{41} \]
By Lemma E.1, we can show that
\[ \lim_{t\to\infty} A_t(R) = C_{R,1}, \qquad \lim_{t\to\infty} B_t(R) = C_{R,2}, \quad \text{a.s.}, \tag{42} \]
where
\[ C_{R,1} := \frac{r(1-\eta)\xi}{1-(1-\rho^{(p)}_R)\xi(1-\alpha^*)-(1-\rho^{(u)}_R)(1-\xi)(1-r)} + \frac{(1-r)(1-\eta)\rho^{(p)}_B\xi}{1-(1-\rho^{(p)}_B)\xi\alpha^*-(1-\rho^{(u)}_B)(1-\xi)r}, \tag{43} \]
\[ C_{R,2} := \frac{r(1-\eta)(1-\xi)/\eta}{1-(1-\rho^{(p)}_R)\xi(1-\alpha^*)-(1-\rho^{(u)}_R)(1-\xi)(1-r)} + \frac{(1-r)(1-\eta)\rho^{(u)}_B(1-\xi)/\eta}{1-(1-\rho^{(p)}_B)\xi\alpha^*-(1-\rho^{(u)}_B)(1-\xi)r}. \tag{44} \]
By Lemma A.1, $G_k(R)$ has the following expressions:
\[ G_1(R) = \frac{r\eta}{1+C_{R,1}+C_{R,2}}, \qquad G_k(R) = G_{k-1}(R)\,\frac{(k-1)C_{R,1}+C_{R,2}}{1+kC_{R,1}+C_{R,2}} \quad \forall k \ge 2. \tag{45} \]
This completes the proof for $G_k(R)$. We can use the same strategy for $G_k(B)$ and show that
\[ G_1(B) = \frac{(1-r)\rho_B\eta}{1+C_{B,1}+C_{B,2}}, \qquad G_k(B) = G_{k-1}(B)\,\frac{(k-1)C_{B,1}+C_{B,2}}{1+kC_{B,1}+C_{B,2}} \quad \forall k \ge 2, \tag{46} \]
where
\[ C_{B,1} := \frac{(1-r)(1-\eta)\xi}{1-(1-\rho^{(p)}_B)\xi\alpha^*-(1-\rho^{(u)}_B)(1-\xi)r} + \frac{r(1-\eta)\rho^{(p)}_R\xi}{1-(1-\rho^{(p)}_R)\xi(1-\alpha^*)-(1-\rho^{(u)}_R)(1-\xi)(1-r)}, \tag{47} \]
\[ C_{B,2} := \frac{(1-r)(1-\eta)(1-\xi)/\eta}{1-(1-\rho^{(p)}_B)\xi\alpha^*-(1-\rho^{(u)}_B)(1-\xi)r} + \frac{r(1-\eta)\rho^{(u)}_R(1-\xi)/\eta}{1-(1-\rho^{(p)}_R)\xi(1-\alpha^*)-(1-\rho^{(u)}_R)(1-\xi)(1-r)}. \tag{48} \]
Using the same argument as in the proof of [3, Theorem 4.12] completes the proof of the power-law results.

Lemma A.1. [24, Lemma 3.1] Let $(a_t)$, $(b_t)$, $(c_t)$ be three sequences such that $a_{t+1} = \big(1-\frac{b_t}{t}\big)a_t + c_t$, $\lim_{t\to\infty} b_t = b > 0$, and $\lim_{t\to\infty} c_t = c$. Then $\lim_{t\to\infty} a_t/t$ exists and its value is
\[ \lim_{t\to\infty} \frac{a_t}{t} = \frac{c}{1+b}. \tag{49} \]

B Proof of Theorem 5.2
Lemma 5.2.
Let $\{N(M \cup G, t, \Theta)\}$ be a sequence of networks produced by the BGMG model. The red member-degree distribution $M_k(R)$ and the blue member-degree distribution $M_k(B)$ asymptotically follow power-law distributions with the same exponent; specifically, as $t$ goes to infinity,
\[ M_k(R) \propto k^{-\left(1+\frac{1}{1-\alpha}\right)}, \qquad M_k(B) \propto k^{-\left(1+\frac{1}{1-\alpha}\right)}. \tag{50} \]

Proof.
For any $k > 1$, a red member of degree $k$ at time $t+1$ could have arisen from two scenarios:
1. at time $t$, it was a red member of degree $k$, and was not chosen at time $t+1$;
2. at time $t$, it was a red member of degree $k-1$, and was chosen at time $t+1$.
Thus,
\[ E(M_{k,t+1}(R) \mid \mathcal{F}_t) = M_{k,t}(R)\Big(1-(1-\alpha)\cdot\frac{k}{t}\Big) + M_{k-1,t}(R)\Big((1-\alpha)\cdot\frac{k-1}{t}\Big). \tag{51} \]
When $k = 1$, a red member of degree $1$ at time $t+1$ could have arisen from:
1. at time $t$, it was a red member of degree $1$, and was not chosen at time $t+1$;
2. a new member joins the network at time $t$.
Thus,
\[ E(M_{1,t+1}(R) \mid \mathcal{F}_t) = M_{1,t}(R)\Big(1-(1-\alpha)\cdot\frac{1}{t}\Big) + \alpha\cdot r. \tag{52} \]
Therefore, by Lemma A.1, $M_k(R)$ has the following expressions:
\[ M_1(R) = \frac{\alpha\cdot r}{2-\alpha}, \qquad M_k(R) = M_{k-1}(R)\,\frac{(1-\alpha)(k-1)}{1+(1-\alpha)\cdot k}. \tag{53} \]
Hence, $M_k(R) \propto k^{-\left(1+\frac{1}{1-\alpha}\right)}$. Exactly the same argument holds for $M_k(B)$.

C Proof of Theorem 6.1
Theorem 6.1.
Assume the red member ratios for very small and large groups are smaller than the average red member ratio $r$ in the network. There exist positive thresholds $k^{\mathrm{lower}}_A \le k^{\mathrm{upper}}_A$ such that
• For $k_A > k^{\mathrm{upper}}_A$, $r^{(A)}(k_A) < r$;
• For $k_A < k^{\mathrm{lower}}_A$, $r^{(A)}(k_A) > r$.

Proof.
Under our assumption, there exist positive thresholds $k^{\mathrm{lower}}_A \le k^{\mathrm{upper}}_A$ such that $\lim_{t\to\infty} r^{(M,G)}_{k,t} < r$ for $k > k^{\mathrm{upper}}_A$ and for $k < k^{\mathrm{lower}}_A$. Therefore, if $k_A > k^{\mathrm{upper}}_A$, all groups where ads are placed have limiting red member ratios less than $r$. Consequently, we must have $r^{(A)}(k_A) < r$. On the other hand, if $k_A < k^{\mathrm{lower}}_A$, the groups where ads are not placed have limiting red member ratios less than $r$, which means that among all the people not seeing the ads, the red member ratio is less than $r$. It further implies that among all the people seeing the ads, the red member ratio is greater than $r$, that is, $r^{(A)}(k_A) > r$.

D Proof of Theorem 6.2
Theorem 6.2.
Assume the red member ratios for very small and large groups are smaller than the average red member ratio $r$ in the network. There exist $0 < \theta_{\mathrm{lower}} < \theta_{\mathrm{upper}} < 1$ such that
• For $\theta > \theta_{\mathrm{upper}}$, $r^{(D)}(\theta) > r$;
• For $\theta < \theta_{\mathrm{lower}}$, $r^{(D)}(\theta) < r$.

Proof.
Under our assumption, there exist some < k lowerF < k upperF , such that for k > k upperF and k < k upperF G k ( R ) G k ( R ) + G k ( B ) < r. As θ → , by the assumption (23), we see that lim θ → (cid:80) k ≤ k upperF G k ( R ) h ( k, θ ) (cid:80) k>k upperF G k ( R ) h ( k, θ ) = 0 , lim θ → (cid:80) k ≤ k upperF G k ( B ) h ( k, θ ) (cid:80) k>k upperF G k ( B ) h ( k, θ ) = 0 , (54)which implies that lim θ → r ( D ) ( θ ) /r k upperF ( θ ) = 1 , where r k upperF ( θ ) := (cid:80) k>k upperF G k ( R ) h ( k, θ ) (cid:80) k>k upperF ( G k ( R ) h ( k, θ ) + G k ( B ) h ( k, θ )) < r. (55)Hence we see that there exists θ lower > , such that r ( D ) ( θ ) < r for θ < θ lower .As θ → , by the assumption (23), we see that lim θ → (cid:80) k ≥ k lowerF G k ( R )(1 − h ( k, θ )) (cid:80) k
Lemma E.1.
Under the assumption that $\rho^{(p)}_R, \rho^{(p)}_B > 0$, we have the following convergence results:
• The proportion of edges coming from red members converges; that is,
\[ \lim_{t\to\infty} r^{(E,M)}_t(R) = r \quad \text{a.s.} \tag{60} \]
• The ratio of red group counts over $t$ converges; that is,
\[ \lim_{t\to\infty} r^{(G)}_t(R) = r\eta \quad \text{a.s.} \tag{61} \]
• The proportion of edges coming from red groups converges; that is,
\[ \lim_{t\to\infty} r^{(E,G)}_t(R) = \alpha^* \quad \text{a.s.}, \tag{62} \]
where $\alpha^*$ is the unique number in $(0,1)$ satisfying
\[ \alpha^* = r\eta + \frac{r(1-\eta)(\xi\alpha^*+(1-\xi)r)}{1-(1-\rho^{(p)}_R)\xi(1-\alpha^*)-(1-\rho^{(u)}_R)(1-\xi)(1-r)} + \frac{(1-r)(1-\eta)(\rho^{(p)}_B\xi\alpha^*+\rho^{(u)}_B(1-\xi)r)}{1-(1-\rho^{(p)}_B)\xi\alpha^*-(1-\rho^{(u)}_B)(1-\xi)r}. \tag{63} \]
We divide the proof into three parts.

Part 1. Proof of (60)
Note that $E^{(M)}_t(R)$ is the total degree of red nodes at time $t$. By our model, given $r^{(E,M)}_t(R)$, the total degree of red nodes at time $t+1$ could take two values: $E^{(M)}_t(R)$ and $E^{(M)}_t(R)+1$, with probability $1-\alpha r-(1-\alpha)r^{(E,M)}_t(R)$ and $\alpha r+(1-\alpha)r^{(E,M)}_t(R)$ respectively. Recall that $\mathcal{F}_t$ is the $\sigma$-field containing the information of the graph up to time $t$. Therefore we have that
\[ E\big(E^{(M)}_{t+1}(R) \mid \mathcal{F}_t\big) = E^{(M)}_t(R) + \alpha r + (1-\alpha)r^{(E,M)}_t(R), \tag{64} \]
which gives
\[ E\big(r^{(E,M)}_{t+1}(R)-r \mid \mathcal{F}_t\big) = \frac{t+(1-\alpha)}{t+1}\big(r^{(E,M)}_t(R)-r\big). \tag{65} \]
Recall that our model starts from $t = 2$. Therefore
\[ E\big(r^{(E,M)}_{t+1}(R)-r\big) = \prod_{i=2}^{t}\frac{i+(1-\alpha)}{i+1}\big(r^{(E,M)}_2(R)-r\big) = O\Big(\exp\Big(-\sum_{i=2}^{t}\frac{\alpha}{i+1}\Big)\Big) = O(t^{-\alpha}). \tag{66} \]
Next we show a concentration inequality for $r^{(E,M)}_t(R)$. For $T > 2$, we define a Doob martingale: for $2 \le t \le T$,
\[ W_t := E\big(r^{(E,M)}_T(R)-r \mid \mathcal{F}_t\big) = \prod_{i=t}^{T-1}\frac{i+(1-\alpha)}{i+1}\big(r^{(E,M)}_t(R)-r\big). \tag{67} \]
Then $\{W_t, 2 \le t \le T\}$ is a martingale, with $W_T = r^{(E,M)}_T(R)-r$ and $W_2 = E[r^{(E,M)}_T(R)-r]$. Next we bound the difference between $W_t$ and $W_{t-1}$. We have that
\[ W_t - W_{t-1} = \prod_{i=t}^{T-1}\frac{i+(1-\alpha)}{i+1}\Big(\big(r^{(E,M)}_t(R)-r\big)-\frac{t-\alpha}{t}\big(r^{(E,M)}_{t-1}(R)-r\big)\Big). \tag{68} \]
Since $E^{(M)}_t(R)$ can take only the two values $E^{(M)}_{t-1}(R)$ and $E^{(M)}_{t-1}(R)+1$, we have $|r^{(E,M)}_t(R)-r^{(E,M)}_{t-1}(R)| = O(1/t)$. Thus
\[ W_t - W_{t-1} = \prod_{i=t}^{T-1}\frac{i+(1-\alpha)}{i+1}\,O(1/t) = O\Big(\exp\Big(-\sum_{i=t}^{T-1}\frac{\alpha}{i+1}\Big)\Big)O(1/t) = O\big(t^{-1}(T/t)^{-\alpha}\big). \tag{69} \]
Applying Azuma's inequality [25] for martingales, we get that there exist constants $c_1, c_2 > 0$ such that for any $T, x > 0$,
\[ P(|W_T - W_2| > x) \le \exp\Big(-\frac{c_1 x^2}{T^{-2\alpha}\sum_{t=1}^{T} t^{2\alpha-2}}\Big) \le \exp\big(-c_2 x^2 T^{\min(1,2\alpha)}/\log T\big), \tag{70} \]
where the last step is because
\[ T^{-2\alpha}\sum_{t=1}^{T} t^{2\alpha-2} = \begin{cases} O(T^{-2\alpha}), & \text{if } 2\alpha < 1; \\ O(T^{-1}\log T), & \text{if } 2\alpha = 1; \\ O(T^{-1}), & \text{if } 2\alpha > 1. \end{cases} \tag{71} \]
From (70) we have that, for any $\epsilon > 0$, the tail probability
\[ P\big(|r^{(E,M)}_T(R)-E[r^{(E,M)}_T(R)]| > \epsilon\big) = P(|W_T - W_2| > \epsilon) \]
is summable over $T$. By the Borel-Cantelli lemma, we see that $r^{(E,M)}_t(R)-E[r^{(E,M)}_t(R)] \to 0$ a.s., which together with (66) gives our desired result. Moreover, since we have already shown that $W_2 = O(T^{-\alpha})$, there exist constants $c_3, c_4 > 0$ such that for any $x > 0$,
\[ P(|W_T| > x) \le P(|W_T - W_2| > x - |W_2|) \le \exp\big(-c_3 \max(x-|W_2|, 0)^2\, T^{\min(1,2\alpha)}/\log T\big) \le \exp\big(-c_4 x^2 T^{\min(1,2\alpha)}/\log T\big). \tag{72} \]

Part 2. Proof of (61)
According to our model, at each time $t$, with probability $\alpha r+(1-\alpha)r^{(E,M)}_t(R)$ a red member adds an edge, and with probability $\eta$ the edge is added by creating a new red group. Consider the number of red groups in the model conditioned on a given sequence $\{r^{(E,M)}_t(R)\}_t$.

For each $t$, there are two cases:
(1) Case 1, $E^{(M)}_{t+1}(R) = E^{(M)}_t(R)+1$: a red member adds an edge at time $t$, and conditioned on $\{r^{(E,M)}_t(R)\}_t$, the probability that this edge is added by creating a new red group is $\eta$. This is because how the edge is added does not influence the value of $r^{(E,M)}_{t+1}(R)$ and thus does not influence $\{r^{(E,M)}_t(R)\}_t$; hence conditioning on $\{r^{(E,M)}_t(R)\}_t$ does not change the probability that the new edge is added by creating a group.
(2) Case 2, $E^{(M)}_{t+1}(R) = E^{(M)}_t(R)$: a blue member adds an edge at time $t$, and no red group is created.

We also have that the events $\{\text{a red group is created at time } t\}$ over different $t$ are independent conditioned on $\{r^{(E,M)}_t(R)\}_t$. Intuitively, this is because the probability of $\{\text{a red group is created at time } t\}$ only depends on the value of $E^{(M)}_{t+1}(R)-E^{(M)}_t(R)$. The independence claim could also be verified by writing out the posterior distribution of those events given $\{r^{(E,M)}_t(R)\}_t$.

Recall that our initial condition is that there is a red (blue) member with an edge to a red (blue) group, in total two members and two groups. Therefore, given $\{r^{(E,M)}_t(R)\}_t$, the number of red groups $G_t(R)$ satisfies that $G_t(R)-1$ follows a Binomial distribution $B(E^{(M)}_t(R)-1, \eta)$.
Therefore, by Hoeffding's inequality [26], we have that for any $x > 0$,
\[ P\Big(\Big|\frac{G_t(R)-1}{E^{(M)}_t(R)-1}-\eta\Big| > x \,\Big|\, \{r^{(E,M)}_t(R)\}_t\Big) \le 2\exp\big(-2\big(E^{(M)}_t(R)-1\big)x^2\big), \tag{73} \]
which further implies that
\[ P\Big(\Big|r^{(G)}_t(R)-\eta\, r^{(E,M)}_t(R)+\frac{\eta-1}{t}\Big| > x \,\Big|\, \{r^{(E,M)}_t(R)\}_t\Big) \le 2\exp\Big(-\frac{2x^2t^2}{E^{(M)}_t(R)-1}\Big) \le 2\exp\Big(-\frac{2x^2t^2}{t-1}\Big). \tag{74} \]
Hence for any $\epsilon > 0$, with probability 1 the tail probability
\[ P\Big(\Big|r^{(G)}_t(R)-\eta\, r^{(E,M)}_t(R)+\frac{\eta-1}{t}\Big| > \epsilon \,\Big|\, \{r^{(E,M)}_t(R)\}_t\Big) \]
is summable over $t$. By the Borel-Cantelli lemma, we see that $r^{(G)}_t(R)-\eta\, r^{(E,M)}_t(R)+(\eta-1)/t$ goes to 0 a.s., which, together with the fact that $r^{(E,M)}_t(R) \to r$ a.s., gives $r^{(G)}_t(R) \to r\eta$ a.s.

Moreover, since by the triangle inequality
\[ \big|r^{(G)}_t(R)-r\eta\big| \le \Big|r^{(G)}_t(R)-\eta\, r^{(E,M)}_t(R)+\frac{\eta-1}{t}\Big| + \frac{1-\eta}{t} + \big|\eta\, r^{(E,M)}_t(R)-\eta r\big|, \tag{75} \]
we see that for any $x > 0$,
\[ \Big\{\big|r^{(G)}_t(R)-r\eta\big| > x\Big\} \subset \Big\{\Big|r^{(G)}_t(R)-\eta\, r^{(E,M)}_t(R)+\frac{\eta-1}{t}\Big| > \frac{x}{2}-\frac{1-\eta}{t}\Big\} \cup \Big\{\big|\eta\, r^{(E,M)}_t(R)-\eta r\big| > \frac{x}{2}\Big\}. \tag{76} \]
Therefore, for the unconditional tail probability of $r^{(G)}_t(R)-r\eta$, we have
\[ P\big(\big|r^{(G)}_t(R)-r\eta\big| > x\big) \le P\Big(\Big|r^{(G)}_t(R)-\eta\, r^{(E,M)}_t(R)+\frac{\eta-1}{t}\Big| > \frac{x}{2}-\frac{1-\eta}{t}\Big) + P\Big(\big|r^{(E,M)}_t(R)-r\big| \ge \frac{x}{2\eta}\Big). \]
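The Hoeffding step in (73) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: `n` stands in for $E^{(M)}_t(R)-1$, the values of `n`, `eta`, and `x` are arbitrary choices, and the helper names `binomial_tail` and `hoeffding_bound` are hypothetical, not part of the paper. It compares the exact two-sided tail of a Binomial$(n, \eta)$ proportion with the bound $2\exp(-2nx^2)$.

```python
import math


def binomial_tail(n: int, eta: float, x: float) -> float:
    """Exact P(|X/n - eta| > x) for X ~ Binomial(n, eta)."""
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - eta) > x:
            total += math.comb(n, k) * eta**k * (1 - eta) ** (n - k)
    return total


def hoeffding_bound(n: int, x: float) -> float:
    """Two-sided Hoeffding bound 2 exp(-2 n x^2), the form used in (73)."""
    return 2.0 * math.exp(-2.0 * n * x * x)


if __name__ == "__main__":
    eta = 0.2  # illustrative group-creation probability (assumption)
    for n in (50, 200):
        for x in (0.05, 0.1):
            print(n, x, binomial_tail(n, eta, x), hoeffding_bound(n, x))
```

The bound is loose for small `n` but decays exponentially in `n`, which is what makes the tail probabilities summable in the Borel-Cantelli step.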
Note that the unconditional version of (74) also holds, since the right-hand side does not depend on $\{r^{(E,M)}_t(R)\}_t$. Together with (72), we have that there exists a constant $c_5 > 0$ such that for any $x > 0$,
\[ P\big(\big|r^{(G)}_t(R)-r\eta\big| > x\big) \le 2\exp\Big(-\frac{2\big(x/2-(1-\eta)/t\big)^2t^2}{t-1}\Big) + \exp\Big(-c_4\Big(\frac{x}{2\eta}\Big)^2 t^{\min(1,2\alpha)}/\log t\Big) \le \exp\big(-c_5 x^2 t^{\min(1,2\alpha)}/\log t\big). \tag{77} \]

Part 3. Proof of (62) and (63)
Recall that $E^{(G)}_t(R)$ is the total degree of red groups. Similar to Part 1, at each time $t+1$, $E^{(G)}_{t+1}(R)$ could take two values: $E^{(G)}_t(R)$ and $E^{(G)}_t(R)+1$. By our definition of the model, one can verify that the probability that $E^{(G)}_{t+1}(R) = E^{(G)}_t(R)+1$ is a function of $r^{(E,G)}_t(R)$, $r^{(E,M)}_t(R)$, $r^{(G)}_t(R)$, $r^{(G)}_t(B)$, which we denote by $H\big(r^{(E,G)}_t(R), r^{(E,M)}_t(R), r^{(G)}_t(R), r^{(G)}_t(B)\big)$, and it takes the following expression:
\begin{align}
H(x,y,z,w) :={}& (\alpha r+(1-\alpha)y)\eta + \frac{(\alpha r+(1-\alpha)y)(1-\eta)\big(\xi x+(1-\xi)\frac{z}{z+w}\big)}{1-(1-\rho^{(p)}_R)\xi(1-x)-(1-\rho^{(u)}_R)(1-\xi)\frac{w}{w+z}} \tag{78} \\
&+ \frac{(\alpha(1-r)+(1-\alpha)(1-y))(1-\eta)\big(\rho^{(p)}_B\xi x+\rho^{(u)}_B(1-\xi)\frac{z}{z+w}\big)}{1-(1-\rho^{(p)}_B)\xi x-(1-\rho^{(u)}_B)(1-\xi)\frac{z}{w+z}}. \tag{79}
\end{align}
We have already seen that $r^{(E,M)}_t(R) \to r$ a.s. and $r^{(G)}_t(R) \to r\eta$ a.s. Similarly, $r^{(G)}_t(B) \to (1-r)\eta$ a.s. We denote
\begin{align}
F(x) ={}& H(x, r, r\eta, (1-r)\eta) \tag{80} \\
={}& r\eta + \frac{r(1-\eta)(\xi x+(1-\xi)r)}{1-(1-\rho^{(p)}_R)\xi(1-x)-(1-\rho^{(u)}_R)(1-\xi)(1-r)} + \frac{(1-r)(1-\eta)(\rho^{(p)}_B\xi x+\rho^{(u)}_B(1-\xi)r)}{1-(1-\rho^{(p)}_B)\xi x-(1-\rho^{(u)}_B)(1-\xi)r}. \tag{81}
\end{align}
We have the following lemma, whose proof is deferred to Appendix G.

Lemma E.2.
Under the assumption that $\rho^{(p)}_R, \rho^{(p)}_B > 0$, $F(x)$ satisfies:
1. $F(x)$ has exactly one fixed point, denoted $\alpha^*$, in $[0,1]$;
2. There exists $\gamma < 1$ such that for any $x \in (0,1)$,
\[ |F(\alpha^*)-F(x)| \le \gamma\,|\alpha^*-x|. \tag{82} \]

Let $\alpha^* \in (0,1)$ be the number satisfying $F(\alpha^*) = \alpha^*$. Similar to Part 1, we can calculate the second moment of $\alpha_{t+1}-\alpha^*$, where $\alpha_{t+1}$ denotes $r^{(E,G)}_{t+1}(R)$:
\begin{align}
E\big((\alpha_{t+1}-\alpha^*)^2 \mid \mathcal{F}_t\big) ={}& \Big(\frac{t\, r^{(E,G)}_t(R)}{t+1}-\alpha^*\Big)^2\Big(1-H\big(r^{(E,G)}_t(R), r^{(E,M)}_t(R), r^{(G)}_t(R), r^{(G)}_t(B)\big)\Big) \notag \\
&+ \Big(\frac{t\, r^{(E,G)}_t(R)+1}{t+1}-\alpha^*\Big)^2 H\big(r^{(E,G)}_t(R), r^{(E,M)}_t(R), r^{(G)}_t(R), r^{(G)}_t(B)\big) \tag{83} \\
={}& I^{(1)}_t + I^{(2)}_t + I^{(3)}_t, \tag{84}
\end{align}
where
\[ I^{(1)}_t = \frac{t^2\big(r^{(E,G)}_t(R)-\alpha^*\big)^2 + 2t\big(r^{(E,G)}_t(R)-\alpha^*\big)\Big((1-\alpha^*)F\big(r^{(E,G)}_t(R)\big)-\alpha^*\big(1-F\big(r^{(E,G)}_t(R)\big)\big)\Big)}{(t+1)^2}, \tag{85} \]
\[ I^{(2)}_t = \frac{(\alpha^*)^2\Big(1-H\big(r^{(E,G)}_t(R), r^{(E,M)}_t(R), r^{(G)}_t(R), r^{(G)}_t(B)\big)\Big)}{(t+1)^2} + \frac{(1-\alpha^*)^2 H\big(r^{(E,G)}_t(R), r^{(E,M)}_t(R), r^{(G)}_t(R), r^{(G)}_t(B)\big)}{(t+1)^2}, \tag{86} \]
\[ I^{(3)}_t = \frac{2t\big(r^{(E,G)}_t(R)-\alpha^*\big)\,\Delta\big(r^{(E,G)}_t(R), r^{(E,M)}_t(R), r^{(G)}_t(R), r^{(G)}_t(B)\big)}{(t+1)^2}, \tag{87} \]
with
\[ \Delta\big(r^{(E,G)}_t(R), r^{(E,M)}_t(R), r^{(G)}_t(R), r^{(G)}_t(B)\big) := H\big(r^{(E,G)}_t(R), r^{(E,M)}_t(R), r^{(G)}_t(R), r^{(G)}_t(B)\big) - F\big(r^{(E,G)}_t(R)\big). \tag{88} \]
We need the following lemmas.

Lemma E.3.
Under the assumption that $\rho^{(p)}_R, \rho^{(p)}_B > 0$, there exists $c > 0$ such that for any $x, y, z, w \in (0,1)$,
\[ |\Delta(x,y,z,w)| < c\big(|y-r| + |z-r\eta| + |w-(1-r)\eta|\big). \tag{89} \]
We omit the proof of Lemma E.3, since it can be verified directly by checking that the first derivatives of $H(\cdot)$ are bounded.

Lemma E.4.
We have that
\[ \lim_{T\to\infty}\sum_{t=1}^{T}\frac{|r^{(E,M)}_t(R)-r| + |r^{(G)}_t(R)-r\eta| + |r^{(G)}_t(B)-(1-r)\eta|}{t} < \infty \quad \text{a.s.} \tag{90} \]
and
\[ \lim_{T\to\infty}\sum_{t=1}^{T}\frac{E\big[|r^{(E,M)}_t(R)-r| + |r^{(G)}_t(R)-r\eta| + |r^{(G)}_t(B)-(1-r)\eta|\big]}{t} < \infty. \tag{91} \]
The proof of Lemma E.4 is deferred to Appendix G.

Next we bound $I^{(1)}_t$, $I^{(2)}_t$, $I^{(3)}_t$. For $I^{(1)}_t$, by Lemma E.2 and the fact that $F(\alpha^*) = \alpha^*$, we have
\begin{align}
I^{(1)}_t ={}& \frac{t^2\big(r^{(E,G)}_t(R)-\alpha^*\big)^2 + 2t\big(r^{(E,G)}_t(R)-\alpha^*\big)\big(F\big(r^{(E,G)}_t(R)\big)-\alpha^*\big)}{(t+1)^2} \tag{92} \\
\le{}& \big(r^{(E,G)}_t(R)-\alpha^*\big)^2\Big(1-\frac{2t(1-\gamma)}{(t+1)^2}\Big). \tag{93}
\end{align}
For $I^{(2)}_t$, since $H(\cdot)$ is bounded by $1$, for some constant $c > 0$ we have
\[ I^{(2)}_t \le \frac{c}{(t+1)^2}. \tag{94} \]
From the expression of $I^{(3)}_t$, it is easy to see that for some $c > 0$,
\[ |I^{(3)}_t| < \frac{c\,\big|\Delta\big(r^{(E,G)}_t(R), r^{(E,M)}_t(R), r^{(G)}_t(R), r^{(G)}_t(B)\big)\big|}{t}. \tag{95} \]
Further, by Lemma E.3 and Lemma E.4, we have that
\[ \lim_{T\to\infty}\sum_{t=1}^{T} I^{(3)}_t < \infty \quad \text{a.s., and} \quad \lim_{T\to\infty}\sum_{t=1}^{T} E[I^{(3)}_t] < \infty. \tag{96} \]
We need the following lemma, whose proof is deferred to Appendix G.

Lemma E.5.
Let $(a_t)$, $(b_t)$, $(c_t)$ be three positive sequences such that $a_{t+1} \le b_t a_t + c_t$, $b_t < 1$, $\lim_{t\to\infty}\prod_{i=1}^{t} b_i = 0$, and $\lim_{t\to\infty}\sum_{i=1}^{t} c_i < \infty$. Then $\lim_{t\to\infty} a_t = 0$.

Let
\[ Z_t = \big(r^{(E,G)}_t(R)-\alpha^*\big)^2, \quad a_t = E(Z_t), \quad b_t = 1-\frac{2(1-\gamma)t}{(t+1)^2}, \quad c_t = E\big[I^{(2)}_t + I^{(3)}_t\big]. \tag{97} \]
By taking expectations in (84), we have that $a_{t+1} \le b_t a_t + c_t$. It is direct to check the conditions $b_t < 1$ and $\lim_{t\to\infty}\prod_{i=1}^{t} b_i = 0$. By (94) and (96), we have $\lim_{t\to\infty}\sum_{i=1}^{t} c_i < \infty$. Thus from Lemma E.5 we know that
\[ \lim_{t\to\infty} E(Z_t) = 0. \tag{98} \]
Since our goal is equivalent to showing that $Z_t \to 0$ a.s., we claim that it is enough to have that $Z_t$ converges to a limit random variable almost surely as $t \to \infty$. This is because, assuming that $\lim_{t\to\infty} Z_t$ exists a.s., since $Z_t$ is bounded, by the bounded convergence theorem we have $E(\lim_{t\to\infty} Z_t) = 0$. Since $Z_t \ge 0$, its limit must be nonnegative, and therefore $\lim_{t\to\infty} Z_t$ must equal 0 a.s., because its expectation is 0.

Now we show that $\lim_{t\to\infty} Z_t$ exists a.s., by checking that $\{Z_t\}$ is an almost supermartingale, since by [22] every almost supermartingale converges to a limit random variable almost surely. By [22], to make $\{Z_t\}$ an almost supermartingale, we just need to check that $\lim_{T\to\infty}\sum_{t=1}^{T}\big(I^{(2)}_t + I^{(3)}_t\big) < \infty$ a.s., which we have already proved. Therefore the proof is finished.

F Proof of Lemma 5.1
Lemma F.1.
We have that
\[ \lim_{t\to\infty} r^{(M,G)}_{k,t}(R,R) = r^{(M,G)}_k(R,R) := \frac{1+\sum_{j=2}^{k} p_{RR,j}}{k}, \tag{99} \]
\[ \lim_{t\to\infty} r^{(M,G)}_{k,t}(R,B) = r^{(M,G)}_k(R,B) := \frac{1+\sum_{j=2}^{k} p_{RB,j}}{k}, \tag{100} \]
where
\[ p_{RR,j} = \frac{p^{(0)}_{RR,j}}{p^{(0)}_{RR,j}+p^{(0)}_{BR,j}}, \qquad p_{RB,j} = \frac{p^{(0)}_{RB,j}}{p^{(0)}_{RB,j}+p^{(0)}_{BB,j}}, \tag{101} \]
\[ p^{(0)}_{RR,j} = r\,\frac{\xi j+(1-\xi)/\eta}{1-(1-\rho^{(p)}_R)\xi(1-\alpha^*)-(1-\rho^{(u)}_R)(1-\xi)(1-r)}, \tag{102} \]
\[ p^{(0)}_{BR,j} = (1-r)\,\frac{\rho^{(p)}_B\xi j+\rho^{(u)}_B(1-\xi)/\eta}{1-(1-\rho^{(p)}_B)\xi\alpha^*-(1-\rho^{(u)}_B)(1-\xi)r}, \tag{103} \]
\[ p^{(0)}_{RB,j} = r\,\frac{\rho^{(p)}_R\xi j+\rho^{(u)}_R(1-\xi)/\eta}{1-(1-\rho^{(p)}_R)\xi(1-\alpha^*)-(1-\rho^{(u)}_R)(1-\xi)(1-r)}, \tag{104} \]
\[ p^{(0)}_{BB,j} = (1-r)\,\frac{\xi j+(1-\xi)/\eta}{1-(1-\rho^{(p)}_B)\xi\alpha^*-(1-\rho^{(u)}_B)(1-\xi)r}. \tag{105} \]

Proof. We only prove the result for red groups. For a red group $J_R$ with size $j$ at time $t$, define the events
\[ \Gamma_{t,j} := \{\text{At time } t, \text{ an edge between a member and } J_R \text{ is added}\}, \tag{106} \]
\[ \Gamma_{t,j,R} := \{\text{At time } t, \text{ an edge between a red member and } J_R \text{ is added}\}, \tag{107} \]
\[ \Gamma_{t,j,B} := \{\text{At time } t, \text{ an edge between a blue member and } J_R \text{ is added}\}. \tag{108} \]
We then have that, by the definition of our model and Lemma E.1,
\[ P(\Gamma_{t,j,R})\cdot t = \frac{\big(\alpha r+(1-\alpha)r^{(E,M)}_t(R)\big)(1-\eta)\Big(\xi j+\frac{1-\xi}{r^{(G)}_t(R)+r^{(G)}_t(B)}\Big)}{1-(1-\rho^{(p)}_R)\xi\big(1-r^{(E,G)}_t(R)\big)-(1-\rho^{(u)}_R)(1-\xi)\frac{r^{(G)}_t(B)}{r^{(G)}_t(R)+r^{(G)}_t(B)}} \to p^{(0)}_{RR,j}, \]
where the convergence is for $t \to \infty$. Similarly, we have that
\[ P(\Gamma_{t,j,B})\cdot t \to p^{(0)}_{BR,j}. \tag{109} \]
By the Bayes formula, we see that as $t \to \infty$,
\[ P(\Gamma_{t,j,R} \mid \Gamma_{t,j}) = \frac{P(\Gamma_{t,j,R})}{P(\Gamma_{t,j,R})+P(\Gamma_{t,j,B})} \to p_{RR,j}. \tag{110} \]
We uniformly choose a red group $J_{k,R}$ at time $t$ among the red groups with size $k$. Define $t_1 < \ldots < t_k$ such that $t_j$ is the time a new member $M_j$ joins the chosen group $J_{k,R}$. By the construction of our model, $t_1$ is the time the group is created, and the first member is of color red. For each $j > 1$, at $t_j$ this group has size $j-1$. Note that as the graph size $t$ goes to infinity, since $J_{k,R}$ is uniformly chosen, we must have that $t_j \to \infty$ for each $j$. Therefore we have that
\[ E[\text{number of red members in } J_{k,R}] \to 1+\sum_{j=2}^{k} p_{RR,j}. \tag{111} \]
Recall that $G_{k,t}(R)$ is the number of red groups of size $k$ at time $t$. Since $J_{k,R}$ is uniformly chosen, we have that
\[ \frac{E[\text{number of red members in red groups with size } k]}{G_{k,t}(R)} \to 1+\sum_{j=2}^{k} p_{RR,j}, \tag{112} \]
which finishes the proof with the fact that
\[ r^{(M,G)}_{k,t}(R,R) = \frac{E[\text{number of red members in red groups with size } k]}{k\,G_{k,t}(R)}. \tag{113} \]

Corollary F.1.
We have that
\[ \lim_{t\to\infty} r^{(M,G)}_{k,t}(R) = \frac{G_k(R)\,r^{(M,G)}_k(R,R)}{G_k(R)\,r^{(M,G)}_k(R,R)+G_k(B)\,r^{(M,G)}_k(R,B)}. \tag{114} \]

Lemma F.2.
For the red member ratios within groups with size 1, and within groups with size goesto infinity, we have: • For groups with size 1, lim t →∞ r ( M,G )1 ,t ( R ) = G ( R ) G ( R ) + G ( B ) = 1 + C B, + C B, C R, + C R, + C B, + C B, . (116)(117)31 For groups with size goes to infinity, assume C R, < C B, , lim k →∞ lim t →∞ r ( M,G ) k,t ( R ) = r ( M,G ) ( R ) , (118) where r ( M,G ) is defined as r ( M,G ) ( R ) = q RB q RB + q BB , (119) with q RB = rρ ( p ) R (cid:16) − (1 − ρ ( p ) B ) ξα ∗ − (1 − ρ ( u ) B )(1 − ξ ) r (cid:17) , (120) q BB = (1 − r ) (cid:16) − (1 − ρ ( p ) R ) ξ (1 − α ∗ ) − (1 − ρ ( u ) R )(1 − ξ )(1 − r ) (cid:17) . (121) Proof.
Following Lemma F.1, for $r^{(M,G)}_{1,t}(R)$, since there is exactly 1 red (blue) member in a red (blue) group with size 1, we have that
\[ r^{(M,G)}_{1,t}(R) = \frac{G_{1,t}(R)}{G_{1,t}(R)+G_{1,t}(B)} \to \frac{G_1(R)}{G_1(R)+G_1(B)}. \tag{122} \]
For the case where $k \to \infty$, since we assume that there is a glass-ceiling effect against red members, as $k \to \infty$ we have that $G_k(R)/G_k(B) \to 0$. That is, we only need to focus on blue groups. As $j \to \infty$, it is easy to check that
\[ \lim_{j\to\infty} p_{RB,j} = r^{(M,G)}, \tag{123} \]
and consequently we have that
\[ \lim_{k\to\infty} r^{(M,G)}_{k,R} = r^{(M,G)}, \tag{124} \]
which finishes the proof.

G Proofs of Auxiliary Lemmas
G.1 Proof of Lemma E.5
Proof.
It is enough to show that, for any $\epsilon > 0$, there exists $T > 0$ such that $a_t < \epsilon$ for all $t > T$. First, since $c_t$ is summable, we can find $T_1 > 0$ such that $\sum_{t>T_1} c_t < \epsilon/2$. Also, since $\lim_{t\to\infty}\prod_{i=1}^{t} b_i = 0$, we can find $T_2 > T_1$ such that $\prod_{i=T_1+1}^{t-1} b_i \cdot \big(a_1+\sum_{i>0} c_i\big) < \epsilon/2$ for all $t > T_2$. We claim that $T_2$ is the desired $T$. Without loss of generality, in the rest of the proof we denote $c_0 = a_1$. By induction, it is not hard to obtain the following bound for $a_t$:
\[ a_t \le \prod_{i=1}^{t-1} b_i\, c_0 + \prod_{i=2}^{t-1} b_i\, c_1 + \prod_{i=3}^{t-1} b_i\, c_2 + \ldots + c_{t-1} = \sum_{s=0}^{t-1}\prod_{i=s+1}^{t-1} b_i\, c_s. \tag{125} \]
We can further decompose the summation on the right-hand side into two parts, according to $s \le T_1$ and $s > T_1$. Now, for any $t > T_2$, for the first part, by our choice of $T_2$ and the fact that $b_i < 1$, we have that
\[ \sum_{s=0}^{T_1}\prod_{i=s+1}^{t-1} b_i\, c_s \le \sum_{s=0}^{T_1}\prod_{i=T_1+1}^{t-1} b_i\, c_s = \prod_{i=T_1+1}^{t-1} b_i \sum_{s=0}^{T_1} c_s < \epsilon/2. \tag{126} \]
For the second part, by our choice of $T_1$ and the fact that $b_i < 1$, we simply have that
\[ \sum_{s=T_1+1}^{t-1}\prod_{i=s+1}^{t-1} b_i\, c_s \le \sum_{s=T_1+1}^{t-1} c_s < \epsilon/2. \tag{127} \]
Combining the above two inequalities, and since $\epsilon$ is arbitrary, we finish the proof.

G.2 Proof of Lemma E.4

Proof.
First, it is enough to show (91), since if it holds, by the monotone convergence theorem, we have
\begin{align}
& E\Big[\lim_{T\to\infty}\sum_{t=1}^{T}\frac{|r^{(E,M)}_t(R)-r| + |r^{(G)}_t(R)-r\eta| + |r^{(G)}_t(B)-(1-r)\eta|}{t}\Big] \tag{128} \\
&\quad = \lim_{T\to\infty}\sum_{t=1}^{T}\frac{E\big[|r^{(E,M)}_t(R)-r| + |r^{(G)}_t(R)-r\eta| + |r^{(G)}_t(B)-(1-r)\eta|\big]}{t} < \infty, \tag{129}
\end{align}
which directly implies (90).

We claim that, for a stochastic process $\{w_t\}_t$, in order to show that $\lim_{T\to\infty}\sum_{t=1}^{T} E[|w_t|]/t < \infty$, it is enough to have that, for some $\delta, c > 0$ and for any $x > 0$,
\[ P(|w_t| > x) \le \exp\big(-c\,x^2 t^{\delta}\big). \tag{130} \]
This is because (130) implies that $E[|w_t|] = O(t^{-\delta/2})$, which makes $E[|w_t|]/t$ summable.

By (77) and (72), we see that $r^{(E,M)}_t(R)-r$ and $r^{(G)}_t(R)-r\eta$ satisfy the tail bound (130). So does $r^{(G)}_t(B)-(1-r)\eta$, since $r^{(G)}_t(B)$ has the same behavior as $r^{(G)}_t(R)$. The proof is finished.

G.3 Proof of Lemma E.2
Proof.
We define $K(x)$ as
\[ (F(x)-x)\Big(1-(1-\rho^{(p)}_R)\xi(1-x)-(1-\rho^{(u)}_R)(1-\xi)(1-r)\Big)\Big(1-(1-\rho^{(p)}_B)\xi x-(1-\rho^{(u)}_B)(1-\xi)r\Big). \tag{131} \]
By the definition of $F(x)$, it is easy to see that $K(x)$ is a degree 3 polynomial with a positive coefficient for the $x^3$ term. Therefore, $\lim_{x\to-\infty} K(x) = -\infty$ and $\lim_{x\to\infty} K(x) = \infty$. Since a degree 3 polynomial has at most 3 real roots, if we have $K(0) > 0$ and $K(1) < 0$, then $K(x)$ has exactly one root in $(0,1)$. Moreover, for $x \in [0,1]$, since $\rho^{(p)}_R, \rho^{(p)}_B > 0$,
\begin{align}
K(x)/(F(x)-x) &> \Big(1-(1-\rho^{(p)}_R)\xi-(1-\rho^{(u)}_R)(1-\xi)(1-r)\Big)\Big(1-(1-\rho^{(p)}_B)\xi-(1-\rho^{(u)}_B)(1-\xi)r\Big) \tag{132} \\
&> \big(1-\xi-(1-\xi)(1-r)\big)\big(1-\xi-(1-\xi)r\big) \tag{133} \\
&= (1-\xi)r(1-\xi)(1-r) \ge 0, \tag{134}
\end{align}
which implies that $F(x)-x$ and $K(x)$ share the same sign in $(0,1)$. Hence if $K(0) > 0$ and $K(1) < 0$, we have that $F(x)-x$ has exactly one root $\alpha^*$ in $(0,1)$. Moreover, for $x \in (0,1)$, $F(x)-x > 0$ if $x < \alpha^*$, and $F(x)-x < 0$ if $x > \alpha^*$. This implies that $F(x)-F(\alpha^*) < x-\alpha^*$ if $x > \alpha^*$, and $F(x)-F(\alpha^*) > x-\alpha^*$ if $x < \alpha^*$, which leads to the fact that for $x \in [0,1]$ with $x \ne \alpha^*$,
\[ 0 < \Big|\frac{F(x)-F(\alpha^*)}{x-\alpha^*}\Big| < 1. \tag{135} \]
One can check that $|F'(\alpha^*)| < 1$. Taking the supremum over $x$ in the above inequality, since $|(F(x)-F(\alpha^*))/(x-\alpha^*)|$ is a continuous function, the supremum is achieved at some point $x_0$. If $x_0 \ne \alpha^*$, we can set $\gamma = |(F(x_0)-F(\alpha^*))/(x_0-\alpha^*)| < 1$; if $x_0 = \alpha^*$, we can set $\gamma = |F'(\alpha^*)| < 1$.
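As a numerical companion to Lemma E.2 and Theorem 5.1, the following sketch locates the fixed point $\alpha^*$ of $F$ by bisection on $F(x)-x$ and then evaluates $C_{R,1}$, $C_{B,1}$ and the exponents $\beta(R) = 1+1/C_{R,1}$, $\beta(B) = 1+1/C_{B,1}$. It is a minimal illustration, not the authors' code: the class `Params`, the function names, and every numeric parameter value are assumptions chosen for the example, and bisection is used only because the sign pattern of $F(x)-x$ established above makes it applicable when $F(0) > 0$ and $F(1) < 1$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """Illustrative BGMG parameters (all numeric values used below are assumptions)."""
    r: float        # fraction of red members
    eta: float      # probability that a new edge creates a new group
    xi: float       # weight of preferential (vs. uniform) group choice
    rho_p_R: float  # homophily parameters rho^(p)_R, rho^(u)_R, rho^(p)_B, rho^(u)_B
    rho_u_R: float
    rho_p_B: float
    rho_u_B: float


def denom_R(p: Params, x: float) -> float:
    # 1 - (1 - rho^(p)_R) xi (1 - x) - (1 - rho^(u)_R)(1 - xi)(1 - r)
    return 1 - (1 - p.rho_p_R) * p.xi * (1 - x) - (1 - p.rho_u_R) * (1 - p.xi) * (1 - p.r)


def denom_B(p: Params, x: float) -> float:
    # 1 - (1 - rho^(p)_B) xi x - (1 - rho^(u)_B)(1 - xi) r
    return 1 - (1 - p.rho_p_B) * p.xi * x - (1 - p.rho_u_B) * (1 - p.xi) * p.r


def F(p: Params, x: float) -> float:
    """Right-hand side of the fixed-point equation, as in (81)."""
    red = p.r * (1 - p.eta) * (p.xi * x + (1 - p.xi) * p.r) / denom_R(p, x)
    blue = ((1 - p.r) * (1 - p.eta)
            * (p.rho_p_B * p.xi * x + p.rho_u_B * (1 - p.xi) * p.r) / denom_B(p, x))
    return p.r * p.eta + red + blue


def alpha_star(p: Params, tol: float = 1e-12) -> float:
    """Bisection on F(x) - x over [0, 1], assuming F(0) > 0 and F(1) < 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(p, mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exponents(p: Params):
    """Return (alpha*, beta(R), beta(B)) using C_{R,1} and C_{B,1} from (26) and (27)."""
    a = alpha_star(p)
    d_r, d_b = denom_R(p, a), denom_B(p, a)
    c_r1 = p.r * (1 - p.eta) * p.xi / d_r + (1 - p.r) * (1 - p.eta) * p.rho_p_B * p.xi / d_b
    c_b1 = (1 - p.r) * (1 - p.eta) * p.xi / d_b + p.r * (1 - p.eta) * p.rho_p_R * p.xi / d_r
    return a, 1 + 1 / c_r1, 1 + 1 / c_b1


if __name__ == "__main__":
    example = Params(r=0.3, eta=0.2, xi=0.8,
                     rho_p_R=0.5, rho_u_R=0.5, rho_p_B=0.5, rho_u_B=0.5)
    print(exponents(example))
```

A quick consistency check: when all four homophily parameters equal 1, both denominators reduce to 1, the fixed-point equation becomes linear with solution $\alpha^* = r$, and $C_{R,1} = C_{B,1} = (1-\eta)\xi$, so both groups share the same exponent.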