Networks with Growth and Preferential Attachment: Modeling and Applications
Gabriel G. Piva,1,2,∗,† Fabiano L. Ribeiro,‡,§ and Angélica S. Mata¶,∗∗
1 Departamento de Física, Universidade Federal de Lavras, 37200-900 Lavras, MG, Brazil
2 Departamento de Física, Pontifícia Universidade Católica do Rio de Janeiro, 22451-900 Rio de Janeiro, RJ, Brazil
∗ https://orcid.org/0000-0001-7636-6568
† Electronic address: [email protected]
‡ https://orcid.org/0000-0002-2719-6061
§ Electronic address: fribeiro@ufla.br
¶ https://orcid.org/0000-0002-3892-5274
∗∗ Electronic address: angelica.mata@ufla.br

In this article we presented a brief study of the main network models with growth and preferential attachment. Such models are interesting because they reproduce several characteristics of real systems. We started with the classical model proposed by Barabási and Albert [1], in which nodes are added to the network and connect preferentially to nodes that are already highly connected. We also presented models that consider elements more representative of social systems, such as the homophily between vertices or the fitness of each node to build connections [2, 3]. Furthermore, we showed a version of these models that includes the Euclidean distance between nodes in the preferential attachment rule [4]. Our objective is to investigate basic properties of these networks, such as the degree distribution, degree correlation, shortest path length and clustering coefficient, and how these characteristics are affected by the preferential attachment rules. Finally, we also provided a comparison of these synthetic networks with real ones. We found that characteristics such as homophily, fitness and geographic distance are significant preferential attachment rules for modeling real networks. These rules can change the form of the degree distribution of the synthetic network models and make them more suitable for modeling real networks.

I. INTRODUCTION

Complex systems have become a widely applied area of research because almost everything around us can be described by a complex network, including social, technological and biological systems. Growth and preferential attachment, in which a node has a higher probability of connecting to another node that already has many edges, are famous ingredients [1] for producing a power-law degree distribution, a topology frequently used to describe real systems.

In general, it has been shown that real networks present a power-law degree distribution with 2 < γ < [...] that real networks, ruled by growth and preferential attachment, have a power-law degree distribution with an exponential cutoff.

In this paper, we investigated social and technological real networks and found that they can be modeled by networks with growth and preferential attachment. To account for more realistic aspects, we considered other ingredients in the preferential attachment, such as fitness [2], homophily [3], and the Euclidean distance between nodes [4]. Indeed, social systems often present these kinds of connection features [16, 17], and real-world systems in general are often embedded in Euclidean space [18–21]. We investigated phone call [11], collaboration [13] and email [12] networks. The first two are social networks because they describe family, friendship and/or professional interactions, while the email network behaves as a technological network.
We also found that the email network presents a more “scale-free” behavior in its degree distribution, while the social networks are better described by a q-exponential degree distribution, according to the model proposed by Soares and collaborators [4].

The paper is organized as follows. Section II gives a detailed description of network models with growth and different preferential attachment rules, where we also studied some properties of these networks, such as degree distribution and assortativity. The main information and results about the networks are summarized in Table I. In section III, we provided a comparison of these synthetic networks with real ones. Finally, we presented our final considerations in section IV.

II. NETWORK MODELS WITH GROWTH AND PREFERENTIAL ATTACHMENT
A network model has properties similar to those of real systems. Networks are considered a powerful tool to represent patterns of connections between parts of systems such as the Internet, power grids, food webs, social networks, etc. [8, 22, 23]. Some particular metrics, like the degree distribution, the shortest path length, and the clustering coefficient, have attracted the attention of the physics community.

Watts and Strogatz [24] showed that real networks are characterized by a short average shortest path distance between two vertices and a large clustering coefficient, and described these properties with the small-world model.

Based on that, Barabási and Albert [1] proposed two basic mechanisms that try to better characterize a real network: growth of the system, through the addition of new agents, and preferential attachment, where a new agent connects preferentially to the most connected nodes already in the network. The web, for instance, expands through the addition of new documents that link to older or well-known sites. The probability that a new node will connect to a node with k links is proportional to k, independently of geographic distance.

However, there are other examples of real networks whose connectivity may depend on the geographic distance between the nodes, such as a power grid. In addition to geographic distance, there may be other relevant ingredients to consider when connecting the elements of the system. Social interactions between people have intrinsic characteristics that should be taken into account, for example the influence one person has on another and the affinity between them, representing friendship, family or professional ties.

To model these features, several networks have been studied over the years. We presented below some of them, which consider preferential attachment rules according to the degree (Barabási-Albert model [1]), the fitness of the node to make connections [2], the homophily between nodes [3], and finally the Euclidean distance between the nodes [4].

A. Barabási-Albert Network
To explain in a simple way the behavior of technological networks, such as the Internet, Barabási and Albert [1] proposed the following model:

• The system starts with m nodes connected to each other.
• At each time step, a new node j is added to the network and connects to a node i chosen at random with probability $\Pi(k_i|j)$ proportional to its degree $k_i$, which means

$$\Pi(k_i|j) = \frac{k_i}{\sum_n k_n}, \tag{1}$$

where the normalization $\sum_n k_n$ is the sum over the degrees $k_n$ of all nodes n already in the network.

These rules define what is known as the Barabási-Albert (BA) model, and generate a network whose distribution of connectivity, say $P(k)$, follows a power law of the form $P(k) \sim k^{-\gamma}$, with γ = 3.0 in the thermodynamic limit, independently of the value of m, as shown in figure 1.

Figure 1: Distribution of the connectivity degree P(k) of the BA network. Dots are the average over 10 networks of size N = 10 and m = 3. The dashed line has slope $P(k) \sim k^{-3}$ and serves as a guide for the eyes.

We can also calculate the clustering coefficient, say $\langle C \rangle$, of the BA network [25]. It measures the tendency of the network to form fully connected sub-graphs in the neighborhood of a given vertex, and scales with the network size N as

$$\langle C \rangle \sim \frac{[\ln(N)]^2}{N}. \tag{2}$$

We showed this behavior in figure 2. The simulation data follow the same trend as given by equation 2.

Figure 2: Clustering coefficient as a function of the network size for the BA network. The average was over 100 samples. The dots on the dashed line represent the theoretical value calculated from Eq. 2 and the dots on the continuous line are obtained from simulations.
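The growth rule of Eq. (1) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to produce the figures: the function names `barabasi_albert` and `degree_distribution` are hypothetical, and it assumes that each new node creates m links to distinct existing nodes (as in the usual formulation of the model) and that m ≥ 2 so that the fully connected seed has at least one link.

```python
import random
from collections import Counter


def barabasi_albert(n_nodes, m, seed=None):
    """Grow a BA-style network; returns a list of undirected edges (i, j).

    Assumptions of this sketch: m >= 2, and every new node makes m links
    to m distinct existing nodes.
    """
    rng = random.Random(seed)
    # Seed: m nodes all connected to each other.
    edges = [(i, j) for i in range(m) for j in range(i + 1, m)]
    # Each node appears in `stubs` once per link it has, so a uniform draw
    # from `stubs` picks node i with probability k_i / sum_n k_n (Eq. 1).
    stubs = [node for edge in edges for node in edge]
    for new in range(m, n_nodes):
        targets = set()
        while len(targets) < m:      # redraw until m distinct targets
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


def degree_distribution(edges):
    """Return {k: P(k)} estimated from an edge list."""
    degree = Counter(node for edge in edges for node in edge)
    counts = Counter(degree.values())
    n = len(degree)
    return {k: c / n for k, c in sorted(counts.items())}
```

Keeping one list entry per link end avoids recomputing the normalization $\sum_n k_n$ at every step; plotting the output of `degree_distribution` on a log-log scale for a large network is one way to compare against the power-law form quoted in the text.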
Another important measure of networks is the shortest path length. The distance between any two nodes i and j, named $d_{ij}$, is defined as the number of links in the shortest path that connects them. The average over the shortest paths between all possible pairs of vertices in the network is called the average shortest path length $\langle d \rangle$ [25]. For the BA network, it is given by $\langle d \rangle \sim \log N / \log(\log N)$, confirming its small-world property [22].

Another feature that should be analysed is the degree correlation. The nodes of a network can present a tendency to connect with other nodes that have a similar or dissimilar degree. In the first case the network is said to be assortative, and in the second case it is said to be disassortative [9].

The simplest and most used way to quantify the degree correlation is the average degree of the nearest neighbors (nn) of a vertex i with degree $k_i$ [23],

$$k_{nn,i} = \frac{1}{k_i} \sum_{j \in \mathcal{N}(i)} k_j, \tag{3}$$

where the sum runs over the nearest-neighbor vertices of i, represented by the set $\mathcal{N}(i)$. The degree correlation is obtained from the average degree of the nearest neighbors, $k_{nn}(k)$, for vertices of degree k [26]. That is,

$$k_{nn}(k) = \frac{1}{N_k} \sum_{i | k_i = k} k_{nn,i}, \tag{4}$$

where $N_k$ is the number of nodes of degree k and the sum runs over all vertices with the same degree k. This quantity is related to the correlations between the degrees of connected nodes because, on average, it can be expressed as

$$k_{nn}(k) = \sum_{k'} k' P(k'|k), \tag{5}$$

where $P(k'|k)$ is the probability that a node with degree k has a neighbor with degree k'. If the degrees of neighboring vertices are uncorrelated, $P(k'|k)$ is just a function of k' and $k_{nn}(k)$ is a constant. If $k_{nn}$ increases with k, then vertices with high degrees have a larger likelihood of being connected to each other. If $k_{nn}$ decreases with k, high-degree vertices have larger probabilities of having neighbors with low degrees [26, 27].

The BA network is weakly disassortative, as we showed in figure 3. We observe that the preferential attachment interferes only in the connectivity of nodes recently added to the network. According to the rule of the model, these nodes connect preferentially with hubs, creating a disassortative correlation for small values of k. But as the degree grows, the network becomes almost uncorrelated.

Figure 3: Degree correlation measured through the nearest-neighbors degree. A BA network of size N = 10 was used, averaged over 10 samples. The preferential attachment rule of the BA network affects only the connectivity of nodes recently added to the network, the ones with small k. They connect primarily with hubs, creating a disassortative correlation for small values of k. As the degree grows, the network becomes almost uncorrelated.

We can also use the Pearson coefficient, named $c_P$, to quantify degree correlations, according to the expression [27]

$$c_P = \frac{\sum_e j_e k_e / E - \left[ \sum_e (j_e + k_e) / (2E) \right]^2}{\sum_e (j_e^2 + k_e^2) / (2E) - \left[ \sum_e (j_e + k_e) / (2E) \right]^2}, \tag{6}$$

where $j_e$ and $k_e$ are the degrees of the nodes at the two ends of the edge e, and E is the total number of connections. This quantity ranges from −1 to 1 [...] $k_{nn}(k)$ measure. While the latter shows how the degree correlation varies with k, the Pearson coefficient ($c_P$) quantifies the degree correlation of the entire network on a scale ranging from −1 to 1. This measure was also used to complement the characterization of a topological phase transition in the growth and preferential attachment model that considers the Euclidean distance between the nodes, as we will see in section II D.
In addition, it will be useful for comparing the synthetic networks with real ones in section III.

All the main information about the BA network is summarized in Table I, together with the information about the other networks that were also treated in this paper. In general, real networks present a power-law degree distribution with 2 < γ < [...] γ ≈ 3. Next, we show other features that can be added to the model to make it more realistic.
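As an illustration of how the two correlation measures of this subsection can be evaluated on an edge list, the sketch below computes $k_{nn}(k)$ from Eqs. (3) and (4) and the Pearson coefficient $c_P$ from Eq. (6). The function names are hypothetical and the network is assumed to be undirected, without self-loops or repeated links.

```python
from collections import defaultdict


def degrees(edges):
    """Degree k_i of every node that appears in the edge list."""
    k = defaultdict(int)
    for i, j in edges:
        k[i] += 1
        k[j] += 1
    return dict(k)


def knn_of_k(edges):
    """Average nearest-neighbour degree k_nn(k), Eqs. (3) and (4)."""
    k = degrees(edges)
    neigh_sum = defaultdict(int)      # sum of neighbour degrees, per node
    for i, j in edges:
        neigh_sum[i] += k[j]
        neigh_sum[j] += k[i]
    by_degree = defaultdict(list)
    for node, total in neigh_sum.items():
        by_degree[k[node]].append(total / k[node])   # k_nn,i of Eq. (3)
    return {deg: sum(v) / len(v) for deg, v in sorted(by_degree.items())}


def pearson_degree_correlation(edges):
    """Pearson coefficient c_P of Eq. (6), summed over the edges.

    The denominator vanishes when all nodes have the same degree; this
    sketch does not handle that case.
    """
    k = degrees(edges)
    n_edges = len(edges)
    s_prod = sum(k[i] * k[j] for i, j in edges) / n_edges
    s_mean = sum(k[i] + k[j] for i, j in edges) / (2 * n_edges)
    s_sq = sum(k[i] ** 2 + k[j] ** 2 for i, j in edges) / (2 * n_edges)
    return (s_prod - s_mean ** 2) / (s_sq - s_mean ** 2)
```

For a star with one hub and four leaves, `[(0, 1), (0, 2), (0, 3), (0, 4)]`, every link joins the highest degree to the lowest one, so `pearson_degree_correlation` gives −1 and `knn_of_k` decreases with k, which is the disassortative limit described in the text.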
B. Fitness Model: Bianconi-Barabási Network

The original BA model produces a power-law network with sites that become privileged, that is, that accumulate more connections over time. But this model does not take into account competitiveness, that is, the ability of younger nodes to acquire new neighbors. Facebook, for example, became one of the most visited sites in a short period of time when compared to Google, an older search website. Another example is the growth of corporations, where some newer ones concentrate more services than older ones. We can simulate this situation by including an intrinsic characteristic in each node, called fitness. In social networks, fitness would represent an individual's tendency to become more popular due to some personal quality. In networks, it represents the ability of a node to become a hub quickly.

Table I: Main information about the networks that were investigated in this work. The mean clustering coefficient $\langle C \rangle$ and the Pearson correlation coefficient $c_P$ are obtained for a sample of 1000 networks with size N = 10. The average shortest path length $\langle d \rangle$ is obtained for a sample of at least 20 networks with the same size. For networks with Euclidean distance we always consider $\alpha_A = 3$, since the topological phase transition occurs for $\alpha_A \approx 2$. The preferential attachment rule is shown in the column $\Pi(k_i|j)$. P(k) is the form of the degree distribution of each model, and γ is the exponent of the power-law degree distribution that characterizes the first three networks that were investigated.

Network | $\Pi(k_i|j)$ | $P(k)$ | γ | $\langle C \rangle$ | $\langle d \rangle$ | $c_P$
Barabási-Albert | $\frac{k_i}{\sum_n k_n}$ | $\sim k^{-\gamma}$ | [...] | [...] | [...] | [...]
Fitness Model | $\frac{\eta_i k_i}{\sum_n \eta_n k_n}$ | $\sim k^{-\gamma}$ | [...] | [...] | [...] | [...]
Homophilic Model | $\frac{(1 - A_{ij}) k_i}{\sum_n (1 - A_{in}) k_n}$ | $\sim k^{-\gamma}$ | [...] | [...] | [...] | [...]
Barabási-Albert with Euclidean distance | $\frac{k_i r_{ij}^{-\alpha_A}}{\sum_n k_n r_{in}^{-\alpha_A}}$ | $\sim e_q^{-k/\kappa}$ | - | 0.0019(2) | 4.7(1) | 0.034(7)
Fitness Model with Euclidean distance | $\frac{\eta_i k_i r_{ij}^{-\alpha_A}}{\sum_n \eta_n k_n r_{in}^{-\alpha_A}}$ | $\sim e_q^{-k/\kappa}$ | - | 0.0034(4) | 4.6(1) | -0.046(8)
Homophilic Model with Euclidean distance | $\frac{(1 - A_{ij}) k_i r_{ij}^{-\alpha_A}}{\sum_n (1 - A_{in}) k_n r_{in}^{-\alpha_A}}$ | $\sim e_q^{-k/\kappa}$ | - | 0.0020(2) | 4.7(1) | 0.028(7)

This characteristic was observed in real networks by Bianconi and Barabási, who later proposed an alternative model including the fitness factor $\eta_i$ of each node i [2].
The algorithm is similar to that of the BA network, but each site connects to an existing node of the network with a probability that, in addition to depending on the connectivity k, is also proportional to the attractiveness $\eta_i$. The value of $\eta_i \in [0, 1]$ is usually drawn from a uniform distribution $\rho(\eta_i)$ [2], and the connection probability is defined by

$$\Pi(k_i|j) = \frac{\eta_i k_i}{\sum_n \eta_n k_n}. \tag{7}$$

When the fitness parameter is imposed, the network keeps a power-law degree distribution but with an exponent smaller than 3 (see figure 4). According to the literature, γ = 2.25 in the thermodynamic limit. There are more privileged sites and, consequently, more hubs than in the BA network, which makes the exponent γ smaller, that is, the network is more heterogeneous. In the inset of figure 4 we show the degree correlation measured through the nearest-neighbors degree for the Fitness model. The behavior is similar to the one we found for the BA network. We also calculated the mean clustering coefficient $\langle C \rangle$, the average shortest path length $\langle d \rangle$, and the Pearson correlation coefficient $c_P$. This information is shown in Table I. We observed that, when compared to the BA model, this network presents a higher clustering coefficient and a lower Pearson correlation coefficient, but the average shortest path length hardly changes.

Figure 4: Distribution of connectivity for the Bianconi-Barabási model with m = 3, N = 10, based on 1000 network realizations. The graph is on a log-log scale. The dashed line, with slope $P(k) \sim k^{-2.25}$, is a guide for the eyes. This network has more privileged sites and, consequently, more hubs than the BA network, which explains its smaller value of γ. Inset: Degree correlation measured through the nearest-neighbors degree for the same set of networks. The behavior is similar to the one we found for the BA network.

C. Homophilic Model
In a social network, people tend to relate to others who share common characteristics such as musical taste, football team, religion, and work. To take this tendency into account in social network models, we can include a connection parameter called homophily.

Almeida and colleagues [3] proposed a model introducing this parameter through an intrinsic property of each node, called $\eta_i$, similar to the previous model. The homophily between any two nodes i and j, say $A_{ij}$, is defined as the modulus of the difference between $\eta_i$ and $\eta_j$, that is, $A_{ij} = |\eta_i - \eta_j|$. The lower $A_{ij}$ is, the greater the affinity between the two nodes and, consequently, the greater the probability that they connect to each other. The proposed algorithm is as follows:

• It starts with m sites connected to each other, in the same way as in the BA network, but introducing a characteristic $\eta_i$ for each node, chosen randomly from a uniform distribution $p(\eta)$ in the interval [0, 1].
• At each time step, add a node j that links to m other nodes already in the network. Each node j connects preferentially to a node i according to the probability

$$\Pi(k_i|j) = \frac{(1 - A_{ij}) k_i}{\sum_n (1 - A_{in}) k_n}. \tag{8}$$

• The procedure of the second item is repeated until the network reaches a previously established size N.

Figure 5: Distribution of connectivity for the Homophilic model network. Networks with size N = 10 and m = 3 were used, based on 1000 network realizations. The dashed line has slope $P(k) \sim k^{-2.75}$ as a guide for the eyes. Its γ = 2.75 is smaller than the γ = 3.0 of the BA network but larger than the γ = 2.25 of the Fitness model.

The competition between the degree of connectivity and the affinity between the nodes generates a network with a power-law degree distribution, but with γ ∼ 2.75, as we showed in figure 5. This value is smaller than the exponent obtained in the BA model (γ = 3.0) but greater than the value obtained in the Fitness network (γ = 2.25). [...] A node j can assume a value of $\eta_j = 0.5$, for example. When this happens, if it tries to connect to a node i that has $\eta_i = 0.3$ or to a node k that has $\eta_k = 0.7$, the affinities of both pairs ij and jk are the same. So, in this case, according to expression (8), what dictates the preference in the connection is the degree of the node [3].

In the inset of figure 5 we show the degree correlation measured through the nearest-neighbors degree for the Homophilic model. The behavior is similar to the one we found for the other networks. We also calculated the mean clustering coefficient $\langle C \rangle$, the average shortest path length $\langle d \rangle$, and the Pearson correlation coefficient $c_P$. This information is shown in Table I. We observed that, when compared to the BA model, this network presents almost the same $\langle d \rangle$ and $c_P$ but a slightly higher clustering coefficient.

D. Networks with Euclidean distance
The models presented above do not take into account the spatial distance between the agents that compose the network. But in many real systems, that variable plays an important role. For example, in the city model proposed by Ribeiro and collaborators [28], the authors observed how the Euclidean distance influences the potential of cities and the scaling laws of socio-economic and infrastructure indicators. Other works have shown the relation between social interaction and spatial properties [20, 21, 29]. For example, in [20], the authors analyzed online social networks and found that spatial distance restricts who interacts with whom and that more densely connected groups tend to arise at shorter spatial distances. In the following subsections we reconstructed the standard models shown previously, including the Euclidean distance between the nodes as an attachment ingredient.
1. Model proposed by Soares et al.

Soares and colleagues [4] built a model in which the preferential attachment dynamics depends on the degree of connectivity but also on the Euclidean distance between the nodes. To build the model, we consider that each site is placed on a continuous plane. The first node is added at an arbitrary distance from the origin and the others are isotropically distributed with a probability $P_G(r) \propto r^{-(2+\alpha_G)}$, which depends on the distance r from the center of mass, which is positioned at $\mathbf{r}_{cm}$ from the origin and is recalculated at each time step. The exponent $\alpha_G$ (G refers to the word growth) controls the network growth, that is, it defines how close or distant the nodes will be placed. The position of the center of mass is given by $\mathbf{r}_{cm} = \frac{1}{M} \sum_{n=1}^{N} m_n \mathbf{r}_n$, where $m_n$ is the mass of the node n, $\mathbf{r}_n$ is the position vector of this node relative to the origin, and $M = \sum_{n=1}^{N} m_n$ is the total mass. The network has a total of N nodes, and we can consider each node with mass $m_n = 1$, so we have $\mathbf{r}_{cm} = \frac{1}{N} \sum_{n=1}^{N} \mathbf{r}_n$. Each new site j connects to a pre-existing node i following a preferential attachment rule that depends on the distance between them, $r_{ij}$, and on the degree of connectivity of the node i, that is,

$$\Pi(k_i|j) = \frac{k_i r_{ij}^{-\alpha_A}}{\sum_n k_n r_{in}^{-\alpha_A}}. \tag{9}$$

The exponent $\alpha_A$ (A refers to the word attachment) controls the influence of the spatial distance between sites on the preferential attachment. If $\alpha_A = 0$, we recover the BA network, which does not take into account the spatial distance between the nodes. This model preserves the preferential attachment according to the degree of the nodes, but also takes into account the geographical distance as a criterion in the competition for links. In figure 6 we show an example of a network generated according to this algorithm.

Figure 6: An example of a network with size N = 20 generated according to the algorithm proposed by Soares et al. [4] using $\alpha_A = 2$ and $\alpha_G = 2$. Note that links between nearby nodes are more likely, but long-range connections can also happen.

Numerical results show that the parameter $\alpha_G$ does not affect the behavior of the connectivity distribution P(k) of the network (see figure 7(b)).
This parameter refers only to the distribution of distances from the center of mass; it acts on the size scale but not on the structure of the network, and consequently it does not affect the preferential attachment rules. On the other hand, as $\alpha_A$ increases, the connectivity distribution changes (see figure 7(a)). Soares et al. [4] showed that the degree distributions of networks generated according to their model are very well fitted by the form

$$P(k) = P(0)\, e_q^{-k/\kappa}, \tag{10}$$

where $\kappa > 0$, $P(0)$ is a constant fixed by normalization, q is the entropic index and $e_q^x$ is the q-exponential, defined by

$$e_q^x \equiv [1 + (1 - q)x]^{1/(1-q)}, \tag{11}$$

of which the natural exponential function is a particular case: $e^x = e^x_{q=1}$.

The authors [4] showed that both κ and q are functions of $\alpha_A$. So, as $\alpha_A$ increases, a topological phase transition occurs in the connectivity distribution [4, 30]. The network changes from a completely heterogeneous network ($\alpha_A = 0$) to an increasingly homogeneous network as $\alpha_A$ tends to infinity. This phase transition also appears in the degree correlation of the nodes, as we show in the calculation of $k_{nn}(k)$ for different values of $\alpha_A$ (see figure 8(a)). The transition is clearer in figure 8(b), in which we show Pearson's coefficient as a function of $\alpha_A$. Close to $\alpha_A = 2$, Pearson's coefficient changes from a negative value, which characterizes a disassortative network, to a positive value, which characterizes an assortative network.

Finally, two more pieces of evidence of the topological phase transition can be discussed. We can measure the level of heterogeneity of a network using the quantity $\kappa = \langle k^2 \rangle / \langle k \rangle$, where $\langle k^p \rangle$ is the p-th moment of the degree distribution. If $\kappa / \langle k \rangle >$ [...] $N \to \infty$, while for homogeneous networks $\kappa / \langle k \rangle \approx 1$. [...] $\alpha_A$ increases because $\kappa / \langle k \rangle$ decreases and approaches one. We also calculated the average shortest path length. When the network becomes more homogeneous, the average shortest path length increases because the number of hubs decreases and consequently the paths between the nodes become longer. This measure, also shown in figure 9, reinforces the topological phase transition.
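The q-exponential of Eq. (11), the fitting form of Eq. (10) and the heterogeneity ratio $\kappa / \langle k \rangle$ can be evaluated with the short sketch below. The function names are hypothetical, and the cut-off convention used when the bracket in Eq. (11) is not positive (return 0 for q < 1, raise an error otherwise) is an assumption of this sketch rather than something stated in the text.

```python
import math


def q_exponential(x, q):
    """e_q^x = [1 + (1 - q) x]^(1 / (1 - q)); equals exp(x) at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        if q < 1.0:
            return 0.0          # assumed cut-off convention for q < 1
        raise ValueError("e_q^x is not defined for this x when q > 1")
    return base ** (1.0 / (1.0 - q))


def degree_pdf(k, p0, kappa, q):
    """P(k) = P(0) e_q^(-k / kappa), the form of Eq. (10)."""
    return p0 * q_exponential(-k / kappa, q)


def heterogeneity(degree_list):
    """Return (<k^2> / <k>) / <k> for a list of node degrees."""
    n = len(degree_list)
    k1 = sum(degree_list) / n
    k2 = sum(d * d for d in degree_list) / n
    return (k2 / k1) / k1
```

For example, `q_exponential(-1.0, 2.0)` evaluates $[1 + (-1)(-1)]^{-1} = 0.5$, and `heterogeneity([3, 3, 3, 3])` returns exactly 1, the homogeneous limit mentioned in the text.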
2. Variations of the model proposed by Soares et al.

It is possible to include the Euclidean distance in the homophilic and the fitness models, as investigated by Nunes and collaborators [30]. For example, when we study social interaction in a city [28], the parameter $\eta_i$ can represent the influence of different places located in the city. So it is possible to use the fitness model with Euclidean distance to try to explain, for example, why some places in a city are more attractive than others for opening a store, coffee shop or gas station. We can also use the homophilic model including spatial distance to study the influence of the topology on the formation of neighborhoods, since people tend to cluster with people who have a similar social class, religion or workplace [17, 31, 32].

Figure 7: (a) Distribution of connectivity for different values of $\alpha_A$ and a fixed $\alpha_G = 2$. By varying $\alpha_A$, a topological transition appears close to $\alpha_A = 2$. For $\alpha_A = 0$, the BA network is recovered, meaning that the spatial distance between nodes is not taken into account. The network changes from a completely heterogeneous network ($\alpha_A = 0$) to an increasingly homogeneous network as $\alpha_A$ tends to infinity. (b) Distribution of connectivity for different values of $\alpha_G$ and a fixed $\alpha_A = 2$. By varying $\alpha_G$ the distribution does not change significantly. Results obtained with networks of size N = 10, averaged over 1000 samples.

Figure 8: Degree correlation for a network of size N = 10 and different values of $\alpha_A$. In (a) we show $k_{nn}$ as a function of k, highlighting the change from a weakly disassortative regime to an assortative one. In (b), Pearson's coefficient as a function of $\alpha_A$. Close to $\alpha_A = 2$, Pearson's coefficient changes from a negative value, which characterizes a disassortative network, to a positive value, which characterizes an assortative network. In both (a) and (b) the average was made over 1000 samples.

The algorithms used to construct both models were already shown in previous sections. Now, we just have to include the metric, using the function $P_G \sim r^{-(\alpha_G + 2)}$ to distribute the nodes on a continuous plane, and change the preferential attachment rules, which become

$$\Pi(k_i|j) = \frac{\eta_i k_i r_{ij}^{-\alpha_A}}{\sum_n \eta_n k_n r_{in}^{-\alpha_A}} \tag{12}$$

and

$$\Pi(k_i|j) = \frac{(1 - A_{ij}) k_i r_{ij}^{-\alpha_A}}{\sum_n (1 - A_{in}) k_n r_{in}^{-\alpha_A}}, \tag{13}$$

for the fitness and homophilic models, respectively.

Nunes [30] also showed a topological phase transition as $\alpha_A$ increases for the fitness model, and no influence of the parameter $\alpha_G$ on the pattern of the connectivity distribution. We obtained the same results for homophilic networks. The data are not shown because they are very similar to the results shown in figure 7.

Figure 9: Average shortest path length and $\kappa / \langle k \rangle$ as functions of $\alpha_A$.
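The attachment rules of Eqs. (9), (12) and (13) differ only in the intrinsic factor that multiplies $k_i r^{-\alpha_A}$, so they can be sketched with a single function. The code is illustrative only: the names `attachment_probabilities` and `sample_radius`, the dictionary layout of a node, and the choice of a minimum distance r = 1 (needed to normalize a density proportional to $r^{-(2+\alpha_G)}$, read here as the density of the distance r) are assumptions of this sketch, not details given in the text.

```python
import math
import random


def sample_radius(alpha_g, rng):
    """Draw a distance r >= 1 with density proportional to r^-(2 + alpha_g).

    Assumptions: minimum distance 1 and alpha_g > -1. Uses the inverse of
    the cumulative distribution F(r) = 1 - r^-(1 + alpha_g).
    """
    u = rng.random()                      # uniform in [0, 1)
    return (1.0 - u) ** (-1.0 / (1.0 + alpha_g))


def attachment_probabilities(new_pos, new_eta, nodes, alpha_a=0.0,
                             rule="degree"):
    """Return Pi(k_i | j) for every existing node i, given a new node j.

    nodes: list of dicts with keys "k" (degree), "eta" (value in [0, 1])
    and "pos" ((x, y) coordinates, assumed different from new_pos).
    rule: "degree" (Eq. 9), "fitness" (Eq. 12) or "homophily" (Eq. 13).
    """
    weights = []
    for node in nodes:
        if rule == "degree":
            intrinsic = 1.0
        elif rule == "fitness":
            intrinsic = node["eta"]
        elif rule == "homophily":
            intrinsic = 1.0 - abs(node["eta"] - new_eta)   # 1 - A_ij
        else:
            raise ValueError(f"unknown rule: {rule}")
        r = math.dist(new_pos, node["pos"])
        weights.append(intrinsic * node["k"] * r ** (-alpha_a))
    total = sum(weights)
    return [w / total for w in weights]
```

With `alpha_a=0.0` the distance factor equals 1 and the three rules reduce to Eqs. (1), (7) and (8). For a new node with η = 0.5 and two candidates with η = 0.3 and η = 0.7, the `"homophily"` rule gives both the same affinity, so the probabilities are decided by the degrees alone, as in the example discussed for the Homophilic model.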
Most social networks are assortative, while technological ones tend to be more disassortative [6, 9]. To support this evidence and show that the models studied in this article are successful in modeling real systems, we chose three real networks and investigated two distinct properties: degree distribution and assortativity. The networks are:

• Phone calls: nodes represent cell phone users, and an edge exists between two users if they have called each other at least once during the investigated period. Data are from [11].
• Collaboration network: each node represents an author in a scientific collaboration, and an edge between two authors means that they co-authored at least one paper in the period from January 1993 to April 2003. The data were obtained from the Condensed Matter Physics section of the arXiv preprint server [13].
• Email: nodes are email addresses, and a directed link from one node to another represents at least one email sent. The data were collected during 112 days at the University of Kiel (Germany) [12].

According to Table II, the Pearson coefficients of the phone call and collaboration networks are positive, while for the email network this coefficient is negative. The first two are social networks, basically related to family/friendship and professional interactions, respectively. In reference [9], Newman found similar results for biology and mathematics coauthorship networks. However, the email network, although it also describes some social interaction, behaves more like a technological network. In the reference cited above, the author also found a similar value for the World Wide Web.
Table II: Size N and Pearson's correlation coefficient $c_P$ for different real networks. We compared the values with the Pearson coefficient calculated for synthetic networks of the same size. For the phone call network, we used the homophilic network including Euclidean distance ($\alpha_A = 5$). For the collaboration network, we used the BA network including Euclidean distance ($\alpha_A = 5$) and, finally, for the email network, we used the fitness network including Euclidean distance ($\alpha_A = 1$).

Real Network | N | $c_P$ | $c_P$ (synthetic network)
Phone Calls | 36594 | 0.282 | 0.120
Collaboration | 23132 | 0.134 | 0.112
Email | 57194 | -0.075 | -0.078

Now we can compare these real systems with the investigated models. In the case of the phone call network, we used the homophilic model and investigated how the Euclidean distance between the nodes of the system affects the network's degree distribution. The homophilic model was chosen because it is reasonable to assume that telephone calls happen between people who have a certain affinity with each other, whether for personal, family or professional reasons. This hypothesis is corroborated by recent works [16, 17].

The Pearson correlation coefficient of the investigated synthetic network is not very similar to the value obtained for the real network (see Table II). However, the degree distribution of this synthetic network is very accurate when compared to the real one. As shown in figure 10, when we used the attachment parameter $\alpha_A = 5$, the fit works extremely well, emphasizing the importance of considering the geographic distance between the elements of the system when modeling real social networks. Indeed, many works have followed this line [20, 21, 29].

The same analysis can be done for the collaboration network. The Pearson correlation coefficient of the synthetic network is similar to the one calculated for the real system. In this study, the only change was the synthetic network investigated.
We chosed the traditionalBarab`asi-Albert model but also including the Euclideandistance and, as we showed in figure 11, the fit using α A = 5 is accurate as well. In networks of scientific co-authorship more distinguished researchers, such as uni-versity professors, tend to publish works with less famousresearchers, such as their graduate students. This sup-port both assumptions: the BA preferential attachmentrule according to the degree of the node and the influ-ence of the distance between the elements of the system.For the collaboration network, the fitness and homophilicmodels also showed reasonable results. As long as we in- -5 -4 -3 -2 -1 P ( k ) k10 -5 -4 -3 -2 -1 P ( k ) k10 k10 α A = 0 α A = 1 α A = 2 α A = 3 α A = 4 α A = 5 Figure 10: Degree distribution of a phone calls network com-pared with the distinct degree distributions of synthetic net-works with the same size generated according to the ho-mophilic model including Euclidean distance. Black pointsrepresent real network data and solid colored lines are relatedto synthetic networks. The synthetic network with α A = 5fits better the data. creased the value of α A , the preferential connection ruleinvolving the Euclidean distance prevails in relation tothe others. -5 -4 -3 -2 -1 P ( k ) k10 -5 -4 -3 -2 -1 P ( k ) k10 k10 α A = 0 α A = 1 α A = 2 α A = 3 α A = 4 α A = 5 Figure 11: Degree distribution of a collaboration networkcompared with the distinct degree distributions of syntheticnetworks with the same size generated according to theBarab`asi-Albert model including Euclidean distance. We ob-serve that the last scenario ( α A = 5) fits better the real data.Black points represent real network data and solid coloredlines are related to synthetic networks. Finally, the email network presents a very similar Pear-son correlation coefficient with compared to the fitnesssynthetic network considering the euclidean distante.But here the parameter α A = 1 fits better the degreedistribution of real data. 
This indicates a smaller impact of the geographical distance between nodes in technological networks than in social ones. It may also be related to the fact that this real network has directed links. As this email network was obtained from a university, the fitness model was chosen based on the fact that, in academia, students tend to send more emails to teachers than the other way around. So whether a message is sent depends on how influential (greater fitness) the receiver is.

Figure 12: Degree distribution of an email network compared with the distinct degree distributions of synthetic networks with the same size generated according to the fitness model including Euclidean distance. We observe that the second scenario (α_A = 1) fits the real data best. Black points represent real network data and solid colored lines are related to synthetic networks. (Axes: P(k) versus k; curves for α_A = 0, 1, 2, 3, 4, 5.)

IV. CONCLUSIONS
In this work we have studied network models with growth and different rules of preferential attachment. We reviewed some important algorithms, such as the Barabási-Albert model and others that include fitness, homophily and/or Euclidean distance as strategies to make connections between nodes. From an applied perspective, these models are useful to model real-world networks because they present characteristics found in social systems and also in technological ones, as we showed in the last section.

Our results corroborate the evidence that a pure power-law degree distribution is not very common in real systems. We evaluated two social networks and one technological network, and we compared the degree distributions of these networks to degree distributions generated by growth and preferential attachment models. Our main conclusion is that the real networks analysed are better fitted by models that consider traits such as fitness, homophily and Euclidean distance between nodes. We observed that the geographic distance between nodes seems to be an important factor, especially for modeling real social systems. This feature changes the form of the degree distribution from a power law to a q-exponential, according to the model proposed by Soares and collaborators [4]. Our results are in agreement with recent studies involving real networks [7, 11–15, 20, 21, 29, 33].

We also supplemented the characterization of these synthetic networks by investigating measures such as clustering, average shortest path length, degree distribution and assortativity.

Finally, it is important to mention that many dynamical processes, such as epidemics, rumor propagation and synchronization, have been extensively investigated on scale-free topologies such as the Barabási-Albert network. However, the study of these dynamics on substrates where the distance between the elements of the system is taken into account needs to advance further, since this element has already been shown to be very important.
Even in online social networks [16, 17, 19–21, 29], it seems to influence the connections between nodes, as do fitness and homophily.
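
To make the role of the attachment parameter more concrete, the following is a minimal sketch of a growth process with degree- and distance-dependent preferential attachment, together with the two measures compared in Table II and Figures 10 to 12 (degree distribution and Pearson degree correlation). It is an illustration under stated assumptions, not the code used in this work: we assume that a new node j attaches to an existing node i with probability proportional to k_i / r_ij^α_A, in the spirit of the model of Ref. [4]; node positions are drawn uniformly in the unit square; each new node brings a single edge; fitness and homophily are left out. The function names grow_network, degree_distribution and degree_assortativity are hypothetical.

```python
import math
import random
from collections import Counter


def grow_network(n_nodes, alpha_a, seed=0):
    """Grow a network node by node, one new edge per new node.

    Assumed rule (illustrative): new node j links to existing node i with
    probability proportional to k_i / r_ij**alpha_a, where k_i is the degree
    of i and r_ij the Euclidean distance. alpha_a = 0 gives degree-only
    preferential attachment.
    """
    rng = random.Random(seed)
    # Assumption: positions are uniform in the unit square.
    pos = [(rng.random(), rng.random()) for _ in range(n_nodes)]
    degree = [0] * n_nodes
    edges = [(0, 1)]  # seed network: a single edge between nodes 0 and 1
    degree[0] = degree[1] = 1
    for j in range(2, n_nodes):
        weights = []
        for i in range(j):
            r = max(math.dist(pos[i], pos[j]), 1e-9)  # avoid division by zero
            weights.append(degree[i] / r ** alpha_a)
        target = rng.choices(range(j), weights=weights, k=1)[0]
        edges.append((target, j))
        degree[target] += 1
        degree[j] += 1
    return edges, degree


def degree_distribution(degree):
    """Return P(k) as a dict mapping degree k to the fraction of nodes."""
    counts = Counter(degree)
    n = len(degree)
    return {k: c / n for k, c in sorted(counts.items())}


def degree_assortativity(edges, degree):
    """Pearson correlation between the degrees at the two ends of an edge."""
    xs, ys = [], []
    for u, v in edges:
        # Count each undirected edge in both directions (symmetric measure).
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    m = len(xs)
    mean = sum(xs) / m
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / m
    var = sum((x - mean) ** 2 for x in xs) / m
    return cov / var


if __name__ == "__main__":
    for alpha_a in (0, 1, 5):
        edges, degree = grow_network(500, alpha_a, seed=1)
        r = degree_assortativity(edges, degree)
        print(alpha_a, max(degree), round(r, 3))
```

Sweeping alpha_a as in the usage loop gives a rough way to see how distance changes the largest degree and the degree correlation. The weight computation is quadratic in the number of nodes, so networks of the sizes in Table II would need a more efficient sampling scheme.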
Acknowledgments
This work was partially supported by the Brazilian agencies CAPES, CNPq and FAPEMIG. The authors acknowledge computational time at DFI-UFLA. Angélica S. Mata thanks the support from FAPEMIG (Grant No. APQ-02482-18) and CNPq (Grant No. 423185/2018-7). Fabiano L. Ribeiro thanks the support from CNPq (405921/2016-0) and CAPES (88881.119533/2016-01).

[1] A.-L. Barabási and R. Albert, Science , 509 (1999).
[2] G. Bianconi and A.-L. Barabási, EPL (Europhysics Letters) , 436 (2001).
[3] M. L. de Almeida et al., The European Physical Journal B - Condensed Matter and Complex Systems , 38 (2013).
[4] D. J. Soares, C. Tsallis, A. M. Mariz, and L. R. da Silva, EPL (Europhysics Letters) , 70 (2005).
[5] F. Radicchi, Nature Physics , 597 (2015).
[6] M. E. J. Newman and J. Park, Phys. Rev. E , 036122 (2003), URL https://link.aps.org/doi/10.1103/PhysRevE.68.036122.
[7] C. Radicchi, Filippo and Claudio, Nature Communications , 10196 (2015), URL https://doi.org/10.1038/ncomms10196.
[8] G. Caldarelli, Scale-free networks: complex webs in nature and technology (Oxford University Press, 2007).
[9] M. E. J. Newman, Phys. Rev. Lett. , 208701 (2002), URL https://link.aps.org/doi/10.1103/PhysRevLett.89.208701.
[10] A. Barabási and M. Pósfai, Network Science (Cambridge University Press, 2016), ISBN 9781107076266, URL https://books.google.com.br/books?id=iLtGDQAAQBAJ.
[11] C. Song, Z. Qu, N. Blumm, and A.-L. Barabási, Science , 1018 (2010), ISSN 0036-8075, https://science.sciencemag.org/content/327/5968/1018.full.pdf, URL https://science.sciencemag.org/content/327/5968/1018.
[12] H. Ebel, L.-I. Mielsch, and S. Bornholdt, Phys. Rev. E , 035103 (2002), URL https://link.aps.org/doi/10.1103/PhysRevE.66.035103.
[13] J. Leskovec, J. Kleinberg, and C. Faloutsos, ACM Trans. Knowl. Discov. Data , 2–es (2007), ISSN 1556-4681, URL https://doi.org/10.1145/1217299.1217301.
[14] A. D. Broido and A. Clauset, Nature Communications , 1 (2019).
[15] P. Holme, Nature Communications , 1 (2019).
[16] S. Currarini, J. Matheson, and F. Vega-Redondo, European Economic Review , 18 (2016).
[17] H. Bisgin, N. Agarwal, and X. Xu, in Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on (IEEE, 2010), vol. 1, pp. 533–536.
[18] A. F. Rozenfeld, R. Cohen, D. ben Avraham, and S. Havlin, Phys. Rev. Lett. , 218701 (2002), URL https://link.aps.org/doi/10.1103/PhysRevLett.89.218701.
[19] B. A. C. H. L. W. Y. Q. X. L. X. Liu, L.; Chen, ISPRS Int. J. Geo-Inf (2018).
[20] V. Y. S. S. M. C. Laniado, D. and A. Kaltenbrunner, Information Systems Frontiers (2017), URL https://doi.org/10.1007/s10796-017-9784-9.
[21] B. Lengyel, A. Varga, B. Ságvári, Á. Jakobi, and J. Kertész, PLoS ONE (2015).
[22] B. Bollobás and O. M. Riordan, Handbook of graphs and networks: from the genome to the internet pp. 1–34 (2003).
[23] S. N. Dorogovtsev and J. F. Mendes, Advances in Physics , 1079 (2002).
[24] D. J. Watts and S. H. Strogatz, Nature , 440 (1998).
[25] S. N. Dorogovtsev and J. F. F. Mendes, CoRR cond-mat/0404593 (2004), URL http://arxiv.org/abs/cond-mat/0404593.
[26] R. Pastor-Satorras, A. Vázquez, and A. Vespignani, Physical Review Letters , 258701 (2001).
[27] A. Barrat, M. Barthelemy, and A. Vespignani, Dynamical processes on complex networks (Cambridge University Press, 2008).
[28] F. L. Ribeiro, J. Meirelles, F. F. Ferreira, and C. R. Neto, Royal Society Open Science , 160926 (2017).
[29] X. Liu, Y. Xu, and X. Ye, Outlook and Next Steps: Integrating Social Network and Spatial Analyses for Urban Research in the New Data Environment (Springer International Publishing, Cham, 2019), pp. 227–238, ISBN 978-3-319-95351-9, URL https://doi.org/10.1007/978-3-319-95351-9_13.
[30] T. C. Nunes, S. Brito, L. R. da Silva, and C. Tsallis, Journal of Statistical Mechanics: Theory and Experiment , 093402 (2017).
[31] Vinicius M. Netto, Joao Meirelles and F. L. Ribeiro, Complexity (2017).
[32] V. M. Netto, E. Brigatti, J. Meirelles, F. L. Ribeiro, B. Pace, C. Cacholas, and P. Sanches, Entropy , 1 (2018), ISSN 10994300.
[33] F. F. L. Ribeiro, Revista Brasileira de Ensino de Fisica 39