Networks with Growth and Preferential Attachment: Modeling and Applications

Abstract

In this article we presented a brief study of the main network models with growth and preferential attachment. Such models are interesting because they reproduce several characteristics of real systems. We started with the classical model proposed by Barabási and Albert [1], in which nodes added to the network connect preferentially to nodes that are already well connected. We also presented models that consider more representative elements from a social perspective, such as the homophily between vertices or the fitness of each node to build connections [2, 3]. Furthermore, we showed a version of these models that includes the Euclidean distance between nodes in the preferential attachment rule [4]. Our objective is to investigate basic properties of these networks, such as the degree distribution, degree correlation, shortest path length, and clustering coefficient, and how these characteristics are affected by the preferential attachment rules. Finally, we compared these synthetic networks with real ones. We found that homophily, fitness, and geographic distance are significant preferential attachment rules for modeling real networks. These rules can change the form of the degree distribution of the synthetic models and make them more suitable for modeling real networks.


Gabriel G. Piva,1, 2, ∗, † Fabiano L. Ribeiro,‡, § and Angélica S. Mata¶, ∗∗

1 Departamento de Física, Universidade Federal de Lavras, 37200-900 Lavras, MG, Brazil
2 Departamento de Física, Pontifícia Universidade Católica do Rio de Janeiro, 22451-900 Rio de Janeiro, RJ, Brazil


I. INTRODUCTION

Complex systems have become a widely applied area of research because much of what surrounds us can be described by a complex network, including social, technological, and biological systems. Growth and preferential attachment, in which a node has a higher probability of connecting to another node that already has many edges, are well-known ingredients [1] for producing a power-law degree distribution, a topology frequently used to describe real systems.

In general, it has been shown that real networks present a power-law degree distribution with 2 < γ < [...] that real networks, ruled by growth and preferential attachment, have a power-law degree distribution with an exponential cutoff.

In this paper, we investigated social and technological real networks and found that they can be modeled by networks with growth and preferential attachment. To account for more realistic aspects, we considered other ingredients in the preferential attachment, such as fitness [2], homophily [3], and the Euclidean distance between nodes [4]. Indeed, social systems often present these kinds of connection features [16, 17], and real-world systems in general are often embedded in Euclidean space [18–21]. We investigated phone call [11], collaboration [13], and email [12] networks. The first two are social networks because they describe family, friendship, and/or professional interactions, while the email network behaves as a technological network. We also found that the email network presents a more "scale-free" behavior in its degree distribution, while the social networks are better described by a q-exponential degree distribution, according to the model proposed by Soares and collaborators [4].

The paper is organized as follows. Section II gives a detailed description of network models with growth and different preferential attachment rules, where we also studied properties of these networks such as the degree distribution and assortativity. The main information and results about the networks are summarized in Table I. In Section III, we compared these synthetic networks with real ones. Finally, we presented our final considerations in Section IV.

∗ https://orcid.org/0000-0001-7636-6568
† Electronic address: [email protected]
‡ https://orcid.org/0000-0002-2719-6061
§ Electronic address: fribeiro@ufla.br
¶ https://orcid.org/0000-0002-3892-5274
∗∗ Electronic address: angelica.mata@ufla.br

II. NETWORK MODELS WITH GROWTH AND PREFERENTIAL ATTACHMENT

A network model has properties similar to those of real systems. Networks are a powerful tool for representing patterns of connections between the parts of systems such as the Internet, power grids, food webs, and social networks [8, 22, 23]. Some particular metrics, such as the degree distribution, shortest path length, and clustering coefficient, have attracted the attention of the physics community.

Watts and Strogatz [24] showed that real networks are characterized by a small average shortest path distance between two vertices and a large clustering coefficient, and described these properties with the small-world model.

Based on that, Barabási and Albert [1] proposed two basic mechanisms that try to better characterize a real network: growth of the system by the addition of new agents, and preferential attachment, in which a new agent connects preferentially to the most connected nodes already in the network. The web, for instance, expands through the addition of new documents that link to older or well-known sites. The probability that a new node will connect to a node with $k$ links is proportional to $k$, independently of geographic distance.

However, there are other examples of real networks whose connectivity may depend on the geographic distance between the nodes, such as a power grid. In addition to geographic distance, there may be other relevant ingredients to consider when connecting the elements of the system. Social interactions between people have intrinsic characteristics that should be taken into account, for example the influence one person has on another and the affinity between them, representing friendship, family, or professional ties.

To model these features, several networks have been studied over the years. Below we presented some of them, which consider preferential attachment rules according to the degree (Barabási-Albert model [1]), the fitness of the node to make connections [2], the homophily between nodes [3], and finally the Euclidean distance between the nodes [4].

A. Barabási-Albert Network

To explain in a simple way the behavior of technological networks, such as the Internet, Barabási and Albert [1] proposed the following model:

• The system starts with $m$ nodes connected to each other.

• At each time step, a new node $j$ is added to the network and connects to a node $i$ chosen at random with probability $\Pi(k_i \mid j)$ proportional to its degree $k_i$, that is,

$$\Pi(k_i \mid j) = \frac{k_i}{\sum_n k_n}, \qquad (1)$$

where the normalization $\sum_n k_n$ is the sum over the degrees $k_n$ of all nodes $n$ already in the network.

These rules define what is known as the Barabási-Albert (BA) model. They generate a network whose connectivity distribution $P(k)$ follows a power law of the form $P(k) \sim k^{-\gamma}$, with $\gamma = 3.0$ in the thermodynamic limit, independently of the value of $m$, as shown in Figure 1.

Figure 1: Distribution of the connectivity degree $P(k)$ of the BA network (axes: $k$, $P(k)$). Dots are the average over 10 networks of size N = 10 and m = 3. The dashed line has a slope $P(k) \sim k^{-3}$ and serves as a guide for the eyes.

We can also calculate the clustering coefficient $\langle C \rangle$ of the BA network [25]. It measures the tendency of the network to form fully connected subgraphs in the neighborhood of a given vertex, and scales with the network size $N$ as

$$\langle C \rangle \sim \frac{[\ln(N)]^2}{N}. \qquad (2)$$

We showed this behavior in Figure 2. The simulation data follow the same trend as Eq. (2).

Figure 2: Clustering coefficient as a function of the network size for the BA network (axes: $N$, $\langle C \rangle$). The average was taken over 100 samples. The dots on the dashed line represent the theoretical value calculated from Eq. (2) and the dots on the continuous line were obtained from simulations.
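The growth rule of Eq. (1) can be illustrated with a short Python sketch. It is a minimal example and not the code used to produce the figures: the function names `barabasi_albert` and `degree_sequence` are hypothetical, and the sketch assumes that the $m$ seed nodes form a complete graph and that each new node attaches to $m$ distinct targets.

```python
import random
from collections import Counter


def barabasi_albert(n_nodes, m, seed=None):
    """Grow a BA-style network and return its edge list.

    Assumptions of this sketch: the m seed nodes form a complete graph
    and every new node attaches to m distinct existing nodes.
    """
    if m < 2 or n_nodes < m:
        raise ValueError("this sketch needs 2 <= m <= n_nodes")
    rng = random.Random(seed)
    # Seed: m nodes connected to each other.
    edges = [(a, b) for a in range(m) for b in range(a + 1, m)]
    # Each node appears in `stubs` once per edge end, so a uniform draw
    # from `stubs` selects node i with probability k_i / sum_n k_n (Eq. 1).
    stubs = [node for edge in edges for node in edge]
    for new in range(m, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for target in sorted(targets):
            edges.append((new, target))
            stubs.extend((new, target))
    return edges


def degree_sequence(edges):
    """Return a Counter mapping each node to its degree."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


if __name__ == "__main__":
    edges = barabasi_albert(1000, 3, seed=42)
    degrees = degree_sequence(edges)
    print("nodes:", len(degrees), "edges:", len(edges))
    print("largest degree:", max(degrees.values()))
```

Keeping a list of edge ends avoids recomputing the normalization $\sum_n k_n$ at every step, which is a common design choice for this kind of simulation.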

Another important measure of networks is the shortest path length. The distance between any two nodes $i$ and $j$, named $d_{ij}$, is defined as the number of links in the shortest path that connects them. The average over the shortest paths linking all possible pairs of vertices in the network is called the average shortest path length $\langle d \rangle$ [25]. For the BA network it is given by $\langle d \rangle \sim \log N / \log(\log N)$, confirming its small-world property [22].

Another feature that should be analysed is the degree correlation. The nodes of a network can present a tendency to connect with other nodes that have a similar or dissimilar degree. In the first case the network is said to be assortatively correlated, and in the second case disassortatively correlated [9].

The simplest and most used way to quantify the degree correlation is the average degree of the nearest neighbors (nn) of a vertex $i$ with degree $k_i$ [23],

$$k_{nn,i} = \frac{1}{k_i} \sum_{j \in \mathcal{N}(i)} k_j, \qquad (3)$$

where the sum runs over the nearest-neighbor vertices of $i$, represented by the set $\mathcal{N}(i)$. The degree correlation is obtained from the average degree of the nearest neighbors, $k_{nn}(k)$, for vertices of degree $k$ [26], that is,

$$k_{nn}(k) = \frac{1}{N_k} \sum_{i \mid k_i = k} k_{nn,i}, \qquad (4)$$

where $N_k$ is the number of nodes of degree $k$ and the sum runs over all vertices with the same degree $k$. This quantity is related to the correlations between the degrees of connected nodes because, on average, it can be expressed as

$$k_{nn}(k) = \sum_{k'} k' P(k' \mid k), \qquad (5)$$

where $P(k' \mid k)$ is the probability that a node with degree $k$ has a neighbor with degree $k'$. If the degrees of neighboring vertices are uncorrelated, $P(k' \mid k)$ is a function of $k'$ only and $k_{nn}(k)$ is a constant. If $k_{nn}$ increases with $k$, vertices with high degrees have a larger likelihood of being connected to each other. If $k_{nn}$ decreases with $k$, high-degree vertices have a larger probability of having neighbors with low degrees [26, 27].

The BA network is weakly disassortative, as we showed in Figure 3. We observe that the preferential attachment affects only the connectivity of nodes recently added to the network. According to the rule of the model, these nodes connect preferentially with hubs, creating a disassortative correlation for small values of $k$. But as the degree grows, the network becomes almost uncorrelated.

Figure 3: Degree correlation measured through the nearest-neighbors degree (axes: $k$, $k_{nn}(k)$). A BA network with size N = 10 , averaged over 10 samples, was used. The preferential attachment rule of the BA network affects only the connectivity of nodes recently added to the network, the ones with small $k$. They connect primarily with hubs, creating a disassortative correlation for small values of $k$. As the degree grows, the network becomes almost uncorrelated.

We can also use the Pearson coefficient, named $c_P$, to quantify degree correlations, according to the expression [27]

$$c_P = \frac{\sum_e j_e k_e / E - \left[ \sum_e (j_e + k_e) / (2E) \right]^2}{\left[ \sum_e (j_e^2 + k_e^2) / (2E) \right] - \left[ \sum_e (j_e + k_e) / (2E) \right]^2}, \qquad (6)$$

where $j_e$ and $k_e$ are the degrees of the nodes at the two ends of the edge $e$, and $E$ is the total number of connections. This quantity ranges from −1 to 1 [...] $k_{nn}(k)$ measure. While the latter shows how the degree correlation varies with $k$, the Pearson coefficient $c_P$ quantifies the degree correlation of the entire network on a scale ranging from −1 to 1. This measure was also used to complement the characterization of a topological phase transition in the growth and preferential attachment model that considers the Euclidean distance between the nodes, as we will see in Section II D. In addition, it will be useful for comparing the synthetic networks with real ones in Section III.

All the main information about the BA network is summarized in Table I, together with the information about the other networks treated in this paper. In general, real networks present a power-law degree distribution with 2 < γ < [...] γ ≈ 3. Next, we show other features that can be added to the model to make it more realistic.
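Equations (3), (4) and (6) translate directly into code. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, the input is an undirected edge list without repeated edges or self-loops, and the network is not regular (for a regular network the denominator of Eq. (6) vanishes).

```python
from collections import defaultdict


def neighbor_lists(edges):
    """Build node -> list of neighbors from an undirected edge list."""
    adjacency = defaultdict(list)
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def knn_by_degree(edges):
    """Return {k: k_nn(k)} following Eqs. (3) and (4)."""
    adjacency = neighbor_lists(edges)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, neighbors in adjacency.items():
        k_i = len(neighbors)
        # Eq. (3): mean degree of the neighbors of this node.
        knn_i = sum(len(adjacency[j]) for j in neighbors) / k_i
        sums[k_i] += knn_i
        counts[k_i] += 1
    # Eq. (4): average of k_nn,i over the N_k nodes of degree k.
    return {k: sums[k] / counts[k] for k in sums}


def pearson_coefficient(edges):
    """Return the degree-degree Pearson coefficient c_P of Eq. (6)."""
    adjacency = neighbor_lists(edges)
    n_edges = len(edges)
    ends = [(len(adjacency[a]), len(adjacency[b])) for a, b in edges]
    mean_product = sum(j * k for j, k in ends) / n_edges
    mean_degree = sum(j + k for j, k in ends) / (2 * n_edges)
    mean_square = sum(j * j + k * k for j, k in ends) / (2 * n_edges)
    return (mean_product - mean_degree ** 2) / (mean_square - mean_degree ** 2)


if __name__ == "__main__":
    # A star is the extreme disassortative case: one hub, four leaves.
    star = [(0, 1), (0, 2), (0, 3), (0, 4)]
    print(knn_by_degree(star))
    print(pearson_coefficient(star))
```

The star graph is used as a sanity check because every edge joins the highest-degree node to a lowest-degree node, so the coefficient sits at the disassortative end of the scale.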

B. Fitness Model: Bianconi-Barabási Network

The original BA model produces a power-law network with sites that become privileged, that is, that accumulate more connections over time. But this model does not take into account competitiveness, that is, the ability of younger nodes to acquire new neighbors. Facebook, for example, became one of the most visited sites in a short period of time when compared to Google, an older search website. Another example is the growth of corporations, where some newer ones concentrate more services than older ones. We can simulate this situation by including an intrinsic characteristic in each node, called fitness. In social networks, fitness would represent an individual's capacity to become more popular due to some personal quality. In networks, it represents the probability that a node becomes a hub quickly.

Table I: Main information about the networks investigated in this work. The mean clustering coefficient $\langle C \rangle$ and the Pearson correlation coefficient $c_P$ were obtained for a sample of 1000 networks with size N = 10 . The average shortest path length $\langle d \rangle$ was obtained for a sample of at least 20 networks of the same size. For networks with Euclidean distance we always consider $\alpha_A = 3$, since the topological phase transition occurs for $\alpha_A \approx 2$. The preferential attachment rule is shown in the column $\Pi(k_i \mid j)$. $P(k)$ is the form of the degree distribution of each model, and $\gamma$ is the exponent of the power-law degree distribution that characterizes the first three networks. Cells marked [...] could not be recovered.

Network | $\Pi(k_i \mid j)$ | $P(k)$ | $\gamma$ | $\langle C \rangle$ | $\langle d \rangle$ | $c_P$
Barabási-Albert | $k_i / \sum_n k_n$ | $\sim k^{-\gamma}$ | [...] | [...] | [...] | [...]
Fitness Model | $\eta_i k_i / \sum_n \eta_n k_n$ | $\sim k^{-\gamma}$ | [...] | [...] | [...] | [...]
Homophilic Model | $(1 - A_{ij}) k_i / \sum_n (1 - A_{in}) k_n$ | $\sim k^{-\gamma}$ | [...] | [...] | [...] | [...]
Barabási-Albert with Euclidean distance | $k_i r_{ij}^{-\alpha_A} / \sum_n k_n r_{in}^{-\alpha_A}$ | $\sim e_q^{-k/\kappa}$ | - | 0.0019(2) | 4.7(1) | 0.034(7)
Fitness Model with Euclidean distance | $\eta_i k_i r_{ij}^{-\alpha_A} / \sum_n \eta_n k_n r_{in}^{-\alpha_A}$ | $\sim e_q^{-k/\kappa}$ | - | 0.0034(4) | 4.6(1) | -0.046(8)
Homophilic Model with Euclidean distance | $(1 - A_{ij}) k_i r_{ij}^{-\alpha_A} / \sum_n (1 - A_{in}) k_n r_{in}^{-\alpha_A}$ | $\sim e_q^{-k/\kappa}$ | - | 0.0020(2) | 4.7(1) | 0.028(7)

This characteristic was observed in real networks by Bianconi and Barabási, who later proposed an alternative model including a fitness factor $\eta_i$ for each node $i$ [2]. The algorithm is similar to that of the BA network, but each new site connects to an existing node with a probability that, in addition to depending on the connectivity $k$, is also proportional to the attractiveness $\eta_i$. The value of $\eta_i \in [0, 1]$ is usually drawn from a uniform distribution $\rho(\eta_i)$ [2], and the connection probability is defined by

$$\Pi(k_i \mid j) = \frac{\eta_i k_i}{\sum_n \eta_n k_n}. \qquad (7)$$

When the fitness parameter is imposed, the network keeps a power-law degree distribution but with an exponent smaller than 3 (see Figure 4). According to the literature, $\gamma = 2.25$ in the thermodynamic limit. There are more privileged sites and, consequently, more hubs than in the BA network, which makes the exponent $\gamma$ smaller; that is, the network is more heterogeneous. In the inset of Figure 4 we show the degree correlation measured through the nearest-neighbors degree for the fitness model. The behavior is similar to the one we found for the BA network. We also calculated the mean clustering coefficient $\langle C \rangle$, the average shortest path length $\langle d \rangle$, and the Pearson correlation coefficient $c_P$. This information is shown in Table I. We observed that, when compared to the BA model, this network presents a higher clustering coefficient and a lower Pearson correlation coefficient, but the average shortest path length hardly changes.

Figure 4: Distribution of connectivity for the Bianconi-Barabási model with m = 3 , N = 10 based on 1000 network realizations (axes: $k$, $P(k)$, log-log scale). The dashed line, with slope $P(k) \sim k^{-2.25}$, is a guide for the eyes. This network has more privileged sites and, consequently, more hubs than the BA network, which explains its smaller value of $\gamma$. Inset: degree correlation measured through the nearest-neighbors degree for the same set of networks (axes: $k$, $k_{nn}(k)$). The behavior is similar to the one we found for the BA network.

C. Homophilic Model

In a social network, people tend to relate to others who share common characteristics such as musical taste, football team, religion, and work. To take this tendency into account in social network models, we can include a connection parameter called homophily.

Almeida and colleagues [3] proposed a model introducing this parameter through an intrinsic property of each node, called $\eta_i$, similar to the previous model. The homophily between any two nodes $i$ and $j$, say $A_{ij}$, is defined as the modulus of the difference between $\eta_i$ and $\eta_j$, that is, $A_{ij} = |\eta_i - \eta_j|$. The lower $A_{ij}$, the greater the affinity between the two nodes and, consequently, the greater the probability that they connect with each other. The proposed algorithm is as follows:

• It starts with $m$ sites connected to each other, in the same way as in the BA network, but introducing a characteristic $\eta_i$ for each node, chosen randomly from a uniform distribution $p(\eta)$ in the interval [0, 1].

• At each time step, add a node $j$ that links to $m$ other nodes already in the network. Each node $j$ connects preferentially to a node $i$ according to the probability

$$\Pi(k_i \mid j) = \frac{(1 - A_{ij}) k_i}{\sum_n (1 - A_{in}) k_n}. \qquad (8)$$

• The procedure of the second item is repeated until the network reaches a previously established size $N$.

Figure 5: Distribution of connectivity for the homophilic model (axes: $k$, $P(k)$; inset axes: $k$, $k_{nn}(k)$). Networks with size N = 10 and m = 3 were used, based on 1000 network realizations. The dashed line has slope $P(k) \sim k^{-2.75}$ as a guide for the eyes. Its $\gamma = 2.75$ is smaller than the $\gamma = 3.0$ of the BA network and larger than the $\gamma = 2.25$ of the fitness model.

The competition between the degree of connectivity and the affinity between the nodes generates a network with a power-law degree distribution, but with $\gamma \sim 2.75$, as shown in Figure 5. This value is smaller than the exponent obtained in the BA model ($\gamma = 3.0$) but greater than the value obtained in the fitness network ($\gamma = 2.25$). [...] a node $j$ can assume a value of $\eta_j = 0.5$, for example. When this happens, if it tries to connect to a node $i$ that has $\eta_i = 0.3$ or to a node $k$ which has $\eta_k = 0.7$, the affinities of both pairs $ij$ and $jk$ are the same. So, in this case, according to expression (8), what dictates the preference in the connection is the degree of the node [3].

In the inset of Figure 5 we show the degree correlation measured through the nearest-neighbors degree for the homophilic model. The behavior is similar to the one we found for the other networks. We also calculated the mean clustering coefficient $\langle C \rangle$, the average shortest path length $\langle d \rangle$, and the Pearson correlation coefficient $c_P$. This information is shown in Table I. We observed that, when compared to the BA model, this network presents almost the same $\langle d \rangle$ and $c_P$ but a slightly higher clustering coefficient.

D. Networks with Euclidean distance

The models presented above do not take into account the spatial distance between the agents that compose the network. But in many real systems, that variable plays an important role. For example, in the city model proposed by Ribeiro and collaborators [28], the authors observed how the Euclidean distance influences the potential of cities and the scaling laws of socio-economic and infrastructure indicators. Other works have shown the relation between social interaction and spatial properties [20, 21, 29]. For example, in Ref. [20] the authors analyzed online social networks and found that spatial distance restricts who interacts with whom and that more densely connected groups tend to arise at shorter spatial distances. In the following subsections we reconstructed the standard models shown previously, including the Euclidean distance between the nodes as an attachment ingredient.
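As a reference point before distance is introduced, the three distance-free rules of Eqs. (1), (7) and (8) can be placed side by side in a small Python sketch. It is illustrative only: the function `attachment_probabilities` and its arguments are hypothetical names, and the homophily term is taken as the affinity between each existing node and the incoming node $j$.

```python
def attachment_probabilities(degrees, eta, eta_new=None, rule="ba"):
    """Return the probabilities Pi(k_i | j) over the existing nodes i.

    degrees[i] is k_i, eta[i] is the intrinsic value eta_i, and eta_new
    is eta_j of the incoming node (used only by the homophilic rule).
    """
    if rule == "ba":
        # Eq. (1): weight k_i.
        weights = [float(k) for k in degrees]
    elif rule == "fitness":
        # Eq. (7): weight eta_i * k_i.
        weights = [e * k for e, k in zip(eta, degrees)]
    elif rule == "homophily":
        # Eq. (8): weight (1 - |eta_i - eta_j|) * k_i.
        if eta_new is None:
            raise ValueError("the homophilic rule needs eta_new")
        weights = [(1.0 - abs(e - eta_new)) * k for e, k in zip(eta, degrees)]
    else:
        raise ValueError(f"unknown rule: {rule}")
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    degrees = [2, 2, 4]
    eta = [0.3, 0.7, 0.5]
    for rule in ("ba", "fitness", "homophily"):
        print(rule, attachment_probabilities(degrees, eta, eta_new=0.5, rule=rule))
```

In the usage example the first two nodes have the same degree and are equally distant in $\eta$ from the incoming node, so the homophilic rule assigns them essentially the same probability and the degree decides, which mirrors the argument given for Eq. (8).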

1. Model proposed by Soares et al.

Soares and colleagues [4] built a model in which the preferential attachment dynamics depends on the degree of connectivity but also considers the Euclidean distance between the nodes. To build the model, we consider that each site is placed on a continuous plane. The first node is added at an arbitrary distance from the origin and the others are isotropically distributed with a probability $P_G(r) \propto r^{-(2+\alpha_G)}$, which depends on the distance $r$ from the center of mass, which is positioned at $\vec{r}_{cm}$ from the origin and is recalculated at each time step. The exponent $\alpha_G$ (G refers to the word growth) controls the network growth, that is, it defines how close or distant the nodes will be placed. The position of the center of mass is given by $\vec{r}_{cm} = \frac{1}{M} \sum_{n=1}^{N} m_n \vec{r}_n$, where $m_n$ is the mass of the node $n$, $\vec{r}_n$ is the position vector of this node relative to the origin, and $M = \sum_{n=1}^{N} m_n$ is the total mass. The network has a total of $N$ nodes, and we can consider each node with mass $m_n = 1$, so we have $\vec{r}_{cm} = \frac{1}{N} \sum_{n=1}^{N} \vec{r}_n$. Each new site $j$ connects to a pre-existing node $i$ following a preferential attachment rule that depends on the distance between them, $r_{ij}$, and on the degree of connectivity of the node $i$, that is,

$$\Pi(k_i \mid j) = \frac{k_i r_{ij}^{-\alpha_A}}{\sum_n k_n r_{in}^{-\alpha_A}}. \qquad (9)$$

The exponent $\alpha_A$ (A refers to the word attachment) controls the influence of the spatial distance between sites on the preferential attachment. If $\alpha_A = 0$, we recover the BA network, which does not take into account the spatial distance between the nodes. This model preserves the preferential attachment according to the degree of the nodes, but also takes into account the geographical distance as a criterion in the competition for links. In Figure 6 we show an example of a network generated according to this algorithm.

Figure 6: An example of a network with size N = 20 generated according to the algorithm proposed by Soares et al. [4] using $\alpha_A = 2$ and $\alpha_G = 2$. Note that links between nearby nodes are more likely, but long-range connections can also happen.

Numerical results show that the parameter $\alpha_G$ does not affect the behavior of the connectivity distribution $P(k)$ of the network (see Figure 7(b)). This parameter refers only to the distribution of distances relative to the center of mass. It acts only on the size scale and not on the structure of the network, and consequently it does not affect the preferential attachment rules. On the other hand, as $\alpha_A$ increases, the connectivity distribution changes (see Figure 7(a)). Soares et al. [4] showed that the degree distributions of networks generated according to their model are very well fitted by the form

$$P(k) = P(0)\, e_q^{-k/\kappa}, \qquad (10)$$

where $\kappa >$ [...] $P(0)$ is a constant to be normalized, $q$ is the entropic index, and $e_q^x$ is the q-exponential defined by

$$e_q^x \equiv [1 + (1 - q) x]^{1/(1-q)}, \qquad (11)$$

of which the natural exponential function is a particular case: $e^x = e_{q=1}^x$.

The authors [4] showed that both $\kappa$ and $q$ are functions of $\alpha_A$. So, as $\alpha_A$ increases, a topological phase transition occurs in the connectivity distribution [4, 30]. The network changes from a completely heterogeneous network ($\alpha_A = 0$) to an increasingly homogeneous network as $\alpha_A$ tends to infinity. This phase transition also appears in the degree correlation of the nodes, as we show in the calculation of $k_{nn}(k)$ for different values of $\alpha_A$ (see Figure 8(a)). The transition is clearer in Figure 8(b), in which we show Pearson's coefficient as a function of $\alpha_A$. Close to $\alpha_A = 2$, Pearson's coefficient changes from a negative value, which characterizes a disassortative network, to a positive value, which characterizes an assortative network.

Finally, two more pieces of evidence for the topological phase transition can be discussed. We can measure the level of heterogeneity of a network using the quantity $\kappa = \langle k^2 \rangle / \langle k \rangle$, where $\langle k^p \rangle$ is the $p$-th moment of the degree distribution. If $\kappa / \langle k \rangle >$ [...] $N \to \infty$, while for homogeneous networks $\kappa / \langle k \rangle \approx$ [...] $\alpha_A$ increases, because $\kappa / \langle k \rangle$ decreases and approaches one. We also calculated the average shortest path length. When the network becomes more homogeneous, the average shortest path length increases because the number of hubs decreases and consequently the path between the nodes increases. This measure, also shown in Figure 9, reinforces the topological phase transition.
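The ingredients of this subsection (placement around the center of mass, the attachment rule of Eq. (9), the q-exponential of Eq. (11) and the heterogeneity ratio $\kappa / \langle k \rangle$) can be combined in a short Python sketch. It is a sketch under explicit assumptions, not the implementation of Ref. [4]: the function names are hypothetical, each new node creates a single link, $P_G(r)$ is treated as the density of the radial distance with a cutoff $r \geq 1$, and the first node is placed at the origin.

```python
import math
import random


def q_exponential(x, q):
    """q-exponential of Eq. (11); equals exp(x) for q = 1."""
    if q == 1:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        raise ValueError("1 + (1 - q) x must be positive in this sketch")
    return base ** (1.0 / (1.0 - q))


def spatial_network(n_nodes, alpha_a, alpha_g, seed=None):
    """Grow a network with the distance-dependent rule of Eq. (9).

    Returns (positions, edges, degrees). One link per new node and the
    cutoff r >= 1 are assumptions of this sketch.
    """
    rng = random.Random(seed)
    positions = [(0.0, 0.0)]  # first node; its location is arbitrary
    degrees = [0]
    edges = []
    for j in range(1, n_nodes):
        # Center of mass of the existing nodes (unit masses).
        cx = sum(x for x, _ in positions) / len(positions)
        cy = sum(y for _, y in positions) / len(positions)
        # Radial distance with density proportional to r^-(2 + alpha_g)
        # for r >= 1, drawn by inverse-transform sampling; isotropic angle.
        r = (1.0 - rng.random()) ** (-1.0 / (1.0 + alpha_g))
        theta = rng.uniform(0.0, 2.0 * math.pi)
        new = (cx + r * math.cos(theta), cy + r * math.sin(theta))
        if j == 1:
            target = 0  # only one node exists, so there is no choice
        else:
            weights = []
            for (x, y), k in zip(positions, degrees):
                dist = max(math.hypot(new[0] - x, new[1] - y), 1e-12)
                weights.append(k * dist ** (-alpha_a))  # k_i r_ij^-alpha_A
            target = rng.choices(range(len(positions)), weights=weights)[0]
        positions.append(new)
        degrees.append(1)
        degrees[target] += 1
        edges.append((j, target))
    return positions, edges, degrees


def heterogeneity(degrees):
    """Return <k^2> / <k>^2, i.e. kappa / <k> with kappa = <k^2> / <k>."""
    mean = sum(degrees) / len(degrees)
    mean_sq = sum(k * k for k in degrees) / len(degrees)
    return mean_sq / mean ** 2


if __name__ == "__main__":
    for alpha_a in (0.0, 2.0, 5.0):
        _, _, degrees = spatial_network(2000, alpha_a, alpha_g=2.0, seed=3)
        print(alpha_a, round(heterogeneity(degrees), 3))
```

Setting `alpha_a = 0` removes the distance factor and leaves pure degree-based attachment, which is the limit in which the text recovers the BA rule. Single runs of this size are noisy, so any comparison with the trend described above would need averaging over many realizations.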

2. Variations of the model proposed by Soares et al.

It is possible to include the Euclidean distance in the homophilic and fitness models, as investigated by Nunes and collaborators [30]. For example, when we study the social interaction in a city [28], the parameter $\eta_i$ can represent the influence of different places located in the city. So it is possible to use the fitness model with Euclidean distance to try to explain, for example, why some places in a city are more attractive than others for opening a store, coffee shop, or gas station. We can also use the homophilic model including spatial distance to study the influence of the topology on the formation of neighborhoods, since people tend to cluster with people who have a similar social class, religion, or workplace [17, 31, 32].

The algorithms used to construct both models were already shown in previous sections. Now we just have to include the metric, using the function $P_G \sim r^{-(\alpha_G + 2)}$ to distribute the nodes in a continuous plane, and change the preferential attachment rules, which become

$$\Pi(k_i \mid j) = \frac{\eta_i k_i r_{ij}^{-\alpha_A}}{\sum_n \eta_n k_n r_{in}^{-\alpha_A}} \qquad (12)$$

and

$$\Pi(k_i \mid j) = \frac{(1 - A_{ij}) k_i r_{ij}^{-\alpha_A}}{\sum_n (1 - A_{in}) k_n r_{in}^{-\alpha_A}}, \qquad (13)$$

for the fitness and homophilic models, respectively.

Nunes and collaborators [30] also showed a topological phase transition as $\alpha_A$ increases for the fitness model, and no influence of the parameter $\alpha_G$ on the pattern of the connectivity distribution. We obtained the same results for homophilic networks. The data are not shown because they are very similar to the results shown in Figure 7.

Figure 7: (a) Distribution of connectivity for different values of $\alpha_A$ (0 to 5) and a fixed $\alpha_G = 2$ (axes: $k$, $P(k)$). By varying $\alpha_A$, a topological transition appears close to $\alpha_A = 2$. For $\alpha_A = 0$, the BA network is recovered, meaning that the spatial distance between nodes is not taken into account. The network changes from a completely heterogeneous network ($\alpha_A = 0$) to an increasingly homogeneous network as $\alpha_A$ tends to infinity. (b) Distribution of connectivity for different values of $\alpha_G$ and a fixed $\alpha_A = 2$. By varying $\alpha_G$ the distribution does not change significantly. Results obtained with networks of size N = 10 , averaged over 1000 samples.

Figure 8: Degree correlation for a network of size N = 10 and different values of $\alpha_A$. Panel (a) shows $k_{nn}$ as a function of $k$, highlighting the change from a weakly disassortative regime to an assortative one. Panel (b) shows Pearson's coefficient as a function of $\alpha_A$. Close to $\alpha_A = 2$, Pearson's coefficient changes from a negative value, which characterizes a disassortative network, to a positive value, which characterizes an assortative network. In both (a) and (b) the average was taken over 1000 samples.

Figure 9: Measure of network heterogeneity defined by $\kappa / \langle k \rangle$ (blue line), as a function of $\alpha_A$. If $\kappa / \langle k \rangle >$ [...] $\alpha_A$ increases. We also plot the average shortest path length (red line). When the network becomes more homogeneous, the average shortest path length increases, as the hubs disappear and the distance (number of links that connect any pair of nodes) between the nodes increases. This measure also reinforces the topological phase transition. We performed 1000 samples of networks with size N = 10 .

III. REAL NETWORKS

Most social networks are assortative, while technological ones tend to be disassortative [6, 9]. To support this evidence and show that the models studied in this article are successful in modeling real systems, we chose three real networks to investigate two distinct properties: degree distribution and assortativity. The networks are:

• Phone calls: nodes represent cell phone users and an edge exists between two users if they have called each other at least once during the investigated period. Data are from [11].

• Collaboration network: each node represents an author in a scientific collaboration and an edge between two authors indicates that they co-authored at least one paper in the period from January 1993 to April 2003. The data were obtained from the arXiv Condensed Matter Physics preprints [13].

• Email: nodes are email addresses and a directed link from one node to another represents at least one email sent. The data were collected during 112 days at the University of Kiel (Germany) [12].

According to Table II, Pearson's coefficient of the phone calls and collaboration networks is positive, while for the email network this coefficient is negative. The first two are social networks and they are basically related to family/friendship and professional interactions, respectively. In Ref. [9], Newman found similar results for biology and mathematics coauthorship. However, the email network, although it also describes some social interaction, behaves more like a technological network. In the reference cited above, the author also found a similar value for the World Wide Web.

Table II: Size N and Pearson’s correlation coeﬃcient r fordiﬀerent real networks. We compared the values with thePearson’s coeﬃcient calculated for synthetic networks withthe same size. For the phone calls network, we used the ho-mophilic network including euclidean distance ( α A = 5). Forthe collaboration network, we used the BA network includ-ing euclidean distance ( α A = 5) and ﬁnally for the emailnetwork, we used the ﬁtness network including euclidean dis-tance ( α A = 1).Real Network N c P c P (synthetic network)Phone Calls 36594 0.282 0.120Collaboration 23132 0.134 0.112Email 57194 -0.075 - 0.078 Now, we can compare this real systems with our in-vestigated models. In the case of phone calls network,we used the homophilic model and we investigated howthe euclidean distance between the nodes of the systemaﬀects the network’s degree distribution. Homophilicmodel was chosen because it is reasonable to assume thattelephone calls happen between people who have a cer-tain aﬃnity with each other, whether for personal, familyor professional reasons. This hypothesis is corroboratedin recent works [16, 17].The Pearson correlation coeﬃcient of the investigatedsynthetic network is not very similar to the value ob-tained for the real network (see table II). However weappreciated how accurate the degree distribution of thissynthetic network is when compared to the real one. Asshown in ﬁgure 10, when we used the attachment pa-rameter α A = 5, the ﬁts works extremely well, empha-sizing the importance of considering geographic distancebetween the elements of the system when modeling realsocial networks. Indeed, a lot of work have followed thisline [20, 21, 29].The same analysis can be done for the collaborationnetwork. The Pearson correlation coeﬃcient of the syn-thetic network is similar to the one calculated for thereal system. In this study, the only change was the syn-thetic network investigated. 
We chose the traditional Barabási-Albert model, now also including the Euclidean distance, and, as shown in Figure 11, the fit using α_A = 5 is accurate as well. In networks of scientific co-authorship, more distinguished researchers, such as university professors, tend to publish works with less famous researchers, such as their graduate students. This supports both assumptions: the BA preferential attachment rule according to the degree of the node and the influence of the distance between the elements of the system. For the collaboration network, the fitness and homophilic models also showed reasonable results. As we increase the value of α_A, the preferential attachment rule involving the Euclidean distance prevails over the others.

Figure 10: Degree distribution of a phone calls network compared with the distinct degree distributions of synthetic networks with the same size generated according to the homophilic model including Euclidean distance. Black points represent real network data and solid colored lines are related to synthetic networks. The synthetic network with α_A = 5 fits the data best. (Axes: P(k) versus k; legend: α_A = 0, 1, 2, 3, 4, 5.)

Figure 11: Degree distribution of a collaboration network compared with the distinct degree distributions of synthetic networks with the same size generated according to the Barabási-Albert model including Euclidean distance. We observe that the last scenario (α_A = 5) fits the real data best. Black points represent real network data and solid colored lines are related to synthetic networks. (Axes: P(k) versus k; legend: α_A = 0, 1, 2, 3, 4, 5.)

Finally, the email network presents a Pearson correlation coefficient very similar to that of the fitness synthetic network including the Euclidean distance. But here the parameter α_A = 1 gives the best fit to the degree distribution of the real data.
This indicates a smaller impact of the geographical distance between nodes in technological networks than in social ones. It can also be related to the fact that this real network has directed links. As this email network was obtained from a university, the fitness model was chosen based on the fact that, in academia, students tend to send more emails to teachers than the other way around. So whether a message is sent depends on how influential (greater fitness) the receiver is.

Figure 12: Degree distribution of an email network compared with the distinct degree distributions of synthetic networks with the same size generated according to the fitness model including Euclidean distance. We observe that the second scenario (α_A = 1) fits the real data best. Black points represent real network data and solid colored lines are related to synthetic networks. (Axes: P(k) versus k; legend: α_A = 0, 1, 2, 3, 4, 5.)

IV. CONCLUSIONS

In this work we have studied network models with growth and different rules of preferential attachment. We reviewed some important algorithms, such as the Barabási-Albert model and others that include fitness, homophily and/or Euclidean distance as strategies to make connections between nodes. From an applied perspective, these models are useful to model real-world networks because they present characteristics found in social systems and also in technological ones, as we showed in the last section.

Our results corroborate the evidence that power-law degree distributions are not very common in real systems. We evaluated two social networks and one technological network, and we compared the degree distribution of these networks to degree distributions generated by growth and preferential attachment models. Our main conclusion is that the real networks analysed are better fitted by models which consider traits such as fitness, homophily and Euclidean distance between nodes. We observed that the geographic distance between nodes seems to be an important factor, especially when modeling real social systems. This feature changes the form of the degree distribution from a power law to a q-exponential, according to the model proposed by Soares and collaborators [4]. Our results are in agreement with recent studies involving real networks [7, 11–15, 20, 21, 29, 33].

We also supplemented the characterization of these synthetic networks by investigating measures such as clustering, average shortest path length, degree distribution and assortativity.

Finally, it is important to mention that many dynamical processes, such as epidemics, rumor propagation and synchronization, have been extensively investigated in scale-free topologies such as the Barabási-Albert network. However, the study of these dynamics on substrates where the distance between the elements of the system is taken into account needs to advance further, since this element has already been shown to be very important.
Even in online social networks [16, 17, 19–21, 29], it seems to influence the connections between nodes, as do fitness and homophily.
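For reference, the q-exponential form of the degree distribution mentioned above can be written explicitly. The expression below is a sketch based on the standard definition of the q-exponential function; the symbols q and κ are fitting parameters introduced here for illustration, and the exact parametrization used in [4] may differ.

```latex
P(k) \;\propto\; e_q^{-k/\kappa}
\;\equiv\; \left[\, 1 + (q-1)\,\frac{k}{\kappa} \,\right]^{-\frac{1}{q-1}},
\qquad q > 1,\ \kappa > 0 .
```

In the limit q → 1 this reduces to the ordinary exponential e^{-k/κ}, while for k much larger than κ/(q-1) it decays as the power law k^{-1/(q-1)}. The q-exponential therefore interpolates between an exponential-like behavior at small degrees and a power-law tail at large degrees.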

Acknowledgments

This work was partially supported by the Brazilian agencies CAPES, CNPq and FAPEMIG. The authors acknowledge computational time at DFI-UFLA. Angélica S. Mata thanks the support from FAPEMIG (Grant No. APQ-02482-18) and CNPq (Grant No. 423185/2018-7). Fabiano L. Ribeiro thanks the support from CNPq (405921/2016-0) and CAPES (88881.119533/2016-01).

[1] A.-L. Barabási and R. Albert, Science, 509 (1999).
[2] G. Bianconi and A.-L. Barabási, EPL (Europhysics Letters), 436 (2001).
[3] M. L. de Almeida et al., The European Physical Journal B - Condensed Matter and Complex Systems, 38 (2013).
[4] D. J. Soares, C. Tsallis, A. M. Mariz, and L. R. da Silva, EPL (Europhysics Letters), 70 (2005).
[5] F. Radicchi, Nature Physics, 597 (2015).
[6] M. E. J. Newman and J. Park, Phys. Rev. E, 036122 (2003), URL https://link.aps.org/doi/10.1103/PhysRevE.68.036122
[7] F. Radicchi and C. Castellano, Nature Communications, 10196 (2015), URL https://doi.org/10.1038/ncomms10196
[8] G. Caldarelli, Scale-free networks: complex webs in nature and technology (Oxford University Press, 2007).
[9] M. E. J. Newman, Phys. Rev. Lett., 208701 (2002), URL https://link.aps.org/doi/10.1103/PhysRevLett.89.208701
[10] A.-L. Barabási and M. Pósfai, Network Science (Cambridge University Press, 2016), ISBN 9781107076266, URL https://books.google.com.br/books?id=iLtGDQAAQBAJ
[11] C. Song, Z. Qu, N. Blumm, and A.-L. Barabási, Science, 1018 (2010), ISSN 0036-8075, https://science.sciencemag.org/content/327/5968/1018.full.pdf, URL https://science.sciencemag.org/content/327/5968/1018
[12] H. Ebel, L.-I. Mielsch, and S. Bornholdt, Phys. Rev. E, 035103 (2002), URL https://link.aps.org/doi/10.1103/PhysRevE.66.035103
[13] J. Leskovec, J. Kleinberg, and C. Faloutsos, ACM Trans. Knowl. Discov. Data, 2–es (2007), ISSN 1556-4681, URL https://doi.org/10.1145/1217299.1217301
[14] A. D. Broido and A. Clauset, Nature Communications, 1 (2019).
[15] P. Holme, Nature Communications, 1 (2019).
[16] S. Currarini, J. Matheson, and F. Vega-Redondo, European Economic Review, 18 (2016).
[17] H. Bisgin, N. Agarwal, and X. Xu, in Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on (IEEE, 2010), vol. 1, pp. 533–536.
[18] A. F. Rozenfeld, R. Cohen, D. ben Avraham, and S. Havlin, Phys. Rev. Lett., 218701 (2002), URL https://link.aps.org/doi/10.1103/PhysRevLett.89.218701
[19] B. A. C. H. L. W. Y. Q. X. L. X. Liu, L.; Chen, ISPRS Int. J. Geo-Inf (2018).
[20] V. Y. S. S. M. C. Laniado, D. and A. Kaltenbrunner, Information Systems Frontiers (2017), URL https://doi.org/10.1007/s10796-017-9784-9
[21] B. Lengyel, A. Varga, B. Ságvári, Á. Jakobi, and J. Kertész, PLoS ONE (2015).
[22] B. Bollobás and O. M. Riordan, Handbook of graphs and networks: from the genome to the internet, pp. 1–34 (2003).
[23] S. N. Dorogovtsev and J. F. Mendes, Advances in Physics, 1079 (2002).
[24] D. J. Watts and S. H. Strogatz, Nature, 440 (1998).
[25] S. N. Dorogovtsev and J. F. F. Mendes, CoRR cond-mat/0404593 (2004), URL http://arxiv.org/abs/cond-mat/0404593
[26] R. Pastor-Satorras, A. Vázquez, and A. Vespignani, Physical Review Letters, 258701 (2001).
[27] A. Barrat, M. Barthelemy, and A. Vespignani, Dynamical processes on complex networks (Cambridge University Press, 2008).
[28] F. L. Ribeiro, J. Meirelles, F. F. Ferreira, and C. R. Neto, Royal Society Open Science, 160926 (2017).
[29] X. Liu, Y. Xu, and X. Ye, Outlook and Next Steps: Integrating Social Network and Spatial Analyses for Urban Research in the New Data Environment (Springer International Publishing, Cham, 2019), pp. 227–238, ISBN 978-3-319-95351-9, URL https://doi.org/10.1007/978-3-319-95351-9_13
[30] T. C. Nunes, S. Brito, L. R. da Silva, and C. Tsallis, Journal of Statistical Mechanics: Theory and Experiment, 093402 (2017).
[31] V. M. Netto, J. Meirelles, and F. L. Ribeiro, Complexity (2017).
[32] V. M. Netto, E. Brigatti, J. Meirelles, F. L. Ribeiro, B. Pace, C. Cacholas, and P. Sanches, Entropy, 1 (2018), ISSN 10994300.
[33] F. F. L. Ribeiro, Revista Brasileira de Ensino de Fisica 39