Quantifying China's Regional Economic Complexity

Abstract

China has experienced an outstanding economic expansion during the past decades; however, literature on non-monetary metrics that reveal the status of China's regional economic development is still lacking. In this paper, we fill this gap by quantifying the economic complexity of China's provinces through analyzing 25 years of firm data. First, we estimate the regional economic complexity index (ECI) and show that the overall time evolution of provinces' ECI is relatively stable and slow. Then, after linking ECI to economic development and income inequality, we find that the explanatory power of ECI is positive for the former but negative for the latter. Next, we compare different measures of economic diversity and explore their relationships with monetary macroeconomic indicators. Results show that the ECI and the non-linear-iteration-based Fitness index are comparable, and that both have stronger explanatory power than other benchmark measures. Further multivariate regressions suggest that our results are robust after controlling for other socioeconomic factors. Our work moves a step forward towards better understanding China's regional economic development and non-monetary macroeconomic indicators.


arXiv [q-fin.EC]

Jian Gao a,b,∗, Tao Zhou a,b,∗∗
a CompleX Lab, Web Sciences Center, University of Electronic Science and Technology of China, Chengdu 611731, People’s Republic of China
b Big Data Research Center, University of Electronic Science and Technology of China, Chengdu 611731, People’s Republic of China

Keywords: Economic complexity, Non-linear science, Economic development, Network science, Entropy

1. Introduction

Understanding how economies develop to prosperity and figuring out the best indicators that reveal the status of economic development are long-standing challenges in economics [1, 2], with far-reaching implications for practical applications. Traditional macroeconomic indicators, like Gross Domestic Product (GDP), are widely applied to reveal the status of economic development; however, calculating these census-based indicators is usually costly, resource-consuming and subject to long delays [3]. Thanks to the data revolution of the past decades [4], a branch of economic research has been moving to data-driven approaches within the methodology of natural science, statistical physics and complexity sciences [5, 6, 7], which makes it possible to introduce new metrics that surpass the traditional economic measures in revealing current economic status and predicting future economic growth, with applications to economic development [8, 9], trading behavior [10], poverty [11, 12], inequality [13, 14], unemployment [15, 16], and industrial structure [9, 17]. Economists and physicists have also introduced a variety of non-monetary metrics to quantitatively assess a country's economic diversity and competitiveness by measuring intangible assets of the economic system [18, 19], allowing for quantifying economies' hidden potential for future development [20, 21] in near real-time and at low cost.

In recent decades, physicists have done much work on quantifying the complexity of socioeconomic systems and financial markets, helping to move economic research forward by introducing physics-related approaches and models into economic and financial studies [22, 23]. In particular, as an interdisciplinary field, econophysics [24, 25] applies theories and methods originally developed by physicists to solve problems in economics and statistical finance [26].
Recently, econophysicists have proposed network measures to reveal the true risks associated with institutions to make financial markets more stable [27] and studied the complex correlations and trend switchings in financial time series [28]. Moreover, economists and physicists have applied network and statistical methods to reshape the understanding of international trade, showing that knowledge about exporting to a destination diffuses among related products and geographic neighbors [29]. Besides, some physical processes like iterative refinement and resource allocation have been widely applied to evaluate online reputation in socioeconomic systems [30, 31] and to build better recommender systems in e-commerce [32, 33]. More recent works on econophysics and complexity are summarized in review papers [34, 35, 36] and books [37, 38].

∗ E-mail address: [email protected] (J. Gao)
∗∗ E-mail address: [email protected] (T. Zhou)
Preprint submitted to arXiv, November 15, 2017

Towards quantifying the complexity of a country's economy, the pioneering attempt was made by Hidalgo and Hausmann [18], who modeled international trade flows as "Country-Product" networks and derived the Economic Complexity Index (ECI) by characterizing the network structure through a set of linear iterative equations, coupling the diversity of a country (the number of products exported by that country) and the ubiquity of a product (the number of countries exporting that product). The intuition behind this new branch of studies is that cross-country income differences can be explained by differences in economic complexity, which is measured by the diversity of a country's "capabilities" [18, 19]. Soon after, Tacchella et al. [39] developed a new statistical approach which defines a country's Fitness and a product's complexity by the fixed points of a set of non-linear iterative equations [40], where the complexity of products is bounded by the fitness of the less competitive countries exporting them. Further, Cristelli et al. [21] studied the heterogeneous dynamics of economic complexity and found, in the fitness-income plane, strong explanatory power for economic development in the laminar regime and weak explanatory power in the chaotic regime. Based on this observation, they argued that regressions are inappropriate for dealing with this heterogeneous scenario of economic development and further proposed a selective predictability scheme to predict the evolution of countries. Nevertheless, these economic complexity indicators are not perfect: for example, ECI has been criticized for its self-consistency, Fitness depends on the dimension of the phase space of the heterogeneous dynamics of economic complexity [20, 21], and a new variant of the Fitness method, called the minimal extremal metric, can perform even better for a noise-free dataset [41]. Recently, Mariani et al. [42] quantitatively compared the ability of ECI and Fitness in ranking countries and products, and further investigated a generalization of the "Fitness-Complexity" metric.

Even though there is a body of literature on inferring complexity [43, 44] using cross-country data recording world trade flows [45, 46], studies on China's regional economic complexity using firm-level data are still missing. On the one hand, previous studies mainly focus on measuring economic competitiveness at the international level, while regional-level complexity within a country is largely ignored. In other words, whether economic complexity can be successfully extended and tested across different scales is still unknown. On the other hand, most previous economic complexity analyses are based on world trade data [18, 39], meaning that industries without exported products, such as services, are excluded.
However, not only goods but also services are important for measuring economic complexity, as the growth of services and their sophistication can provide an additional route to economic growth [47].

Moreover, China has experienced a great economic expansion during the past decades. However, some questions regarding China's development remain puzzling: for example, how did China grow [48], what happened to regional development within China [9, 49], which metric should be used to measure regional economic complexity [42], and what is the predictive power of complexity for regional development and inequality [14, 50]. Fortunately, the development of statistical methods (for example, the methodology contributed by econophysicists) and the availability of China's firm-level data (a dataset that includes all types of industries) provide a promising way to explore regional economic complexity within a country, and offer a chance to explore how non-monetary economic complexity correlates with traditional monetary macroeconomic indicators at the regional level.

In this paper, we study China's regional economic complexity by analyzing publicly listed firm data from 1990 to 2015. We start by estimating the Economic Complexity Index (ECI) of China's provinces based on the structure of the "Province-Industry" network. We show that diversified provinces tend to have industries of lower ubiquity, and that the overall time evolution of the provinces' rankings by ECI is relatively stable and slow, with provinces located along the coast having higher economic complexity. Then, after linking complexity with economic development and income inequality, we find that ECI is a positive and significant indicator of economic development, with higher explanatory power for provinces with a lower level of GDP per capita (GDP pc), which are located in the laminar regime of the ECI-ln(GDP pc) plane, than for provinces with a high level of GDP pc, which are located in the chaotic regime.
In addition, ECI has negative and significant explanatory power for regional income inequality in China. Moreover, we compare different measures of economic diversity and explore their relationships with monetary macroeconomic indicators. Results suggest that Fitness is comparable with ECI, and that both perform better than Diversity and Entropy in correlating with GDP pc. Further, we show that the predictive powers of ECI and Fitness are robust by using multivariate regressions after controlling for other socioeconomic factors. Our work contributes to the literature on regional economic complexity.

The paper is organised in the following way. Section 2 introduces the data and the implementation of economic diversity metrics. Section 3 presents the results on China's regional economic complexity and its connections with income inequality. Finally, Section 4 provides conclusions and discussion.

2. Data and Methods

We study regional economic complexity by using China's publicly listed firm data, which were extracted from the RESSET Financial Research Database, provided by Beijing Gildata RESSET Data Tech Co., Ltd. (http://[…]). […] relative income differences (RICD) as an estimation, which is defined by the ratio of RICU to RICR, to measure the level of income inequality between urban and rural areas in China. For the population, we use the resident population at year-end. For the urbanization metric, we use the share of urban area in a province as an estimation. For the metrics of schooling and innovation, we use the ratio of students in higher education in a province and the number of domestic granted patents, respectively. For foreign trade, we use the total value of imports and exports of destinations and catchments. All these macroeconomic data except for income inequality were extracted from the China Statistical Yearbook, published by the National Bureau of Statistics of China (http://[…]).

[…] $M_{p,i}$, where $M_{p,i} = 1$ if province $p$ has the revealed comparative advantage (RCA) [51] in industry $i$ ($\mathrm{RCA}_{p,i} \ge 1$) and 0 otherwise [18, 52]. Here, we define the RCA as the ratio between the observed number of firms operating in an industry in a province and the expected number of firms of that industry in that province. Formally, the RCA for province $p$ in industry $i$ is defined as

$$\mathrm{RCA}_{p,i} = \frac{x_{p,i}}{\sum_{i'} x_{p,i'}} \Bigg/ \frac{\sum_{p'} x_{p',i}}{\sum_{i'} \sum_{p'} x_{p',i'}}, \qquad (1)$$

where $x_{p,i}$ is the number of firms in province $p$ that operate in industry $i$. Further, the diversity of a province $p$ is defined as the number of industries in which the province has the comparative advantage,

$$\mathrm{Diversity} = k_{p,0} = \sum_{i} M_{p,i}. \qquad (2)$$

The ubiquity of industry $i$ is defined as the number of provinces with the comparative advantage in that industry,

$$\mathrm{Ubiquity} = k_{i,0} = \sum_{p} M_{p,i}. \qquad (3)$$

Finally, the Economic Complexity Index (ECI) of province $p$ is defined as

$$\mathrm{ECI}_p = \frac{K_p - \langle \vec{K} \rangle}{\mathrm{std}(\vec{K})} = \frac{m^{2} K_p - m \sum_{p'} K_{p'}}{\sqrt{m \sum_{p'} \left( m K_{p'} - \sum_{p''} K_{p''} \right)^{2}}}, \qquad (4)$$

where $m$ is the number of provinces, $\langle \cdot \rangle$ and $\mathrm{std}(\cdot)$ are respectively the mean and the standard deviation of the elements of the vector $\vec{K}$, and $\vec{K}$ is the eigenvector associated with the second largest eigenvalue of the matrix [40]

$$\tilde{M}_{p,p'} = \frac{1}{k_{p,0}} \sum_{i} \frac{M_{p,i} M_{p',i}}{k_{i,0}}. \qquad (5)$$

Indeed, the matrix $\tilde{M}_{p,p'}$ connects provinces that have similar industries, weighted by the inverse of the ubiquity of an industry ($k_{i,0}$) and normalized by the diversity of a province ($k_{p,0}$) [19].

Figure 1: Quantifying regional economic complexity. (A) Illustration of a "Province-Industry" bipartite network. The weight of link $x_{p,i}$ is the number of firms in province $p$ that operate in industry $i$. (B) The "Diversity-Ubiquity" diagram, with diversity ($k_{p,0}$) on the horizontal axis and ubiquity ($k_{p,1}$) on the vertical axis, divided into four quadrants defined by the average diversity $\langle k_{p,0} \rangle$ and ubiquity $\langle k_{p,1} \rangle$, as shown by the vertical and horizontal lines, respectively (Pearson's $r = -0.777$, $p = 2.8 \times 10^{-7}$). The abbreviations of province names correspond to Table A1 in the Appendix.

The Fitness Index [39] is based on the idea that i) a diversified province gives limited information on the complexity of industries, and ii) a poorly diversified province is more likely to have a specific industry of a low level of sophistication. Therefore, a non-linear iteration is needed to bound the complexity of industries by the fitness of the less competitive provinces having them [20, 21]. Here, the Fitness of a province is proportional to the number of its industries weighted by their complexity. In turn, the complexity of an industry is inversely proportional to the number of provinces that have this industry (similar methods were proposed earlier to deal with recommender systems [31, 53]). The coupling of province $p$'s Fitness ($F_p$) to industry $i$'s complexity ($Q_i$) is summarized in the following non-linear iterative scheme:

$$\begin{cases} \tilde{F}_p^{(n)} = \sum_{i} M_{p,i} Q_i^{(n-1)} \\ \tilde{Q}_i^{(n)} = \dfrac{1}{\sum_{p} M_{p,i} \dfrac{1}{F_p^{(n-1)}}} \end{cases}, \qquad (6)$$

where $\tilde{F}_p^{(n)}$ and $\tilde{Q}_i^{(n)}$ are respectively normalized in each step by $F_p^{(n)} = \tilde{F}_p^{(n)} / \langle \tilde{F}_p^{(n)} \rangle$ and $Q_i^{(n)} = \tilde{Q}_i^{(n)} / \langle \tilde{Q}_i^{(n)} \rangle$, given the initial condition $F_p^{(0)} = Q_i^{(0)} = 1$. The non-linear iterations continue until the stationary state is reached, and the final Fitness value reflects complexity.
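As a rough illustration of Eqs. (1)-(3) and (6), the following Python sketch builds the binary "Province-Industry" matrix from firm counts and then runs the Fitness iteration. It is only a minimal sketch under stated assumptions: the function names (`rca_matrix`, `binarize`, `fitness_complexity`), the toy count matrix, the iteration cap and the stopping tolerance are illustrative choices that do not come from the paper.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def rca_matrix(x: Matrix) -> Matrix:
    """Eq. (1): share of industry i within province p over its overall share."""
    row_sums = [sum(row) for row in x]          # firms per province
    col_sums = [sum(col) for col in zip(*x)]    # firms per industry
    total = sum(row_sums)
    rca = []
    for p, row in enumerate(x):
        rca_row = []
        for i, count in enumerate(row):
            if row_sums[p] == 0 or col_sums[i] == 0:
                rca_row.append(0.0)             # empty row or column: no advantage
            else:
                rca_row.append((count / row_sums[p]) / (col_sums[i] / total))
        rca.append(rca_row)
    return rca


def binarize(rca: Matrix) -> List[List[int]]:
    """M[p][i] = 1 if RCA[p][i] >= 1, and 0 otherwise."""
    return [[1 if value >= 1.0 else 0 for value in row] for row in rca]


def fitness_complexity(
    m: List[List[int]], n_iter: int = 200, tol: float = 1e-12
) -> Tuple[List[float], List[float]]:
    """Iterate Eq. (6), dividing by the mean after every step.

    n_iter and tol are assumptions of this sketch, not values from the paper.
    """
    n_p, n_i = len(m), len(m[0])
    fitness = [1.0] * n_p
    complexity = [1.0] * n_i
    for _ in range(n_iter):
        raw_f = [sum(m[p][i] * complexity[i] for i in range(n_i)) for p in range(n_p)]
        raw_q = []
        for i in range(n_i):
            holders = [fitness[p] for p in range(n_p) if m[p][i]]
            if not holders or min(holders) <= 0.0:
                raw_q.append(0.0)               # no holder, or a zero-fitness holder
            else:
                raw_q.append(1.0 / sum(1.0 / f for f in holders))
        mean_f = sum(raw_f) / n_p
        mean_q = sum(raw_q) / n_i
        if mean_f == 0.0 or mean_q == 0.0:
            break                               # degenerate network: stop iterating
        new_f = [f / mean_f for f in raw_f]
        new_q = [q / mean_q for q in raw_q]
        change = max(abs(a - b) for a, b in zip(new_f, fitness))
        fitness, complexity = new_f, new_q
        if change < tol:
            break
    return fitness, complexity


# Toy firm counts (hypothetical): 3 provinces x 3 industries.
counts = [[2, 1, 1], [2, 1, 0], [4, 0, 0]]
m = binarize(rca_matrix(counts))
diversity = [sum(row) for row in m]        # Eq. (2)
ubiquity = [sum(col) for col in zip(*m)]   # Eq. (3)
fitness, complexity = fitness_complexity(m)
```

In this toy network the first province holds every industry that the second one holds plus one more, so its Fitness stays above that of the second province at every step. The iteration may approach its stationary state slowly, which is why the sketch also caps the number of steps.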

3. Results

In this section, we first report China's regional economic complexity (ECI) and its time evolution. Then, we show the heterogeneity of the levels of economic development in relation to the value of economic complexity. Next, we provide simple comparisons among different measures of economic diversity, especially ECI and Fitness, and show how they correlate with other monetary macroeconomic indicators. Finally, we check the robustness of the predictive power of economic complexity for economic growth by using multivariate regressions.

Figure 2: China's regional economic complexity and the evolution of provinces' rankings. (A) Map of China's regional Economic Complexity Index (ECI). The color denotes the value of ECI in 2015. (B) Time evolution of all provinces' rankings by ECI from 2000 to 2015. (C) Relationship between rankings by ECI in 2000 and 2015. The gray dashed line is the diagonal. The abbreviations of province names correspond to Table A1 in the Appendix.

The Economic Complexity Index (ECI) measures the regional economic structure by combining a province's diversity and an industry's ubiquity. To check the intuition behind ECI that sophisticated economies are diverse and have industries of low ubiquity, in Figure 1B we present the relationship between a province's diversity ($k_{p,0} = \sum_i M_{p,i}$) and the average ubiquity of the industries in which the province has the comparative advantage ($k_{p,1} = \sum_i (k_{i,0} M_{p,i}) / \sum_i M_{p,i}$). We find a strong and significant negative correlation between $k_{p,0}$ and $k_{p,1}$, with Pearson's correlation $r = -0.777$ ($p$-value $= 2.8 \times 10^{-7}$), supporting the hypothesis that diversified provinces tend to have less ubiquitous industries.

Figure 2A presents the values of China's regional ECI at the province level in 2015. We find that provinces located along the coast tend to have higher economic complexity, followed by provinces located in the Southwest and Northeast of China. Figure 2B shows the time evolution of the rankings of all provinces by ECI between 2000 and 2015. It can be seen that provinces with the highest and lowest rankings are more stable during that period, with Shanghai (SH), Guangdong (GD) and Beijing (BJ) ranked at the top, and Qinghai (QH), Inner Mongolia (NM) and Ningxia (NX) ranked at the bottom. For the middle rankings, the economies of some provinces, such as Shandong (SD) and Fujian (FJ), become more sophisticated, while those of others, such as Shaanxi (SN) and Chongqing (CQ), become less complex. Figure 2C compares all rankings by ECI in the starting year 2000 and the ending year 2015. Provinces with increased ECI rankings lie above the diagonal, while provinces with decreased rankings lie below it. We find that the ECI rankings in 2015 are highly and significantly correlated with those in 2000, with Pearson's correlation $r = 0.898$ ($p$-value $= 7.4 \times 10^{-12}$), indicating that the overall time evolution of ECI rankings is relatively stable and slow.

The Economic Complexity Index (ECI) is a non-monetary metric which is able to assess the level of development and competitiveness of provinces by measuring intangible assets of economic systems [18, 21, 39]. Naturally, we should compare this metric for province intangibles with a monetary metric such as GDP pc, which is traditionally used by economists in measuring the level of economic development. For static observations, Figure 3A and 3B show the locations of provinces in the ECI-ln(GDP pc) plane for 2015 and 2000, respectively. We find that economic complexity is a positive and significant indicator of economic development, as suggested by the high correlation between ECI and ln(GDP pc), with Pearson's correlation $r = 0.667$ ($p$-value $= 4.1 \times 10^{-5}$) for 2015 and $r = 0.554$ ($p$-value $= 1.2 \times 10^{-3}$) for 2000. Roughly speaking, provinces with larger economic complexity enjoy a higher level of economic development.

Figure 3: Relationship between economic development and economic complexity. (A) and (B) show the positions of provinces in the plane of Economic Complexity Index (ECI) versus the natural logarithm of GDP pc in 2015 and 2000, respectively. The gray dashed line is the linear fit. (C) Time evolution of the locations of provinces in the ECI-ln(GDP pc) plane from 2000 to 2015. During this period, the better-going and worse-going provinces changed their ECIs, on average, by 0.49 and −0.40, respectively. The abbreviations of province names correspond to Table A1 in the Appendix.

To further investigate how economic development depends on complexity, we move from the static pictures to the dynamics of provinces in the compound ECI-ln(GDP pc) plane from 2000 to 2015. As shown in Figure 3C, the dynamics of the provinces in this plane is, to some extent, heterogeneous but with two emergent trends. On the left and central sides, we observe a laminar regime, where ECI is linearly and positively correlated with ln(GDP pc), supporting that ECI is a driving force of economic growth [21]. Provinces located in this laminar regime enjoy slow but stable economic development. On the right side, we observe a chaotic regime, where the dynamics of provinces are less predictable due to the larger fluctuations of ECI. However, provinces located in this chaotic regime developed much faster and achieved a higher level of economic development. For example, Shaanxi (SN) had a much higher ECI than the other provinces with the same level of GDP pc (for example, QH, SC, JX, AH and YN) in 2000. In the following 15 years, the GDP pc of Shaanxi (SN) increased by a factor of 9.6, lifting its ranking by GDP pc from 23 to 14. By comparison, the GDP pc of the other provinces with the same initial level of GDP pc grew, on average, only by a factor of 7.3. These results suggest that, in the case of China, ECI is a good indicator of future economic development for provinces with a relatively low level of GDP pc, while its predictive power is reduced for provinces with a high level of GDP pc. Moreover, we notice that during the considered period the ECIs of some provinces increased remarkably, for example, Shanghai (SH) from 2.01 to 2.49 and Guangdong (GD) from 0.97 to 2.26, while the ECIs of other provinces decreased remarkably, for example, Tianjin (TJ) from 1.61 to 0.99 and Jiangxi (JX) from −0.27 to −0.81. On average, the ECIs of better-going provinces (with increasing ECI) changed by 0.49, and the ECIs of worse-going provinces (with decreasing ECI) changed by −0.40 from 2000 to 2015. The result suggests that economic complexity and regional development are heterogeneous within a country.
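The ECI values discussed above follow from Eqs. (4) and (5). The Python sketch below shows one way to compute them for a small binary matrix; it is a hedged illustration, not the authors' implementation. Several choices are assumptions of this sketch: the function name `eci`, the use of power iteration with the leading (constant) eigenvector projected out in place of a full eigen-solver, the sign convention (the eigenvector is flipped so that it correlates positively with diversity), and the toy matrix. The shortcut also assumes a connected network whose second largest eigenvalue is not repeated.

```python
import math
from typing import List


def eci(m: List[List[int]], n_iter: int = 500) -> List[float]:
    """Standardised second eigenvector of the matrix in Eq. (5)."""
    n_p, n_i = len(m), len(m[0])
    k_p = [sum(row) for row in m]                                  # diversity, Eq. (2)
    k_i = [sum(m[p][i] for p in range(n_p)) for i in range(n_i)]   # ubiquity, Eq. (3)
    if min(k_p) == 0:
        raise ValueError("every province needs at least one industry with RCA >= 1")

    # Eq. (5): provinces linked through shared industries.
    m_tilde = [
        [
            sum(m[p][i] * m[q][i] / k_i[i] for i in range(n_i) if k_i[i] > 0) / k_p[p]
            for q in range(n_p)
        ]
        for p in range(n_p)
    ]

    # Power iteration. The leading eigenvector is constant (rows of m_tilde
    # sum to one); its left counterpart is proportional to k_p, so removing
    # the k_p-weighted mean projects the leading component out.
    total_k = sum(k_p)
    v = [float(p) for p in range(n_p)]          # arbitrary non-constant start
    for _ in range(n_iter):
        shift = sum(k * value for k, value in zip(k_p, v)) / total_k
        v = [value - shift for value in v]
        v = [sum(m_tilde[p][q] * v[q] for q in range(n_p)) for p in range(n_p)]
        norm = math.sqrt(sum(value * value for value in v))
        if norm == 0.0:
            raise ValueError("second eigenvector could not be isolated")
        v = [value / norm for value in v]

    # Assumed sign convention: positive correlation with diversity.
    mean_v = sum(v) / n_p
    mean_k = sum(k_p) / n_p
    if sum((a - mean_v) * (b - mean_k) for a, b in zip(v, k_p)) < 0:
        v = [-value for value in v]
        mean_v = -mean_v

    # Eq. (4): subtract the mean and divide by the (population) standard deviation.
    std_v = math.sqrt(sum((value - mean_v) ** 2 for value in v) / n_p)
    if std_v == 0.0:
        raise ValueError("eigenvector is constant; ECI is undefined")
    return [(value - mean_v) / std_v for value in v]


# Toy nested network (hypothetical): each province holds a subset of the previous one.
toy_m = [[1, 1, 1], [1, 1, 0], [1, 0, 0]]
toy_eci = eci(toy_m)
```

For this nested toy matrix the second largest eigenvalue of Eq. (5) is 1/4 with eigenvector proportional to (2, −1, −4), so the standardised values are symmetric around zero and the most diversified province receives the highest ECI.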

Figure 4: Relationship between economic complexity and income inequality. (A) and (B) are respectively for Economic Complexity Index (ECI) versus relative income at urban area (RICU) and at rural area (RICR) in 2010. (C) Relationship between ECI and relative income differences (RICD). As an estimation of income inequality, RICD is defined by the ratio of RICU to RICR. The gray dashed line is the linear fit. The abbreviations of province names correspond to Table A1 in the Appendix.

Income inequality, along with economic development, has always been a central concern of economists and policy makers [54]. With the development of new perspectives and economic tools, progress has been made in explaining income inequality through new data and measures [14, 55, 56]. For example, Hartmann et al. [14] showed that economic complexity can be a significant and negative indicator of income inequality. Here we explore the relationship between economic complexity and income inequality at the regional level within China. First, Figure 4A and 4B show how ECI correlates with the relative income at urban area (RICU) and at rural area (RICR), respectively. We find a positive and significant correlation between ECI and the relative income in both urban areas (Pearson's correlation $r = 0.531$ with $p$-value $= 2.1 \times 10^{-3}$) and rural areas (Pearson's correlation $r = 0.589$ with $p$-value $= 4.8 \times 10^{-4}$). Then, we use the relative income differences (RICD), defined by the ratio of RICU to RICR, as an estimation of income inequality and show the relationship between ECI and RICD in Figure 4C. We find a negative and significant correlation between economic complexity and relative income differences (Pearson's correlation $r = -0.413$ with $p$-value $= 2.1 \times 10^{-2}$), which coincides with previous findings based on international trade data [14]. These results suggest that China's economic complexity still has negative explanatory power for regional income inequality, although China's great economic expansion has significantly raised regional disparities during the last few decades [57, 58]. Once again, the results suggest that the development of regions in China is not homogeneous (see GDP pc in Figure 3 and income inequality in Figure 4 for examples), even though the country as a whole may experience a remarkable increase in economic complexity and development. These observations should prompt further exploration of complexity and development at both national and regional levels.

Different Measures of Economic Diversity

Thanks to the development of complexity sciences, a variety of metrics have been proposed to measure the diversity of economies regarding their productive structures, including the Economic Complexity Index (ECI) [18], the Fitness Index [39], Diversity [18] and Entropy [59]. The ECI defines a country's complexity and a product's ubiquity through a set of linear iterative equations. The Fitness Index defines self-consistent metrics for a province's fitness and a product's complexity through a set of non-linear iterative equations that assess the advantage of diversification. Diversity is defined by Eq. (2), i.e., the number of industries in which a province has the comparative advantage.

The Shannon Entropy measures the diversity of the industries in which a province has the comparative advantage.

Figure 5: Comparison between Economic Complexity Index (ECI) and Fitness Index. (A) and (B) show the mappings of the rankings of provinces by ECI (left) into Fitness (right) in 2005 and 2015, respectively. (C) The Pearson's correlation coefficient between ECI and Fitness as a function of time. All the positive correlations are significant with $p$-value no more than $10^{-[\ldots]}$. The abbreviations of province names correspond to Table A1 in the Appendix.

First, we compare the ability of ECI and Fitness to rank the complexity of China's provinces. Figure 5A and 5B present how the rankings by ECI are mapped into the rankings by Fitness in 2005 and 2015, respectively. In general, we find that ECI and Fitness agree with each other for the top and bottom rankings, while the two methods differ for the middle rankings. For example, Hunan (HN) and Xinjiang (XJ) are respectively ranked 19 and 22 by ECI in 2015, while the corresponding rankings by Fitness are 8 and 9. There are also some provinces that are overestimated by ECI compared to Fitness in 2015, such as Tianjin (TJ: 5 → […]). […] Figure 5C shows the Pearson's correlation coefficient $r$ between ECI and Fitness as a function of time. We find a positive and significant correlation between the two rankings across all years, with $p$-value no more than $10^{-[\ldots]}$. The correlation has stabilized at about 0.871 since 2011, suggesting that the rankings by ECI and Fitness are, to some extent, consistent and stable. The result is notable since previous studies based on world trade data found inconsistency between the ECI and Fitness methods in ranking countries [20, 42]. Here, our empirical results based on firm data at the regional level suggest that the two methods are comparable.
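The rank comparison described above can be sketched in a few lines of Python. The example below is illustrative only: the helper names (`pearson`, `rank_by`), the placeholder labels "A" to "E" and all score values are made up for the sketch and are not data from the paper. Ties are not handled.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_by(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank 1 goes to the highest score (no tie handling in this sketch)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Made-up scores for five placeholder regions.
eci_scores = {"A": 2.0, "B": 1.0, "C": 0.0, "D": -1.0, "E": -2.0}
fitness_scores = {"A": 3.0, "B": 0.5, "C": 1.2, "D": 0.4, "E": 0.1}

eci_rank = rank_by(eci_scores)
fitness_rank = rank_by(fitness_scores)
names = sorted(eci_scores)

# Correlation between the two rankings, as in the comparison of ranking methods.
r = pearson([eci_rank[n] for n in names], [fitness_rank[n] for n in names])

# Regions whose position differs between the two methods: (ECI rank, Fitness rank).
shifts = {n: (eci_rank[n], fitness_rank[n]) for n in names if eci_rank[n] != fitness_rank[n]}
```

In this toy case the top and bottom positions coincide and only the two middle regions swap places, which mirrors the qualitative pattern reported for the middle rankings.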
Considering that there is no ground truth for rankings in terms of economic complexity and that the two methods have distinct intuitions and formulations, it is hard to identify the best method in practice, and the problem remains open. Indeed, the discrepancies among these four measures of economic diversity call for the development of new regional economic complexity metrics.

Next, we explore the correlations among different measures of economic diversity, economic development and income inequality. As shown in the first four columns of Figure 6, all four economic diversity metrics have positive and significant correlations with each other. Specifically, Fitness is highly correlated with ECI (see the second column of Figure 6), and Diversity is highly correlated with Entropy (see the fourth column of Figure 6). ECI and Fitness have higher explanatory power for GDP pc than Diversity and Entropy, as suggested by their larger correlation coefficients, 0.665 for ECI and 0.662 for Fitness (see the fifth column of Figure 6). Also, ECI and Fitness are better indicators of relative income than Diversity and Entropy (see the sixth and seventh columns of Figure 6). In addition, we find that the correlation coefficients in the RICR column are much larger than the corresponding values in the RICU column, suggesting that relative income in rural areas is better explained by economic diversity metrics than that in urban areas. For income inequality, we find that all the measures of economic diversity and economic development are negatively and significantly correlated with RICD (see the last column of Figure 6), meaning that the greater the economic diversity and the higher the level of economic development, the lower the income inequality.
In particular, RICR has the highest explanatory power for RICD, followed by GDP pc and RICU, suggesting that relative income in rural areas has the potential to be the best negative indicator of income inequality in provinces. The reason why economic complexity measures are less competitive than, for example, RICR in correlating with RICD is still puzzling and calls for further exploration. Moreover, we notice that ECI and Fitness are comparable with each other in explaining economic development and income inequality, as indicated by their very close correlation coefficients (see the first two rows of Figure 6).

Figure 6: Correlations between different economic diversity measures and economic development as well as income inequality. The economic diversity measures include Economic Complexity Index (ECI), Fitness Index, Diversity and Shannon Entropy. The economic development measures include GDP pc, relative income at urban area (RICU) and relative income at rural area (RICR). The income inequality is estimated by the relative income differences (RICD), defined by the ratio of RICU to RICR. All metrics are averaged over the period 2010-2015 to reduce noise, except for RICU, RICR and RICD, which are only for 2010. The matrix diagonal shows the histograms of each variable, the upper triangle shows the Pearson's correlation coefficients between the pairs of variables, and the lower triangle shows the corresponding scatter plots with solid lines representing linear fits. The significance levels of the correlation coefficients are * p < 0.1, ** p < 0.05, and *** p < […].

Using bivariate statistics in the above two sections, we have shown the correlations between economic diversity measures and the level of economic development. In this section, based on multivariate regressions, we further explore whether changes in a province's economic diversity are associated with changes in the level of economic development after controlling for the effects of other socioeconomic factors such as Population, Urbanization, Schooling, Innovation and Trade. If economic diversity is a good and robust indicator of economic development, we should observe a positive and significant correlation between the former non-monetary metrics (ECI and Fitness) and the latter monetary metric (GDP pc).

Table 1 summarizes the results of multivariate regressions using ordinary least squares (OLS) models with year-fixed effects for the period 2010-2015. The dependent variable is ln(GDP pc) and the independent variables of interest are ECI for columns (1)-(4) and Fitness for columns (5)-(8) of Table 1. We find that ECI is a positive and significant indicator of the level of economic development, and it alone explains 49.33% of the variance in ln(GDP pc) among provinces (see column (1) of Table 1).
Also, Urbanization shows a positive and significant relationship with ln(GDP pc). The factor ln(Population) has a positive correlation with ln(GDP pc), but the result is not significant (see column (2) of Table 1). These three factors can explain 55.84% of the variance in ln(GDP pc).

Economic research has revealed the importance of education, which raises people's knowledge, skills, productivity and creativity, as a crucial and fundamental factor in economic development [60, 61, 62]. Here, we find that both Schooling (the ratio of students in higher education in a province) and Innovation (the number of domestic granted patents) are positively and significantly correlated with ln(GDP pc), as shown in column (3) of Table 1. The explanatory power of ECI remains positive and significant after controlling for the effects of Schooling and Innovation. The three factors together explain up to 62.77% of the variance in ln(GDP pc). In column (4) of Table 1, we find a positive and significant correlation between ln(GDP pc) and Trade (the total value of imports and exports of foreign trade). ECI and ln(Trade) can explain 57.94% of the variance in ln(GDP pc).

Columns (5)-(8) of Table 1 present regression results using Fitness, where we find that Fitness alone has explanatory power close to that of ECI (see columns (1) and (5)). However, one should notice that Fitness and ECI are distinguishable from each other: their formulas have essential differences (see Eq. (4) and Eq. (6)), the distributions of the values that they produce are different (see the histograms in the upper-left of Figure 6), and the correlation between their values is 0.872 rather than 1 (see the scatter plot and correlation value in the upper-left of Figure 6). Indeed, after controlling for the effects of Population and Urbanization, the explanatory power of Fitness is slightly inferior to that of ECI, as indicated by the smaller value of Adjusted R² (see column (6) of Table 1). Conversely, Fitness becomes more powerful than ECI after controlling for the effects of Schooling, Innovation and Trade (see columns (7) and (8) of Table 1). Moreover, we notice that ln(Innovation) loses its explanatory power for ln(GDP pc) in the Fitness regression, as shown in column (7) of Table 1. In short, ECI and Fitness are comparable with each other, and both of them are robust in explaining regional economic development.

Table 1: Results of the multivariate regressions for the level of economic development.
OLS model with dependent variable: ln(GDP pc)

                  ECI                                             Fitness
                  (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
ECI / Fitness     0.2736***   0.1699***   0.1039***   0.1398***   0.2746***   0.1707***   0.1270***   0.1445***
                  (0.0237)    (0.0302)    (0.0304)    (0.0307)    (0.0235)    (0.0390)    (0.0297)    (0.0298)
ln(Population)                0.0148                                          0.0014
                              (0.0269)                                        (0.0296)
Urbanization                  0.7314***                                       0.6349***
                              (0.1386)                                        (0.1767)
Schooling                                 21.488***                                       22.165***
                                          (3.0816)                                        (2.8787)
ln(Innovation)                            0.0458*                                         0.0306
                                          (0.0185)                                        (0.0192)
ln(Trade)                                             0.1107***                                       0.1101***
                                                      (0.0180)                                        (0.0175)
Observations      186         186         186         186         186         186         186         186
Adjusted R²       0.4933      0.5584      0.6277      0.5794      …           …           …           …
RMSE              …           …           …           …           …           …           …           …

Notes: These multivariate regressions use ordinary least squares (OLS) models to regress the level of economic development (ln(GDP pc)) against the Economic Complexity Index (ECI) in columns (1)-(4) and the Fitness Index in columns (5)-(8). The regressions include year-fixed effects and use data for the 2010-2015 period. Regression coefficients are reported with standard errors in parentheses, with significance levels * p < 0.1, ** p < 0.05, and *** p < 0.01. Adjusted R² is the share of variance in the dependent variable explained by the model, adjusted for the number of regressors, and RMSE stands for the root mean square error.
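The year-fixed-effects specification used in Table 1 can be illustrated with a minimal sketch for the single-regressor case, as in column (1). It assumes synthetic, noise-free data; the names `within_ols_slope`, `year_effect` and `eci_values` are hypothetical and the numbers are not the paper's data. The sketch relies on a standard result (the Frisch-Waugh-Lovell theorem): demeaning both variables within each year and regressing the demeaned values gives the same slope as OLS with one dummy variable per year. Additional controls such as Urbanization or ln(Trade) would require a full multivariate solver and are not covered here.

```python
from collections import defaultdict


def within_ols_slope(years, x, y):
    """Slope of y on x with year-fixed effects, via within-year demeaning."""
    groups = defaultdict(list)
    for t, xi, yi in zip(years, x, y):
        groups[t].append((xi, yi))
    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        mx = sum(p[0] for p in obs) / len(obs)   # year mean of x
        my = sum(p[1] for p in obs) / len(obs)   # year mean of y
        sxy += sum((xi - mx) * (yi - my) for xi, yi in obs)
        sxx += sum((xi - mx) ** 2 for xi, _ in obs)
    if sxx == 0:
        raise ValueError("x has no within-year variation")
    return sxy / sxx


# Synthetic panel (assumption): ln(GDP pc) = year effect + 0.25 * ECI, no noise.
true_slope = 0.25
year_effect = {2010: 10.0, 2011: 10.2, 2012: 10.5}
eci_values = [-1.0, -0.3, 0.4, 1.1]

years, x, y = [], [], []
for t, alpha in year_effect.items():
    for e in eci_values:
        years.append(t)
        x.append(e)
        y.append(alpha + true_slope * e)

beta = within_ols_slope(years, x, y)
print(round(beta, 6))
```

Because the year effects are removed by the demeaning step, the estimated slope recovers `true_slope` regardless of how large the year-to-year shifts in ln(GDP pc) are.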

4. Conclusions and Discussion

In this paper, we studied China’s regional economic complexity based on 25 years of firm data covering 31 provinces and 70 industries. First, we mapped the firm data to a “Province-Industry” bipartite network, based on which we found that provinces with a high level of economic diversity tend to have comparative advantages in industries with a low level of ubiquity. Then, we quantified the competitiveness of provinces through the non-monetary Economic Complexity Index (ECI) by defining a set of linear iterative equations between provinces’ economic complexity and industries’ ubiquity. We found that provinces located along the coast have larger ECIs, and that the overall time evolution of provinces’ rankings by ECI is relatively stable and slow. Further, after linking ECI with economic development, as measured by GDP pc and the relative income in urban (RICU) and rural areas (RICR), and with the relative income differences (RICD), we found that ECI is positively and significantly correlated with the level of economic development while negatively correlated with income inequality, suggesting that ECI has the potential to be a good non-monetary indicator of the status of regional economic development.

Moreover, we compared different measures of non-monetary economic complexity and diversity (ECI, Fitness, Diversity and Entropy), and explored their relationships with some traditional monetary macroeconomic indicators (GDP pc, RICU, RICR and RICD). We found that both ECI and Fitness have higher and positive correlations with the level of economic development, compared to Diversity and Entropy. We also found that the relative income in rural areas (RICR) outperforms the relative income in urban areas (RICU) in correlating with the economic diversity measures.
Moreover, we showed that all the measures of economic diversity and economic development are negatively and significantly correlated with income inequality (RICD), suggesting that provinces with higher economic diversity and relative income have less income inequality. Finally, we checked the robustness of the explanatory power of ECI and Fitness for economic development using multivariate regressions that control for the effects of socioeconomic factors such as Population, Urbanization, Schooling, Innovation and Trade. Results suggest that both ECI and Fitness are robust in correlating with regional economic development. Even though a causal relation between economic complexity and development cannot be established yet, our work still contributes to the literature on the complexity of regional economic systems within a nation.

Nevertheless, our results are subject to limitations in data and modeling. The firm data contain a tiny fraction of all Chinese firms. In fact, some very successful and representative firms are not included in our analysis simply because they are not listed on the two major stock markets. Also, our data are limited in spatial resolution, because the provinces of China are first-level administrative divisions with heterogeneous land areas. Some provinces have a large area but a small population, like Inner Mongolia, while others are the opposite, like Jiangsu. Moreover, the “Province-Industry” network is built by counting how many firms in a province operate in an industry, without considering the revenues and sizes of firms. This may bias the network, to some extent, towards small firms, which are counted equally although they have less economic capacity and contribute less to regional economic development. In addition, the emergence of new industries is defined by whether they have a comparative advantage, which is less intuitive than a definition based on the absolute number of firms.
Furthermore, our current analysis is unable to establish a causal relation between economic complexity and development, which limits predictions in this context to correlations. Besides, due to the lack of officially reported panel data of Gini coefficients for Chinese provinces, the relative income difference between urban and rural areas in 2010 is used as an alternative estimate of income inequality [50], which limits the time evolution analysis and the comparison with other literature that uses Gini coefficients. These limitations call for improvements towards a better understanding of the status of regional economic development of China during its period of economic expansion.

How to better quantify economic complexity in both theoretical and empirical ways is still an open question that requires further investigation. For example, as pointed out by previous studies [20, 21, 39], the two main economic complexity indicators sometimes are not consistent with each other in ranking countries based on world trade data, and traditional regression analysis is not particularly meaningful for addressing the economic complexity problem. However, due to the lack of ground truth and the dependency on datasets in empirical studies, arguments on which indicator performs best and which branch of theory is most suitable to address this problem are unlikely to be settled soon, and they call for quantitative evaluation methods [42]. Nevertheless, in recent years this branch of economic complexity studies has found wide application in ranking countries, industries, institutions, occupations and products [14, 19, 21]; see for example the Observatory of Economic Complexity (OEC) (http://atlas.media.mit.edu) and the DataViva (http://[…]) […] different scales challenges the practicability of these methods, for example, the consistency or inconsistency of ECI and Fitness results at the national level and the regional level.
Keeping these potential limitations and promising real-world applications in mind, we leave the following as future work: seeking data covering more firms with higher spatial resolution, checking the robustness of the findings using alternative definitions of new industry presence, exploring new methods to evaluate the performance of economic complexity indicators, and proposing novel economic complexity metrics at different scales.

Indeed, the increasing complexity of economic systems and the data revolution of the past decade urge a paradigm change towards more complexity-oriented and data-oriented economic thinking [63, 64, 65]. For example, mainstream approaches measure economic development and predict economic growth using the aggregated GDP based on economic census, financial market, foreign investment, physical capital, and so on [66, 67, 68]. However, computing monetary factors, for instance GDP, is usually a non-trivial task because it requires considerable resources over a long period [3]. In recent years, with the availability of large-scale data [4, 6] and the development of complexity [19, 69] and network science [70, 71, 72], new conceptual frameworks have been developed to address these issues more efficiently and at far lower cost. For example, based on world trade data, the “Product Space” was proposed, which reveals the status of national economic development and explains why not all countries face the same opportunities in future development [17], and non-monetary economic complexity and fitness were introduced, which have the potential to predict future growth [18]. Moreover, online social networks [3, 16, 73], mobile phone data [1, 11, 74], satellite imagery [12, 75], geo-tagged images [13] and web queries [76, 77] have also been applied to reveal economic status, infer economic development, forecast unemployment, predict poverty, map inequality, quantify trading behavior [10] and correlate stock market moves [78].
Although the new way of economic thinking is not perfect [20, 21] and is somewhat limited by the availability of data and new statistical tools, it is highly likely to change the landscape of economic research in the near future [6].

Acknowledgments

The authors acknowledge the anonymous reviewers for critical comments and constructive suggestions. The authors thank Haixing Dai, Yiding Liu, Zhihai Rong, Qing Wang, and Dan Yang for helpful discussions. This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 61433014 and 61673086). Jian Gao acknowledges the China Scholarship Council for partial financial support and the Collective Learning group at the MIT Media Lab for hosting.

Appendix A.

Table A1: The two-letter abbreviations of province names in China.

ID  Abbreviation  Province        ID  Abbreviation  Province        ID  Abbreviation  Province
1   BJ            Beijing         12  AH            Anhui           23  SC            Sichuan
2   TJ            Tianjin         13  FJ            Fujian          24  GZ            Guizhou
3   HE            Hebei           14  JX            Jiangxi         25  YN            Yunnan
4   SX            Shanxi          15  SD            Shandong        26  XZ            Tibet
5   NM            Inner Mongolia  16  HA            Henan           27  SN            Shaanxi
6   LN            Liaoning        17  HB            Hubei           28  GS            Gansu
7   JL            Jilin           18  HN            Hunan           29  QH            Qinghai
8   HL            Heilongjiang    19  GD            Guangdong       30  NX            Ningxia
9   SH            Shanghai        20  GX            Guangxi         31  XJ            Xinjiang
10  JS            Jiangsu         21  HI            Hainan
11  ZJ            Zhejiang        22  CQ            Chongqing

References

[1] N. Eagle, M. Macy, R. Claxton, Network diversity and economic development, Science 328 (5981) (2010) 1029–1031.
[2] J. Gao, T. Zhou, Big data reveal the status of economic development, Journal of University of Electronic Science and Technology of China 45 (4) (2016) 625–633.
[3] J.-H. Liu, J. Wang, J. Shao, T. Zhou, Online social activity reflects economic status, Physica A 457 (2016) 581–589.
[4] L. Einav, J. Levin, The data revolution and economic analysis, Innovation Policy and the Economy 14 (1) (2014) 1–24.
[5] D. S. Hamermesh, Six decades of top economics publishing: Who and how?, Journal of Economic Literature 51 (1) (2013) 162–172.
[6] L. Einav, J. Levin, Economics in the age of big data, Science 346 (6210) (2014) 1243089.
[7] C. A. Hidalgo, Disconnected, fragmented, or united? A trans-disciplinary review of network science, Applied Network Science 1 (1) (2016) 6.
[8] C. A. Hidalgo, R. Hausmann, A network view of economic development, Developing Alternatives 12 (1) (2008) 5–10.
[9] J. Gao, B. Jun, A. Pentland, T. Zhou, C. A. Hidalgo, Collective learning in China’s regional economic development, arXiv:1703.01369.
[10] T. Preis, H. S. Moat, H. E. Stanley, Quantifying trading behavior in financial markets using Google Trends, Scientific Reports 3 (2013) 01684.
[11] J. Blumenstock, G. Cadamuro, R. On, Predicting poverty and wealth from mobile phone metadata, Science 350 (6264) (2015) 1073–1076.
[12] N. Jean, M. Burke, M. Xie, W. M. Davis, D. B. Lobell, S. Ermon, Combining satellite imagery and machine learning to predict poverty, Science 353 (6301) (2016) 790–794.
[13] P. Salesses, K. Schechtner, C. A. Hidalgo, The collaborative image of the city: Mapping the inequality of urban perception, PLoS ONE 8 (7) (2013) e68400.
[14] D. Hartmann, M. R. Guevara, C. Jara-Figueroa, M. Aristarán, C. A. Hidalgo, Linking economic complexity, institutions and income inequality, World Development 93 (2017) 75–93.
[15] A. Llorente, M. Garcia-Herranz, M. Cebrian, E. Moro, Social media fingerprints of unemployment, PLoS ONE 10 (5) (2015) e0128692.
[16] J. Yuan, Q.-M. Zhang, J. Gao, L. Zhang, X.-S. Wan, X.-J. Yu, T. Zhou, Promotion and resignation in employee networks, Physica A 444 (2016) 442–447.
[17] C. A. Hidalgo, B. Klinger, A.-L. Barabási, R. Hausmann, The product space conditions the development of nations, Science 317 (5837) (2007) 482–487.
[18] C. A. Hidalgo, R. Hausmann, The building blocks of economic complexity, Proceedings of the National Academy of Sciences, USA 106 (26) (2009) 10570–10575.
[19] R. Hausmann, C. A. Hidalgo, S. Bustos, M. Coscia, A. Simoes, M. A. Yildirim, The atlas of economic complexity: Mapping paths to prosperity, MIT Press, Cambridge, MA, USA, 2014.
[20] M. Cristelli, A. Gabrielli, A. Tacchella, G. Caldarelli, L. Pietronero, Measuring the intangibles: A metrics for the economic complexity of countries and products, PLoS ONE 8 (8) (2013) e70726.
[21] M. Cristelli, A. Tacchella, L. Pietronero, The heterogeneous dynamics of economic complexity, PLoS ONE 10 (2) (2015) e0117174.
[22] V. Plerou, P. Gopikrishnan, H. E. Stanley, Econophysics: Two-phase behaviour of financial markets, Nature 421 (6919) (2003) 130–130.
[23] T. Preis, S. Golke, W. Paul, J. J. Schneider, Multi-agent-based order book model of financial markets, EPL (Europhysics Letters) 75 (3) (2006) 510.
[24] H. E. Stanley, et al., Anomalous fluctuations in the dynamics of complex systems: from DNA and physiology to econophysics, Physica A 224 (1-2) (1996) 302–321.
[25] R. N. Mantegna, H. E. Stanley, Introduction to Econophysics: Correlations and Complexity in Finance, Cambridge University Press, New York, USA, 1999.
[26] J. P. Bouchaud, An introduction to statistical finance, Physica A 313 (1) (2002) 238–251.
[27] S. Battiston, M. Puliga, R. Kaushik, P. Tasca, G. Caldarelli, Debtrank: Too central to fail? Financial networks, the fed and systemic risk, Scientific Reports 2 (2012) 00541.
[28] T. Preis, Econophysics-complex correlations and trend switchings in financial time series, European Physical Journal-Special Topics 194 (1) (2011) 5–86.
[29] B. Jun, A. Alshamsi, J. Gao, C. A. Hidalgo, Relatedness, knowledge diffusion, and the evolution of bilateral trade, arXiv:1709.05392.
[30] J. Gao, Y.-W. Dong, M.-S. Shang, S.-M. Cai, T. Zhou, Group-based ranking method for online rating systems with spamming attacks, EPL (Europhysics Letters) 110 (2) (2015) 28003.
[31] J. Gao, T. Zhou, Evaluating user reputation in online rating systems via an iterative group-based ranking method, Physica A 473 (2017) 546–560.
[32] J. B. Schafer, J. Konstan, J. Riedl, Recommender systems in e-commerce, in: Proceedings of the 1st ACM Conference on Electronic Commerce, ACM, New York, NY, USA, pp. 158–166.
[33] L.-J. Chen, Z.-K. Zhang, J.-H. Liu, J. Gao, T. Zhou, A vertex similarity index for better personalized recommendation, Physica A 466 (2017) 607–615.
[34] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. Amaral, H. E. Stanley, Econophysics: Financial time series from a statistical physics point of view, Physica A 279 (1) (2000) 443–456.
[35] A. Chakraborti, I. M. Toke, M. Patriarca, F. Abergel, Econophysics review: I. Empirical facts, Quantitative Finance 11 (7) (2011) 991–1012.
[36] J.-P. Huang, Experimental econophysics: Complexity, self-organization, and emergent properties, Physics Reports 564 (2015) 1–55.
[37] B. K. Chakrabarti, A. Chakraborti, A. Chatterjee, Econophysics and Sociophysics: Trends and Perspectives, John Wiley & Sons, Malden, USA, 2007.
[38] S. Sinha, A. Chatterjee, A. Chakraborti, B. K. Chakrabarti, Econophysics: An Introduction, John Wiley & Sons, Malden, USA, 2010.
[39] A. Tacchella, M. Cristelli, G. Caldarelli, A. Gabrielli, L. Pietronero, A new metrics for countries’ fitness and products’ complexity, Scientific Reports 2 (2012) 723.
[40] G. Caldarelli, M. Cristelli, A. Gabrielli, L. Pietronero, A. Scala, A. Tacchella, A network analysis of countries’ export flows: Firm grounds for the building blocks of the economy, PLoS ONE 7 (10) (2012) e47278.
[41] R.-J. Wu, G.-Y. Shi, Y.-C. Zhang, M. S. Mariani, The mathematics of non-linear metrics for nested networks, Physica A 460 (2016) 254–269.
[42] M. S. Mariani, A. Vidmer, M. Medo, Y.-C. Zhang, Measuring economic complexity of countries and products: Which metric to use?, European Physical Journal B 88 (11) (2015) 293.
[43] R. Hausmann, C. A. Hidalgo, The network structure of economic output, Journal of Economic Growth 16 (4) (2011) 309–342.
[44] J. Felipe, U. Kumar, A. Abdon, M. Bacate, Product complexity and economic development, Structural Change and Economic Dynamics 23 (1) (2012) 36–68.
[45] A. Zaccaria, M. Cristelli, R. Kupers, A. Tacchella, L. Pietronero, A case study for a new metrics for economic complexity: The Netherlands, Journal of Economic Interaction and Coordination 11 (1) (2016) 151–169.
[46] S. Zhu, R. Li, Economic complexity, human capital and economic growth: Empirical research based on cross-country panel data, Applied Economics 49 (38) (2017) 3815–3828.
[47] V. Stojkoski, Z. Utkovski, L. Kocarev, The impact of services on economic complexity: Service sophistication as route for economic growth, PLoS ONE 11 (8) (2016) e0161633.
[48] Z. Song, K. Storesletten, F. Zilibotti, Growing like China, American Economic Review 101 (1) (2011) 196–233.
[49] D. S. Goodman, China’s regional development, Routledge, London, UK, 2013.
[50] Y. Xie, X. Zhou, Income inequality in today’s China, Proceedings of the National Academy of Sciences, USA 111 (19) (2014) 6928–6933.
[51] B. Balassa, Trade liberalisation and “revealed” comparative advantage, Manchester School 33 (2) (1965) 99–123.
[52] D. Bahar, R. Hausmann, C. A. Hidalgo, Neighbors and the evolution of the comparative advantage of nations: Evidence of international knowledge diffusion?, Journal of International Economics 92 (1) (2014) 111–123.
[53] Y.-B. Zhou, T. Lei, T. Zhou, A robust ranking algorithm to spamming, EPL (Europhysics Letters) 94 (4) (2011) 48002.
[54] S. Kuznets, Economic growth and income inequality, American Economic Review 45 (1) (1955) 1–28.
[55] K. Deininger, L. Squire, A new data set measuring income inequality, World Bank Economic Review 10 (3) (1996) 565–591.
[56] H. Li, L. Squire, H.-F. Zou, Explaining international and intertemporal variations in income inequality, Economic Journal 108 (446) (1998) 26–43.
[57] G. Wan, M. Lu, Z. Chen, Globalization and regional income inequality: Empirical evidence from within China, Review of Income and Wealth 53 (1) (2007) 35–59.
[58] S. Li, H. Sato, T. Sicular, Rising inequality in China: Challenges to a harmonious society, Cambridge University Press, New York, USA, 2013.
[59] C. E. Shannon, A mathematical theory of communication, Bell System Technical Journal 27 (3) (1948) 379–423.
[60] R. R. Nelson, E. S. Phelps, Investment in humans, technological diffusion, and economic growth, American Economic Review 56 (1/2) (1966) 69–75.
[61] G. Cainelli, R. Evangelista, M. Savona, Innovation and economic performance in services: A firm-level analysis, Cambridge Journal of Economics 30 (3) (2006) 435–458.
[62] D. H. Autor, Skills, education, and the rise of earnings inequality among the “other 99 percent”, Science 344 (6186) (2014) 843–851.
[63] R. Martin, P. Sunley, Complexity thinking and evolutionary economic geography, Journal of Economic Geography 7 (5) (2007) 573–601.
[64] S. N. Durlauf, Complexity and empirical economics, Economic Journal 115 (504) (2005) F225–F243.
[65] M. Cristelli, Complexity in financial markets: Modeling psychological behavior in agent-based models and order book models, Springer Science + Business Media LLC, New York, USA, 2013.
[66] R. E. Lucas, On the mechanics of economic development, Journal of Monetary Economics 22 (1) (1988) 3–42.
[67] N. G. Mankiw, D. Romer, D. N. Weil, A contribution to the empirics of economic growth, Quarterly Journal of Economics 107 (2) (1992) 407–437.
[68] R. Levine, S. Zervos, Stock markets, banks, and economic growth, American Economic Review 88 (3) (1998) 537–558.
[69] C. A. Hidalgo, Why information grows: The evolution of order, from atoms to economies, Basic Books, New York, USA, 2015.
[70] M. E. J. Newman, Networks: An Introduction, Oxford University Press, Oxford, UK, 2010.
[71] T. G. Lewis, Network Science: Theory and Applications, John Wiley & Sons, Malden, USA, 2011.
[72] A.-L. Barabási, Network Science, Cambridge University Press, New York, USA, 2016.
[73] J. Gao, L. Zhang, Q.-M. Zhang, T. Zhou, Big data human resources: Performance analysis and promotion/resignation in employee networks, in: Social Physics: Social Governance, Science Press, Beijing, China, 2014, Ch. 4, pp. 38–56.
[74] S. Šćepanović, I. Mishkovski, P. Hui, J. K. Nurminen, A. Ylä-Jääski, Mobile phone call data as a regional socio-economic proxy indicator, PLoS ONE 10 (4) (2015) e0124160.
[75] C. N. Doll, J.-P. Muller, J. G. Morley, Mapping regional economic activity from night-time light satellite imagery, Ecological Economics 57 (1) (2006) 75–92.
[76] T. Preis, D. Reith, H. E. Stanley, Complex dynamics of our economic life on different scales: Insights from search engine query data, Philosophical Transactions of the Royal Society of London A 368 (1933) (2010) 5707–5719.
[77] H. Choi, H. Varian, Predicting the present with Google Trends, Economic Record 88 (s1) (2012) 2–9.
[78] C. Curme, T. Preis, H. E. Stanley, H. S. Moat, Quantifying the semantics of search behavior before stock market moves, Proceedings of the National Academy of Sciences, USA 111 (32) (2014) 11600–11605.