Regional and Sectoral Structures and Their Dynamics of Chinese Economy: A Network Perspective from Multi-Regional Input-Output Tables

Abstract

A multi-regional input-output table (MRIOT) containing the transactions among the region-sectors in an economy defines a weighted and directed network. Using network analysis tools, we analyze the regional and sectoral structure of the Chinese economy and their temporal dynamics from 2007 to 2012 via the MRIOTs of China. Global analyses are done with network topology measures. Growth-driving province-sector clusters are identified with community detection methods. Influential province-sectors are ranked by weighted PageRank scores. The results revealed a few interesting and telling insights. The level of inter-province-sector activities increased with the rapid growth of the national economy, but not as fast as that of intra-province economic activities. Regional community structures were deeply associated with geographical factors. The community heterogeneity across the regions was high and the regional fragmentation increased during the study period. Quantified metrics assessing the relative importance of the province-sectors in the national economy echo the national and regional economic development policies to a certain extent.


Tao Wang, Shiying Xiao, Jun Yan, and Panpan Zhang∗

School of Statistics, Shanxi University of Finance and Economics, Taiyuan 030006, China
Department of Statistics, University of Connecticut, Storrs, CT 06269, USA
Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA

∗ Corresponding author. Email: [email protected]

February 25, 2021

Keywords: backbone structure, dynamical analysis, input-output table, network analysis, region-sector economy

Introduction

The rapid growth of the Chinese economy over the last three decades has drastically elevated its importance in the global economy. The annual growth rate of gross domestic product (GDP) during 1990–2010 was 10.4% (International Monetary Fund, 2020). The growth in China was a driving force for the recovery of the world from the financial crisis in 2008 (Lin, 2011). As the Chinese economy matured, the growth slowed down to 6.74% over 2015–2019, but it was still much higher than that of the world economy, 2.82%, during the same period (The World Bank, 2020c). In 2019, China contributed 16.34% to the global GDP, second only to the United States and almost triple the contribution of Japan, which ranked third (The World Bank, 2020b). China has become an integral part of the global economy. As a top trader, China accounted for 10.14% of the global imports (The World Bank, 2020d) and 10.61% of the global exports in 2019 (The World Bank, 2020a). Behind the growth, there have been dramatic structural changes such as urbanization and industrialization (Fan et al., 2003; Chen et al., 2011). The regional and sectoral structures of the Chinese economy are heavily affected by internal government policies such as the Great Western Development Strategy (Jia et al., 2020) and by external factors such as the World Trade Organization accession in 2001 (Chow, 2003) and the 2008 global financial crisis (Yuan et al., 2010).

Given its size and impact, the Chinese economy is central to important regional and sectoral structure issues in economic development theory and practice. The disparities in sectoral structure and economic growth at the province level in China are high and have been increasing (e.g., Fan et al., 2011; Li and Haynes, 2011; Lee et al., 2012). Liberalized and globalized industries are mostly aggregated in the coastal regions while low technology, resource-based, and protected industries are widely dispersed in the inland regions (e.g., He and Wang, 2012).
Emerging industries are more likely to enter the regions that are globalized, economically liberalized and fiscally healthy (He et al., 2018). On one hand, the distribution of value-added across regions has been flattened due to the expansion of interregional trade (Meng et al., 2017). On the other hand, growing inter-regional competition and local protection have jointly led to various inter-regional trade barriers and severe regional fragmentation (Young, 2000; Poncet, 2005). Understanding the regional and sectoral structures of the Chinese economy is critical for economic development and resource efficiency not only in China but also in the entire world.

Multi-regional input-output tables (MRIOTs) are the most prevalent tool for studying the inter-dependencies among the sectors from different regions in an economy. An MRIOT records the transactions among the sectors within multiple regions (e.g., Moses, 1955; Leontief and Strout, 1963). A few MRIOTs at the global level have been available with bilateral trade information for a large number of countries annually for several decades (Tukker and Dietzenbacher, 2013). MRIOTs at the province level within China, however, are not readily available. Based on the survey-based input-output tables at the province level released by the government, Chinese MRIOTs have been compiled for 2007 and 2012 (Liu et al., 2012, 2018). The tables and their extensions have been used in the analyses of the impact of government infrastructural plans (Ji et al., 2019), provincial and sector-level material footprints (Liang et al., 2017; Jiang et al., 2019), and cross-country sectoral price comparison (Fujikawa and Milana, 2002), among others.

An MRIOT inherently defines a weighted directed network which facilitates analyses with statistical methods for networks. The world input-output tables (WIOTs) (Timmer et al., 2015) are MRIOTs at the global level.
Traditional tools for input-output table analysis include multipliers, linkages, and structural paths (Defourny and Thorbecke, 1984; Feser and Bergman, 2000), which measure the impact from each region-sector in the table. In contrast, network analysis tools enable not only measures at the region-sector level such as centrality, but also the natural investigation of local clustering, community detection, and backbone extraction as well as global network features such as assortativity and clustering coefficient (Leonidov and Serebryannikova, 2019; Xu and Liang, 2019). With a sequence of input-output tables over different years, the dynamic changes of network features can be investigated, which are of great value in structural and regional analyses (Cerina et al., 2015; del Río-Chanona et al., 2017; Amador and Cabral, 2017). For the Chinese MRIOTs, in part due to their limited availability, no network analysis has been done to study the regional and sectoral structure of the Chinese economy.

Our contributions are two-fold. First, to the best of our knowledge, this is the first comprehensive network analysis of the MRIOTs of China to study the regional and sectoral structure of the Chinese economy. Through the analysis, we observed a clear pattern of increased regional fragmentation from 2007 to 2012. Some of the research outcomes have not been reported in the literature, and may not be straightforwardly uncovered via traditional input-output analysis tools. Our second contribution is the application of several novel network measures specifically developed for weighted and directed networks to analyze the Chinese MRIOTs. The MRIOTs lead to networks that are both weighted and directed. If weight and/or direction are disregarded, as in some existing analyses, the features of the networks such as assortativity and centrality are not precisely summarized.
The new network measures help correct the misleading results and inaccurate inference derived from the classical unweighted versions.

The rest of the manuscript is organized as follows. In Section 2, we give a brief introduction to the compiled MRIOTs in China, and describe the network analysis setup. The specific network analysis methods are presented in Section 3, followed by the applications to the MRIOTs in Section 4. Finally, we give some concluding remarks and follow-up discussions in Section 5.

The MRIOTs of China are available for the years 2007 (Liu et al., 2012), 2010 (Liu et al., 2014), and 2012 (Liu et al., 2018). The databases were jointly developed by the Institute of Geographic Sciences and Natural Resources Research of the Chinese Academy of Sciences, and the National Bureau of Statistics of China. The entries of inter-province-sector economic transactions were obtained by applying the gravity model (Bergstrand, 1985; Sargento, 2007) to the input-output tables reported by all the participating provinces. Table 1 shows the fundamental structure of an MRIOT. We focus on the 2007 and 2012 tables in the present study because the compilation of the 2010 table was based on the 2007 table in addition to the input-output tables of the 17 provinces rather than direct data collection and investigation (Mi et al., 2018), which might cause measurement errors and bias. An MRIOT consists of four parts: (I) intermediate flow matrix, (II) final use, (III) imports, and (IV) value added. The intermediate flow matrix records the economic exchanges among the sectors from different provinces, reflecting their intricate economic relations (e.g., supply and demand) as well as their interdependence and mutual constraints.

Table 1: Fundamental structure of an MRIOT.

                                 | Intermediate use                           | Final use | Total output
                                 | region 01 ... region 30                    |           |
                                 | (sector 01 ... sector 30 in each region)   |           |
Intermediate input, region 01    |                                            |           |
  (sector 01 ... sector 30)      |                                            |           |
...                              |                     I                      |    II     |
Intermediate input, region 30    |                                            |           |
  (sector 01 ... sector 30)      |                                            |           |
Imports                          |                    III                     |           |
Value added                      |                    IV                      |           |
Total input                      |                                            |           |

The data in the MRIOTs were pre-processed to prepare for the analyses. The 2007 table covered 30 provincial units with each containing 30 sectors. The 2012 table, however, covered 31 provincial units due to the debut of Tibet and 42 sectors that were further divided from the 30 sectors in the 2007 table.
For the purpose of comparison over time, we only included the 30 provinces that appeared in both tables and aggregated the 42 sectors in 2012 to the 30 sectors in 2007. Table 2 lists the codes with detailed descriptions for the 30 sectors. The monetary units for both tables were set to be 10,000 Chinese Yuan (CNY). To adjust for inflation, we converted the entries in the 2012 table to 2007 CNY using the GDP price deflator (The World Bank, 2020e).

Table 2: Description of the sectors in the MRIOTs.

Code  Sector
01    Agriculture, forestry, animal husbandry and fishery
02    Coal mining and processing
03    Petroleum and gas extracting
04    Metals mining/processing
05    Nonmetal mining/processing
06    Food processing and tobaccos
07    Textiles
08    Clothing, leather, fur, etc.
09    Wood processing and furnishing
10    Paper making, printing, stationery, etc.
11    Petroleum refining, coking, etc.
12    Chemical industry
13    Nonmetal products
14    Metallurgy
15    Metal products
16    General and specialist machinery
17    Transport equipment
18    Electrical equipment
19    Electronic equipment
20    Instrument and meter
21    Other manufacturing
22    Electricity and heat production and supply
23    Gas and water production and supply
24    Construction
25    Transport and storage
26    Wholesale and retail
27    Hotel and restaurant
28    Leasing and commercial services
29    Scientific research
30    Other services

We constructed the multi-regional input-output networks (MRIONs) based on the MRIOTs. In an MRION, each vertex represents a sector within a province; each directed edge represents the existence of a transaction from the source province-sector to the target province-sector, with the weight giving the transaction volume in units of 10,000 CNY. Therefore, the MRIONs are weighted and directed. The number of vertices in each MRION is 900. The link densities are 0.7685 in 2007 and 0.9212 in 2012, suggesting that the vertices in the MRIONs are densely connected.

[Figure 1 panels: inter-sectoral flow in 2007; inter-sectoral flow in 2012; inter-provincial flow in 2007; inter-provincial flow in 2012]

Figure 1: Intermediate flows (across multiple regions) in the MRIONs from 2007 and 2012. The top two panels are for sectors (aggregated with respect to provinces), where the codes refer to Table 2. The bottom two panels are for provinces (aggregated with respect to sectors), where the codes refer to Appendix A. The bandwidth of a chord (connecting the arcs) is proportional to the size of the economic flow. Long arcs indicate large outputs. The unit of economic flow is 100 billion CNY.

The top two panels of Figure 1 show chord visualizations of the MRIONs aggregated according to sectors for 2007 and 2012 with self-loops removed. Each of the outer arcs with a distinct color represents a sector, with arc length representing the sum of the inflows and outflows. A chord from one arc to another represents the transaction from the corresponding sector to the other.
Its width is proportional to the volume of the transaction, while its color remains the same as the color of the source sector. For both years, the main suppliers are “metallurgy” (14), “chemical industry” (12), “other services” (30), and “agriculture, forestry, animal husbandry and fishery” (01). They supply a large portion of the intermediate products or services that are needed by other sectors. The most notable receivers are “construction” (24) and “other services” (30). These two sectors may have strong pulling effects on the whole economy. In “construction” (24) especially, the proportion of inflows in its inter-sectoral transactions exceeds 90%. One notable change from 2007 to 2012 is the share of the transactions associated with the sector “scientific research” (29), which quadrupled from 0.24% to 0.97%.

In addition, we provide the chord visualizations of the MRIONs aggregated by provinces (with self-loops removed as well) for 2007 and 2012, shown in the bottom two panels of Figure 1. The arcs and chords are defined analogously to those in the top panels. The main suppliers in 2007 were Hebei (03), Guangdong (19) and Jiangsu (10), but Hebei (03) was replaced by Shandong (15) in 2012, indicating that the production capacity of Shandong (15) strengthened from 2007 to 2012. For both years, the most notable receivers were Jiangsu (10), Zhejiang (11) and Guangdong (19). These three provinces have promoted a large number of inter-provincial trade exchanges with the others across the nation. In fact, the majority of the provinces with high inter-provincial trade volumes were from the coastal region. In spite of the substantial drops in the proportions of inter-provincial trade in Guangdong (19) and Zhejiang (11) from 2007 to 2012, they were still among the top 5 in the nation and remained the driving forces contributing to the multi-regional economy in China. More quantitative assessment calls for detailed network analyses.
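The network construction described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names `build_mrion` and `aggregate_by_region`, the toy 2-region, 2-sector flow matrix, and the region-major vertex ordering are assumptions made for the example, and the density denominator (all n² ordered pairs, self-loops included) is also an assumption because the text does not state which convention produced the reported densities.

```python
import numpy as np

def build_mrion(flow, n_regions, n_sectors):
    """Turn an intermediate flow matrix into a weighted, directed network.

    Vertices are region-sectors in region-major order (hypothetical layout):
    index = region * n_sectors + sector.
    """
    W = np.asarray(flow, dtype=float)
    n = n_regions * n_sectors
    if W.shape != (n, n):
        raise ValueError("flow matrix must be (regions*sectors) square")
    A = (W > 0).astype(int)          # an edge exists wherever a transaction is recorded
    # Assumption: density is taken over all n*n ordered pairs (self-loops allowed).
    density = A.sum() / (n * n)
    return W, A, density

def aggregate_by_region(W, n_regions, n_sectors):
    """Sum flows over sectors to get region-to-region flows, self-loops removed."""
    blocks = W.reshape(n_regions, n_sectors, n_regions, n_sectors)
    R = blocks.sum(axis=(1, 3))
    np.fill_diagonal(R, 0.0)         # drop intra-region flows, as in the chord diagrams
    return R

# Toy example: 2 regions x 2 sectors (made-up numbers, not from the MRIOTs).
toy_flow = [[5, 1, 0, 2],
            [0, 3, 1, 0],
            [4, 0, 6, 1],
            [0, 2, 0, 0]]
W, A, density = build_mrion(toy_flow, n_regions=2, n_sectors=2)
inter_regional = aggregate_by_region(W, n_regions=2, n_sectors=2)
```

Aggregating by sector instead of by region would sum over axes `(0, 2)` of the same four-dimensional view.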

Our approach to investigating the MRIOTs of China is network-based analytics. Let $G(V, E)$ denote a directed network that consists of a set of vertices $V$ and a set of edges $E$. By convention, each vertex $i \in V$ represents a data point. Given a pair of vertices $i, j \in V$, if there is a directed edge from $i$ to $j$, then we have $e_{ij} \in E$. Conversely, the existence of an edge $e_{ij} \in E$ suggests a (directed) link from $i$ to $j$. One of the most popular ways of representing a network structure is the adjacency matrix. For a network with $n = |V|$ vertices, its adjacency matrix is denoted by $\boldsymbol{A} := (a_{ij})_{n \times n}$ with $a_{ij} = 1$ if $e_{ij} \in E$ and $a_{ij} = 0$ otherwise. For a weighted and directed network, the weighted counterpart is denoted by $\boldsymbol{W} := (w_{ij})_{n \times n}$, where $w_{ij}$ represents the weight of edge $e_{ij}$. The weighted adjacency matrix $\boldsymbol{W}$ is equivalent to $\boldsymbol{A}$ if $w_{ij} = 1$ for all $e_{ij} \in E$.

In a directed network, the degree of a vertex $i$ (denoted by $d_i$) is comprised of in-degree (denoted by $d_i^{(\mathrm{in})}$) and out-degree (denoted by $d_i^{(\mathrm{out})}$), which are, respectively, the number of edges pointing into and emanating out of vertex $i$. To account for edge weight, we define the in-strength and out-strength of vertex $i$ as $s_i^{(\mathrm{in})} := \sum_{j \in V} w_{ji}$ and $s_i^{(\mathrm{out})} := \sum_{j \in V} w_{ij}$, respectively. The strength of vertex $i$, $s_i$, is the sum of its in-strength and out-strength. In traditional network analyses, degrees and strengths are used to show the importance of the vertices in a network (Newman, 2010).

The degree distribution is the probability distribution $\pi(\cdot)$ of the vertex degrees over the entire network; that is, $\pi(k)$ is the probability of a vertex having degree $k \in \{0, 1, 2, \ldots\}$. The degree distribution plays an important role in theoretical and applied network analyses. In a completely random network (Erdős and Rényi, 1959), the degree distribution is Poisson, whereas the tail of the degree distribution of a scale-free network (Barabási and Albert, 1999) follows a power law.
Pennock et al. (2002) pointed out that most real networks fall between these two extreme classes. There is evidence that the degree distributions of economic networks are likely to exhibit power-law patterns (Gabaix, 1999; Kaplow, 2008). The goodness-of-fit of power-law tails can be tested based on the Kolmogorov–Smirnov statistic (Clauset et al., 2009) with p-values obtained from bootstrapping.

In a weighted network, the strength distribution, which is based on the vertex strengths, usually captures the network structure better than the degree distribution. While the degree distribution is always discrete, the strength distribution can be either discrete or continuous, depending on the characteristic of the weights. As the MRIONs are weighted and directed, we conducted analogous analyses on the strength, in-strength and out-strength distributions and made comparisons with their degree counterparts.

Assortativity (or assortative mixing) refers to the tendency that the vertices in a network are connected according to a pair of (vertex-specific) features (Newman, 2002). It is a measure of homophily among the vertices based on two given features. A commonly used assortativity measure is the degree-degree correlation (Newman, 2002; van der Hofstad and Litvak, 2014), which is analogous to the Pearson correlation coefficient. Its value is between $-1$ and $1$. Let $\alpha, \beta \in \{\mathrm{in}, \mathrm{out}\}$ index the type of strength.
The assortativity based on $\alpha$- towards $\beta$-type strength is

$$
\rho_{\alpha,\beta}(G) = \frac{\sum_{i,j \in V} w_{ij} \left[ \left( s_i^{(\alpha)} - \bar{s}_{\mathrm{sou}}^{(\alpha)} \right) \left( s_j^{(\beta)} - \bar{s}_{\mathrm{tar}}^{(\beta)} \right) \right]}{W \, \sigma_{\mathrm{sou}}^{(\alpha)} \, \sigma_{\mathrm{tar}}^{(\beta)}}, \tag{1}
$$

where $W := \sum_{i,j \in V} w_{ij}$ is the total weight, $s_i^{(\alpha)}$ is the $\alpha$-type strength of source vertex $i$, $s_j^{(\beta)}$ is the $\beta$-type strength of target vertex $j$,

$$
\bar{s}_{\mathrm{sou}}^{(\alpha)} = \frac{\sum_{i,j \in V} w_{ij} \, s_i^{(\alpha)}}{W} \quad \text{and} \quad \bar{s}_{\mathrm{tar}}^{(\beta)} = \frac{\sum_{i,j \in V} w_{ij} \, s_j^{(\beta)}}{W}
$$

are the weighted means of the $\alpha$-type strength of the source vertices and the $\beta$-type strength of the target vertices, and

$$
\sigma_{\mathrm{sou}}^{(\alpha)} = \sqrt{\frac{\sum_{i,k \in V} w_{ik} \left( s_i^{(\alpha)} - \bar{s}_{\mathrm{sou}}^{(\alpha)} \right)^2}{W}} \quad \text{and} \quad \sigma_{\mathrm{tar}}^{(\beta)} = \sqrt{\frac{\sum_{k,j \in V} w_{kj} \left( s_j^{(\beta)} - \bar{s}_{\mathrm{tar}}^{(\beta)} \right)^2}{W}}
$$

are the associated weighted standard deviations. A positive (negative) $\rho_{\alpha,\beta}(G)$ suggests assortative mixing (disassortative mixing), and zero assortativity indicates no obvious pattern of assortative or disassortative mixing.

For a network like an MRION, the weighted adjacency matrix can be decomposed into one comprised of diagonal blocks only and another comprised of off-diagonal blocks. The former contains the information of economic transactions within each province (called intra-province), while the latter records the exchanges across multiple provinces (called inter-province). The proposed assortativity measure can be applied to the decomposed adjacency matrices to investigate the correlation structures at the intra- and inter-province levels.

Clustering coefficient is a measure quantifying the tendency that the vertices in a network are clustered together, usually characterized by a high density of connections among them (Opsahl and Panzarasa, 2009). Clustering coefficient is also known as transitivity coefficient in the literature (Newman et al., 2002).
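Returning to the assortativity coefficient in Equation (1), the following sketch shows one way to evaluate it, together with the intra-/inter-province decomposition of the weighted adjacency matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names `strength`, `weighted_assortativity` and `split_intra_inter` are hypothetical, vertices are assumed to be ordered region by region, and the degenerate case of a zero standard deviation is left unhandled (it yields `nan`).

```python
import numpy as np

def strength(W, kind):
    """Out-strength = row sums, in-strength = column sums of W."""
    W = np.asarray(W, dtype=float)
    return W.sum(axis=1) if kind == "out" else W.sum(axis=0)

def weighted_assortativity(W, alpha="out", beta="in"):
    """Weighted, directed assortativity in the form of Equation (1)."""
    W = np.asarray(W, dtype=float)
    total = W.sum()                      # scalar W in Equation (1)
    s_src = strength(W, alpha)           # alpha-type strength, indexed by source i
    s_tar = strength(W, beta)            # beta-type strength, indexed by target j
    w_out = W.sum(axis=1)                # weight carried by each vertex as a source
    w_in = W.sum(axis=0)                 # weight carried by each vertex as a target
    mean_src = (w_out * s_src).sum() / total
    mean_tar = (w_in * s_tar).sum() / total
    sd_src = np.sqrt((w_out * (s_src - mean_src) ** 2).sum() / total)
    sd_tar = np.sqrt((w_in * (s_tar - mean_tar) ** 2).sum() / total)
    cov = ((s_src - mean_src)[:, None] * W * (s_tar - mean_tar)[None, :]).sum() / total
    return cov / (sd_src * sd_tar)       # nan if either standard deviation is zero

def split_intra_inter(W, n_regions, n_sectors):
    """Split W into diagonal-block (intra) and off-diagonal-block (inter) parts.

    Assumes region-major vertex ordering (a hypothetical layout for this sketch).
    """
    W = np.asarray(W, dtype=float)
    mask = np.kron(np.eye(n_regions), np.ones((n_sectors, n_sectors))).astype(bool)
    return np.where(mask, W, 0.0), np.where(mask, 0.0, W)
```

Applying `weighted_assortativity` to each matrix returned by `split_intra_inter` would give the intra- and inter-province versions described in the text. Whether strengths should then be recomputed from the decomposed matrix or taken from the full network is a modelling choice that this sketch does not settle; as written, they are recomputed from whatever matrix is passed in.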
The classical clustering coefficient was proposed for undirected and unweighted networks (Watts and Strogatz, 1998), and was later extended to weighted and directed networks (Grindrod, 2002; Barrat et al., 2004; Onnela et al., 2005; Zhang and Horvath, 2005; Fagiolo, 2007; Clemente and Grassi, 2018).

In the present study, we adopted the weighted and directed clustering coefficients developed by Clemente and Grassi (2018). The local clustering coefficient of vertex $i$ in an unweighted and undirected network $G(V, E)$ is the ratio of the number of links connecting the neighbors of $i$ (i.e., $\{j \in V : e_{ij} \in E\}$) to the maximum possible value. When edges have weights, the weighted adjacency matrix $\boldsymbol{W}$ plays an important role. Self-loops are removed prior to the computation since they do not practically contribute to the network clustering property. The clustering coefficient can be concisely expressed via matrix notation:

$$
C_i^{\mathrm{tot}} = \frac{\left[ \left( \boldsymbol{W} + \boldsymbol{W}^{\top} \right) \left( \boldsymbol{A} + \boldsymbol{A}^{\top} \right)^2 \right]_{ii}}{2 \left[ s_i (d_i - 1) - \left( \boldsymbol{A}\boldsymbol{W} + \boldsymbol{W}\boldsymbol{A} \right)_{ii} \right]}, \tag{2}
$$

where $\boldsymbol{A}^{\top}$ is the transpose of $\boldsymbol{A}$, and $\boldsymbol{B}_{ii}$ is the $(i, i)$th element of matrix $\boldsymbol{B}$.

The superscript distinguishes it from the four kinds of distinct local clustering coefficients induced from four types of directed triangles (Fagiolo, 2007; Clemente and Grassi, 2018), namely the in-, out-, mid- and cyc-clustering coefficients. When computing a specific local clustering coefficient, the denominator needs to be updated to the number of corresponding triplets. For instance, the local in-clustering coefficient of $i$ is the number of triangles such that the neighbors (say, $j$ and $k$) both link towards $i$ alongside an edge (in either direction) connecting $j$ and $k$, out of the number of triplets with both $j$ and $k$ generating directed edges to $i$ (disregarding whether or not $j$ and $k$ are connected). All of the other local clustering coefficients are defined in an analogous manner.
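Equation (2) can be evaluated directly with matrix products. The sketch below is a minimal illustration of the total local clustering coefficient, assuming the form of Equation (2) as written above; the function name `total_clustering` is hypothetical, and setting the coefficient to zero when the denominator vanishes (for example, for a vertex with a single neighbor) is an assumption of this sketch rather than something stated in the text.

```python
import numpy as np

def total_clustering(W):
    """Total local clustering coefficient per vertex, following Equation (2)."""
    W = np.array(W, dtype=float)           # copy, so the caller's matrix is untouched
    np.fill_diagonal(W, 0.0)               # self-loops are removed before computing
    A = (W > 0).astype(float)

    S = W + W.T                            # symmetrized weights
    D = A + A.T                            # symmetrized adjacency
    numerator = np.diag(S @ D @ D)

    s = W.sum(axis=0) + W.sum(axis=1)      # total strength s_i
    d = A.sum(axis=0) + A.sum(axis=1)      # total degree d_i
    bilateral = np.diag(A @ W + W @ A)     # contribution of reciprocated edges
    denominator = 2.0 * (s * (d - 1.0) - bilateral)

    C = np.zeros_like(s)
    ok = denominator > 0                   # assumption: coefficient is 0 otherwise
    C[ok] = numerator[ok] / denominator[ok]
    return C
```

As a sanity check, a fully reciprocated triangle with unit weights gives a coefficient of 1 at every vertex, while a directed 3-cycle gives 0.5, matching the unweighted directed coefficient of Fagiolo (2007) in those two cases. The global coefficient is the average of the returned vector.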
The local out-clustering coefficient of $i$ is the proportion of triangles which have two edges from $i$ pointing to $j$ and $k$ and an edge linking $j$ and $k$ in either direction. The local mid-clustering coefficient of $i$ considers the proportion of the triangles in which $i$ is a middleman: neighbor $j$ (or $k$) either has a direct link to neighbor $k$ (or $j$) or forms a directed path $j \to i \to k$ (or $k \to i \to j$). The local cyc-clustering coefficient of $i$ only counts the triangles of which the directed edges form a cycle. See Figure 8 in Appendix B for graphic illustrations and the formulae therein for practical computation. Accordingly, there are five kinds of global clustering coefficients at the network level, obtained by averaging the associated local clustering coefficients over all the vertices.

Similar to assortativity, any kind of clustering coefficient introduced in this section can be applied to the decomposed adjacency matrices of an MRION to uncover the clustering properties at the intra- and inter-provincial levels.

Community detection aims to group the entities with similar characteristics in a network into the same community. The entities in the same community are densely linked, while those from different communities are loosely linked. There are two major classes of community detection methods, model-based (Snijders and Nowicki, 1997; Handcock et al., 2007) and metric-based (Girvan and Newman, 2002; Ouyang et al., 2020). In this study, we used a metric-based method for community detection in MRIONs. Interested readers are referred to Goldenberg et al. (2010) for a comprehensive survey of community detection techniques.

Specifically, we exploited the modularity maximization algorithm proposed by Newman (2006). An objective function called modularity is defined to measure the quality of clustering strategies, and is then maximized.
The underlying principle of modularity maximization is that the number of links among the vertices within a community is significantly more than expected at random (based on the Erdős–Rényi model, which is generally used as the null model), while the counterpart across different communities is significantly less. Newman’s algorithm is built upon recursive bi-partitioning. For a weighted and undirected network $G(V, E)$, its modularity matrix $\boldsymbol{B} := (b_{ij})_{n \times n}$ is defined as

$$
b_{ij} := w_{ij} - \frac{s_i s_j}{W},
$$

where $W$ is identical to that defined in Section 3.2. The term $s_i s_j / W$ is interpreted as the expected weight of the edges connecting $i$ and $j$ if all the edges are randomly placed among the vertices in the network. Let $\boldsymbol{c} := (c_i)_{i=1}^{n}$ denote a clustering strategy. A bi-partitioning algorithm admits two clusters, so the $c_i$’s take value $1$ or $-1$. Given $\boldsymbol{c}$, we define a modularity score as

$$
Q = \frac{1}{W} \sum_{i,j} b_{ij} \, I(c_i = c_j),
$$

where $I(\cdot)$ is the indicator function.

The expression of $Q$ can be regarded as a reward-penalty system. Given $c_i = c_j$, the value of $Q$ increases if $b_{ij} > 0$, but decreases if $b_{ij} < 0$. Besides, the larger $b_{ij}$ is (given $c_i = c_j$), the more reward is granted. Subsequent bi-partitioning continues within each resulting community until no further partition of any existing community leads to an increase in the modularity score.

For large networks, parsimonious algorithms (Clauset et al., 2004; Ng et al., 2001; Ouyang et al., 2020) are needed to solve the optimization problem. We adopted the greedy algorithm developed by Clauset et al. (2004).

The centrality of each vertex measures its relative importance in a network. Vertices with high centrality scores altogether form the main frame of the network. There are various ways of defining centrality depending on practical needs and interpretations, such as degree centrality (Barrat et al., 2004), closeness and betweenness (Newman, 2001), and eigenvector centrality (Bonacich, 1987). We considered a measure extended from eigenvector centrality, namely PageRank (PR, Brin and Page, 1998), which was originally used by Google for ranking websites.

We propose an extension of classical PR for weighted and directed networks. This extension is different from the existing ones that are specifically designed for analyzing citation networks (Xing and Ghorbani, 2004; Ding, 2011). We define the PR centrality of vertex $i \in V$ as

$$
P_i = \gamma \sum_{j \in V} \left( \theta \, \frac{w_{ji}}{s_j^{(\mathrm{out})}} + (1 - \theta) \, \frac{a_{ji}}{d_j^{(\mathrm{out})}} \right) P_j + \frac{(1 - \gamma) \, \beta_i}{\sum_{i \in V} \beta_i}, \quad i = 1, \ldots, n, \tag{3}
$$

where $\theta \in [0, 1]$ is a tuning parameter indicating the proportion of edge weight (versus edge number) accounted for in PR, $\gamma$ is a damping factor that prevents the algorithm from getting stuck in sinking vertices (those without outgoing edges), and $\beta_i$ is a prior measure (usually independent of network structure) of the relative importance of vertex $i$. When there is no information available for $\gamma$, it takes value $0.85$ as suggested by Page et al. (1998). Setting aside the prior information specified by the $\beta_i$’s, Equation (3) suggests that a vertex receives a high PR score (with $\theta = 1$) if (i) it receives a large number of incoming edges from the others in the network; (ii) the weights of the incoming edges linking to it are large; (iii) the senders themselves have high PR scores.

When there is no prior information about the $\beta_i$’s, they can be set to be equal, in which case the second term in Equation (3) simplifies to $(1 - \gamma)/n$. A standard method to solve Equation (3) is power iteration, but the convergence of this algorithm may be slow for large-scale networks. A remedy is to use stochastic process theory and convert the problem to finding the stationary distribution of an underlying Markov chain (Berkhin, 2005). The investigations of the crucial properties of the proposed PR measure will be reported elsewhere.

The backbone of a network is its fundamental, essential structure (Xu and Liang, 2019). Non-essential links, which act like noise in a large network, can be removed without affecting the backbone. Extracting the backbone of a massive and dense network like an MRION is critical, as hundreds of edges with minimal weights would overwhelm the analysis. Proper removal of non-essential edges helps succinctly characterize a complex network system, and meanwhile enhances computation speed. There have been a few promising backbone extraction methods, such as the disparity filter method (Serrano et al., 2009; Zhang and Zhu, 2013), the locally adaptive network sparsification algorithm (Foti et al., 2011), and two classes of node-based filtering approaches (Ghalmane et al., 2020). We used the disparity filter method, which has been applied to the analysis of WIOTs by Xu and Liang (2019).

The rationale of the disparity filter method is as follows. Consider the normalized weights of the $d \in \mathbb{Z}^{+}$ edges of a vertex.
Under the null hypothesis that the normalized weights are generated from a uniform random assignment, they can be regarded as obtained by dividing the unit interval by $(d - 1)$ randomly placed points. The lengths of the subintervals, which represent the normalized weights, have density function

$$
p(x; d) = (d - 1)(1 - x)^{d - 2}, \quad x \in (0, 1).
$$

An overly large normalized weight relative to this distribution means that the corresponding edge is unlikely to be from the uniform random assignment, which supports the corresponding edge being part of the backbone. This idea can be formulated as obtaining the p-value of each normalized weight. Define $\tilde{w}_{ij} = w_{ij} / s_i$. The p-value of $\tilde{w}_{ij}$ is

$$
\delta_{ij}(\tilde{w}_{ij}; d_i) = \int_{\tilde{w}_{ij}}^{1} p(x; d_i) \, \mathrm{d}x.
$$

The backbone with level $\alpha \in (0, 1)$ is obtained by retaining only those edges whose p-values are less than $\alpha$.

For a directed network, the normalized out- and in-strength of each edge can be defined similarly as $\tilde{w}_{ij}^{(\mathrm{out})} = w_{ij} / s_i^{(\mathrm{out})}$ and $\tilde{w}_{ij}^{(\mathrm{in})} = w_{ij} / s_j^{(\mathrm{in})}$ for all $i, j \in V$. The corresponding p-values are $\delta_{ij}(\tilde{w}_{ij}^{(\mathrm{out})}; d_i^{(\mathrm{out})})$ and $\delta_{ij}(\tilde{w}_{ij}^{(\mathrm{in})}; d_j^{(\mathrm{in})})$, respectively. For the backbone with level $\alpha \in (0, 1)$, edge $e_{ij}$ is preserved if at least one of the two p-values is less than $\alpha$.

In some rare cases of $d_i^{(\mathrm{out})} = 1$, $d_i^{(\mathrm{in})} = 1$ or $d_i^{(\mathrm{out})} = d_i^{(\mathrm{in})} = 1$, special treatments may be needed, depending on the specific features of the networks as well as the practical interpretation of network heterogeneity. These rare cases do not occur in our study. In Figure 2, we present the sub-networks of the MRION in 2012 consisting only of the sectors from the five provinces with the largest regional GDP, at two significance levels $\alpha$ (both negative powers of 10). Though the sub-networks remain dense, they better reflect the basic structure of the MRION, and furthermore, suggest province-based market fragmentation as well as community structure, which are consistent with some of the results shown in Section 4.

[Figure 2 panels: one per significance level α; legend (Provinces): Jiangsu, Guangdong, Shandong, Zhejiang, Henan]

Figure 2: Examples of the sub-MRIONs comprised of the sectors from the provinces with the top 5 regional GDP in 2012. The self-loops are not presented. The significance levels $\alpha$ are negative powers of 10, one for the left panel and one for the right panel.

We apply the methods in Section 3 to the 2007 and 2012 MRIONs of China, and present the corresponding results. The interpretations of the analysis results are given from both statistical and economic perspectives.
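Before turning to the results, the directed disparity filter of Section 3 can be sketched as follows. Integrating the density $(d-1)(1-x)^{d-2}$ from $\tilde{w}$ to 1 gives the closed-form p-value $(1-\tilde{w})^{d-1}$, which the code uses directly. The function names `disparity_pvalues` and `backbone` are hypothetical. Vertices with a single out-edge (or in-edge) receive a p-value of 1 in that direction; this is one possible "special treatment" chosen for the sketch, not one prescribed by the text.

```python
import numpy as np

def disparity_pvalues(W, direction="out"):
    """P-value of each edge's normalized weight under the uniform null.

    direction="out": normalize by the source's out-strength and out-degree.
    direction="in":  normalize by the target's in-strength and in-degree.
    """
    W = np.asarray(W, dtype=float)
    A = W > 0
    axis = 1 if direction == "out" else 0
    s = W.sum(axis=axis, keepdims=True)          # strengths in that direction
    d = A.sum(axis=axis, keepdims=True)          # degrees in that direction
    with np.errstate(divide="ignore", invalid="ignore"):
        w_norm = np.where(A, W / s, 0.0)         # normalized weights on existing edges
    # Closed form of the integral: (1 - w)^(d - 1); non-edges get p-value 1.
    # Assumption: an edge that is its vertex's only edge (d = 1) gets p = 1 here.
    return np.where(A, (1.0 - w_norm) ** (d - 1), 1.0)

def backbone(W, alpha):
    """Keep an edge if its out- or in-direction p-value is below alpha."""
    W = np.asarray(W, dtype=float)
    keep = (disparity_pvalues(W, "out") < alpha) | (disparity_pvalues(W, "in") < alpha)
    return np.where(keep, W, 0.0)

# Toy example (made-up weights): one dominant edge from vertex 0 to vertex 1.
toy = np.array([[0, 98, 1, 1],
                [1,  0, 1, 1],
                [1,  1, 0, 1],
                [1,  1, 1, 0]], dtype=float)
toy_backbone = backbone(toy, alpha=0.01)
```

In the toy network only the dominant edge survives at `alpha=0.01`, since its normalized weight of 0.98 among three edges has p-value 0.02² = 0.0004 in both directions, while every other edge has a p-value above 0.4.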

Figure 3 shows the histograms for the in-, out- and total-degree distributions of the MRIONs in 2007 and 2012. These degree distributions appear to share two features. First, there is a strictly positive probability of zero degree, which corresponds to province-sectors with no edges (i.e., isolated vertices). For example, the sectors “coal mining and processing” (02), “metals mining/processing” (04) and “nonmetal mining/processing” (05) in Shanghai are singletons with neither inbound nor outbound links. Second, the nonzero degrees are close to the maximum degree and skewed to the left, more significantly reflected in out-degrees than in-degrees. This is a result of heavily connected province-sectors. From 2007 to 2012, all three degree distributions shifted to the right with more left skewness, suggesting an increase in the number of links among the province-sectors during this period in China.

Figure 3: In-, out- and total-degree distributions of the MRIONs in 2007 and 2012.

More information about the magnitude of the economic transactions is provided in the strength distributions on the log scale shown in Figure 4. The strength distributions are mixtures of a point mass at zero and a positive continuous distribution. Such distributions are often used to model zero-inflated non-negative continuous data through the mechanism of two-part models or hurdle models (Liu et al., 2019). The mass at zero is inherited from the zero degrees in the degree distributions. The positive strengths on the log scale are skewed to the left. On the original scale, however, the positive strengths are skewed to the right.

Figure 4: In-, out- and total-strength distributions (after logarithmic transformation) of the MRIONs in 2007 and 2012.
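The degree and strength summaries discussed above can be sketched as follows. This is only an illustration under assumptions: `W[i, j]` holds the transaction from province-sector `i` to province-sector `j`, self-loops have already been removed, and the total degree (strength) is taken as the sum of the in- and out-versions. The function name `degree_strength` is hypothetical.

```python
import numpy as np

def degree_strength(W):
    """In-, out- and total degree/strength of a weighted directed network."""
    W = np.asarray(W, dtype=float)
    A = (W > 0).astype(int)                 # unweighted adjacency matrix
    d_in, d_out = A.sum(axis=0), A.sum(axis=1)
    s_in, s_out = W.sum(axis=0), W.sum(axis=1)
    return {
        "d_in": d_in, "d_out": d_out, "d_total": d_in + d_out,
        "s_in": s_in, "s_out": s_out, "s_total": s_in + s_out,
        # Isolated vertices produce the point mass at zero in both
        # the degree and the strength distributions.
        "isolated": (d_in + d_out) == 0,
    }
```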
Similar to the degree distributions, the strength distributions all shifted to the right from 2007 to 2012 with most of the quartiles more than doubled, reflecting an expansion of the economic transactions among the province-sectors during this period.

The tail of the strength distribution is of great importance, especially in extreme value theory, as it characterizes the features of the distribution far away from the mean, indicating the relative probability of the occurrence of some “unusual” events, i.e., sectors with extremely large strength. Specifically, we are interested in a particular class of heavy-tailed distributions, namely power laws, which, as mentioned in Section 3.1, have been observed in a variety of economic networks. Figure 5 shows the empirical survival curves of the three strength distributions in 2007 and 2012 with both axes on the log scale. Empirical distributions of such shapes appear to be typical for MRIONs (e.g., Xu and Liang, 2019). The tails of the distributions show plausible linear patterns (on the log scale), which are characteristic of power laws. To verify, we performed goodness-of-fit tests for power law tails (Clauset et al., 2009). Table 3 summarizes the estimated thresholds and exponent parameters as well as the p-values of the power law tails beyond the estimated thresholds obtained from bootstrapping.

Figure 5: Tail distributions of in-, out- and total-strength of the MRIONs in 2007 and 2012 (top two panels). Their correspondingly zoomed tails after the estimated thresholds are given in the bottom two panels.
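For a fixed threshold, the exponent of a continuous power-law tail has a closed-form maximum likelihood estimate, $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$ (Clauset et al., 2009). The sketch below illustrates only this step; the threshold selection and the bootstrap goodness-of-fit test are not reproduced here. The function names and the sampler used for checking are hypothetical.

```python
import numpy as np

def powerlaw_mle(x, xmin):
    """MLE of the exponent of a continuous power-law tail beyond xmin.

    Returns (alpha_hat, approximate standard error, tail sample size).
    """
    x = np.asarray(x, dtype=float)
    tail = x[x >= xmin]
    n = tail.size
    alpha = 1.0 + n / np.log(tail / xmin).sum()
    return alpha, (alpha - 1.0) / np.sqrt(n), n

def sample_powerlaw(alpha, xmin, size, rng):
    """Inverse-transform sampler, used only to sanity-check the estimator."""
    u = rng.random(size)
    return xmin * (1.0 - u) ** (-1.0 / (alpha - 1.0))
```

A smaller estimated exponent corresponds to a heavier tail, since the survival function beyond the threshold decays like $x^{-(\alpha - 1)}$.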
The power law provides an adequate fit to all three strength distributions in 2012 and the out-strength distribution in 2007; it is rejected at significance level 0.01 for the in-strength and total-strength distributions in 2007. This is consistent with the lower panels of Figure 5, which show the empirical conditional survival curves beyond the estimated thresholds. For the out-strength, the smaller exponent parameter estimate in 2012 than in 2007 indicates heavier tails in the magnitude of the extremely large transactions in 2012 than in 2007.

Table 3: Estimated parameters for power-law tails for the in-, out- and total-strength distributions of the MRIONs in 2007 and 2012 and p-values of the goodness-of-fit test; the unit of threshold is 1 million CNY.

             2007                      2012
             In      Out     Total     In      Out     Total
Threshold    .11     20.91   20.99     20.83   24.41   46.
Exponent     .65     3.56    2.84      3.19    3.29    3.
p-value      .00     0.27    0.01      0.18    0.46    0.

Table 4 summarizes a collection of assortativity coefficients for the MRIONs of 2007 and 2012. For each year, the assortativity coefficients were computed for both directed (four types) links and undirected links; intra-province links, inter-province links, and nationwide links; unweighted links and weighted links.

Table 4: Five kinds of assortativity coefficients (including “total” that does not account for edge direction) of the (national, intra-province and inter-province) MRIONs in 2007 and 2012, where “UW” and “W” respectively represent the unweighted and weighted versions of the assortativity measures.

Type      2007                                              2012
          National        Intra-prov.     Inter-prov.       National        Intra-prov.     Inter-prov.
          UW      W       UW      W       UW      W         UW      W       UW      W       UW      W
in-in     −0.010  0.501   0.258   0.      −0.033  0.        −0.003  0.573   0.209   0.      −0.     −0.
in-out    −0.001  0.447   0.124   0.      −0.     −0.       −0.001  0.516   0.109   0.      −0.     −0.
out-in    −0.123  0.493   0.007   0.      −0.136  0.        −0.110  0.563   0.010   0.      −0.111  0.
out-out   −0.024  0.474   0.070   0.      −0.028  0.047     0.015   0.536   0.095   0.618   0.      −0.
total     −0.070  0.418   0.140   0.      −0.080  0.        −0.030  0.457   0.158   0.      −0.031  0.

Our first observation is that the unweighted assortativity coefficients (Newman, 2002; Foster et al., 2010) are not informative for characterizing the MRIONs. The unweighted versions are similar to the weighted versions only for inter-province links, which are all close to zero. For nationwide links and intra-province links, the weighted and unweighted assortativity coefficients of all five types, including four directed and one undirected, are notably different. The maximum magnitude of the unweighted version is only 0.123 (out-in in 2007), which is much lower than the magnitudes of the weighted versions in the range of 0.4–0.6. The two versions have opposite signs for all five types of nationwide links in 2007. The unweighted versions suggest that there is a negligible pattern of disassortative mixing, while the weighted versions suggest assortative mixing. The weighted versions are more consistent with intuition. Some existing analyses of the WIOTs without weight also reported “close to zero” assortativity coefficients (e.g., Cerina et al., 2015), which need to be revisited by using the weighted definition (Yuan et al., 2021).

Based on the results from the weighted definitions, the assortativity coefficients for nationwide links of all kinds have magnitudes from 0.4 to 0.6. These indicate moderately strong assortative mixing. Take the out-in assortativity for nationwide links as an example. The coefficients of 0.493 in 2007 and 0.563 in 2012 suggest that province-sectors with large inputs are likely to take high transaction volumes from the others with high outputs in the network. Decomposing the weighted adjacency matrix helps separate the contributions respectively from intra-province and inter-province links. All five assortativity coefficients for intra-province links are much greater than those for inter-province links, which are close to zero. That is, economic transactions among the sectors from the same province are tightly linked, with high transaction volumes; in contrast, sectors across multiple regions are relatively loosely connected, and the majority of the transaction volumes represented by existing link weights are extremely small. The differences between intra- and inter-province assortativity coefficients support the well-known regional fragmentation (Poncet, 2005).

The assortativity coefficients of all types for nationwide links and intra-province links increased from 2007 to 2012, while those for inter-province links remained close to zero.
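To make the contrast between weighted and unweighted assortativity concrete, the sketch below computes a directed assortativity coefficient as a Pearson correlation taken over edges. It is an assumption of this sketch, not a statement of the exact definition in Yuan et al. (2021), that the weighted version uses edge weights as observation weights and vertex strengths as the vertex features, while the unweighted version uses unit weights and vertex degrees. The function name and its arguments are hypothetical.

```python
import numpy as np

def assortativity(W, source="out", target="in", weighted=True):
    """Edge-level Pearson correlation between a source and a target feature.

    source/target are "out" or "in"; e.g. ("out", "in") pairs the source's
    out-strength (or out-degree) with the target's in-strength (or in-degree).
    """
    W = np.asarray(W, dtype=float)
    M = W if weighted else (W > 0).astype(float)
    feat = {"out": M.sum(axis=1), "in": M.sum(axis=0)}
    i, j = np.nonzero(M)
    w = M[i, j] / M[i, j].sum()          # normalized edge weights
    x, y = feat[source][i], feat[target][j]
    mx, my = (w * x).sum(), (w * y).sum()
    cov = (w * (x - mx) * (y - my)).sum()
    vx = (w * (x - mx) ** 2).sum()
    vy = (w * (y - my) ** 2).sum()
    # Returns nan when either feature is constant over the edges.
    return cov / np.sqrt(vx * vy) if vx > 0 and vy > 0 else float("nan")
```

In a toy network where heavy traders exchange only with heavy traders, the weighted coefficient is close to 1 while the unweighted one is undefined or near zero, which mirrors the qualitative gap seen in Table 4.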

Because the inter-province coefficients stayed close to zero, the increases in nationwide links are attributed to the increase in intra-province links, suggesting an increase in the degree of provincial segmentation. One possible explanation, for example, is that the economic stimulus plan after the 2008 financial crisis stimulated the construction industry most, which propagated to upstream metal and nonmetal mining/processing sectors within each province. Despite the increased inter-provincial transactions, there is no clear assortative pattern among these transactions, in contrast to the intra-provincial transactions.

The assortativity coefficient provides a platform to demonstrate the effectiveness of the backbone. Figure 6 shows the national assortativity coefficients for the backbones of the MRIONs in 2007 and 2012 for a sequence of significance levels α = { . , . , . . . , . }. For each year, the value of each type of assortativity coefficient increases, but only slightly, as α decreases, owing to the removal of the non-essential edges. The removed edges are supposed to have limited impact on the overall structure of the network. As a result, the magnitude of change in each assortativity coefficient is small. Further, for each α, all kinds of the assortativity coefficients in 2012 are greater than their counterparts in 2007, which is consistent with the results from Table 4. Therefore, the backbone is a parsimonious and powerful tool for uncovering the fundamental and essential properties of a network, especially for large-scale networks that are computationally expensive to analyze.

Figure 6: Five kinds of assortativity coefficients (national level) at a sequence of significance levels, α = { . , . , . . . , . } in 2007 and 2012. Panels: 2007, 2012; horizontal axis: alpha; vertical axis: assortativity; legend: in-in, in-out, out-in, out-out, total.

Clustering coefficients were computed with and without edge direction for three types of links, nationwide, intra-province, and inter-province, in the 2007 and 2012 MRIONs; see Table 5. All the clustering coefficients have large values (close to 1), providing stronger evidence than simple link densities for the immense connectivity of the MRIONs. The larger values in 2012 suggest a higher tendency for the province-sectors to cluster together in terms of forming triangles. Decomposing the nationwide links into intra-province and inter-province links reveals that the nationwide increase from 2007 to 2012 was mainly due to the increase in inter-province components. For example, the nationwide cycle-clustering coefficient increased from 0.828 to 0.960; the intra-province coefficient of 0.933 in 2007 was already quite high, leaving not much room to increase; the inter-province coefficient increased from 0.815 in 2007 to 0.933 in 2012. The emergence of more transactions across the inter-province sectors may be attributed to the government's strategies and policies such as the Great Western Development Strategy. The increase in inter-province transactions is not in contradiction to their small proportion in the overall magnitude, so it did not affect the manifestation of regional fragmentation in China.

Table 5: Total, cycle-, middleman-, in- and out-clustering coefficients of the MRIONs at national, intra-province and inter-province levels in 2007 and 2012.

Type        2007                                   2012
            National  Intra-prov.  Inter-prov.     National  Intra-prov.  Inter-prov.
Total       0.874     0.942        0.855           0.968     0.969        0.937
Cycle       0.828     0.933        0.815           0.960     0.962        0.933
Middleman   0.927     0.952        0.888           0.975     0.975        0.942
In          0.914     0.949        0.881           0.968     0.971        0.935
Out         0.843     0.939        0.819           0.966     0.967        0.939

Among the four types of directed clustering coefficients, the cycle- and out-clustering coefficients have increased more notably than the others. In a cyclic-triangle connection, each province-sector is an upstream as well as a downstream of its neighbors. A higher value of the cycle-clustering coefficient indicates a higher proportion of triangular (supply and demand) chains formed by the province-sectors. In an out-triangle connection, a province-sector is always the upstream to its neighbors. A higher value of the out-clustering coefficient suggests an increased proportion of transactions among the downstream sectors.
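The four directed local clustering coefficients can be computed directly from matrix products in the spirit of Appendix B. The sketch below is an illustration under assumptions: self-loops are dropped, `A` is the binary adjacency matrix of `W`, and the in- and out-denominators carry a factor of 2 so that a complete binary network yields a coefficient of 1. The function name `directed_clustering` is hypothetical.

```python
import numpy as np

def directed_clustering(W):
    """Local in-, out-, middleman- and cycle-clustering coefficients (sketch)."""
    W = np.array(W, dtype=float)          # copy, so the input is not modified
    np.fill_diagonal(W, 0.0)              # assumption: self-loops are ignored
    A = (W > 0).astype(float)
    At, Wt = A.T, W.T
    s_in, s_out = W.sum(axis=0), W.sum(axis=1)
    d_in, d_out = A.sum(axis=0), A.sum(axis=1)

    num = {
        "in": np.diag(Wt @ (A + At) @ A),
        "out": np.diag(W @ (A + At) @ At),
        "mid": np.diag(Wt @ A @ At + W @ At @ A),
        "cyc": np.diag(W @ A @ A + Wt @ At @ At),
    }
    # Shared denominator of the middleman and cycle types; the subtracted
    # term removes pairs formed by a bilateral link with the same neighbor.
    mixed = s_in * d_out + s_out * d_in - np.diag(A @ W + W @ A)
    den = {
        "in": 2.0 * s_in * (d_in - 1.0),
        "out": 2.0 * s_out * (d_out - 1.0),
        "mid": mixed,
        "cyc": mixed,
    }
    coef = {}
    for key in num:
        c = np.full(W.shape[0], np.nan)   # nan where no triangle is possible
        np.divide(num[key], den[key], out=c, where=den[key] > 0)
        coef[key] = c
    return coef
```

A vertex with few neighbors can still obtain a coefficient near 1, because each coefficient is a proportion relative to the triangles that the vertex could possibly form.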

The community detection results from modularity maximization are visualized in two side-by-side heat maps in Figure 7 for 2007 and 2012. For each heat map, province-sectors in the same community have the same color. Between the two years, however, the colors are not comparable because these colors are nominal within each community detection task. There were 39 and 40 communities in 2007 and 2012, respectively. A common feature is that most sectors from the same province belong to the same community. This is expected, as intra-province economic ties are naturally tighter than inter-province economic ties for geographic, historical, and administrative reasons. An interesting discovery is that heavy industry sectors, such as “coal mining and processing” (02), “petroleum and gas extracting” (03), and “metals mining/processing” (04), usually form singletons independent from province-based communities. For example, Shanghai, as a manufacturing and business center, has a high demand for raw materials like coal and relies heavily on supplies from other provinces. Consequently, “coal mining and processing” (02) of Shanghai forms a singleton instead of falling into the community formed by most of the other sectors in Shanghai.
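The objective behind modularity maximization can be illustrated by evaluating the modularity of a given partition. The sketch below assumes the usual directed, weighted form $Q = \frac{1}{m}\sum_{i,j}\left[w_{ij} - s^{(out)}_i s^{(in)}_j / m\right]\mathbb{1}(c_i = c_j)$ with $m$ the total edge weight; whether this matches the exact objective and the search algorithm used in this study is not asserted here. The function name is hypothetical.

```python
import numpy as np

def directed_modularity(W, labels):
    """Modularity of a partition of a weighted, directed network (sketch)."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    m = W.sum()                                   # total edge weight
    s_out, s_in = W.sum(axis=1), W.sum(axis=0)
    same = labels[:, None] == labels[None, :]     # same-community indicator
    B = W - np.outer(s_out, s_in) / m             # observed minus expected
    return (B * same).sum() / m
```

A community detection routine searches over `labels` for a partition with large `Q`; province-based partitions score well when intra-province weights dominate.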

Figure 7: Communities of the MRION in 2007 (left) and 2012 (right). Horizontal axis: sector (codes 01 to 30); vertical axis: province (Beijing to Xinjiang, in the order of Table 8).

From 2007 to 2012, the community structure has shown a notable change. Sectors from the provinces in the same geographic region tended to stay in the same community in 2007. For example, three northeastern provinces Heilongjiang, Jilin, and Liaoning were in one community; four northwestern provinces Xinjiang, Ningxia, Qinghai, and Gansu belonged to another community; two central north provinces Shanxi and Hebei were placed in the same community. In 2012, however, this pattern was no longer observed, with each province appearing to be a community of its own. A closer look at the data reveals that the growth rate of inter-province trade is much smaller than that of intra-province trade. In 2007, there were 9 . . . .

The PR scores were used to rank the relative importance of province-sectors in the Chinese economy. The damping factor was set to γ = 0.85 as recommended by Brin and Page (1998). For θ ∈ {0, 1}, we computed the PR scores for the 900 province-sectors in 2007 and 2012 with or without accounting for prior information. Specifically, the total value added (TVA) of each province-sector was adopted as node-specific prior information, as it indicates the value added contributed by each sector to the national economy. The TVAs of province-sectors in each year are recorded in quadrant IV of the MRIOT; see Table 1. We present the province-sectors with top 10 PR scores in Table 6.

Table 6: The top 10 province-sectors in PR scores (γ = 0.85) from the MRIONs in 2007 and 2012 with and without TVA (total value-added) as prior information.

       θ = 0, no prior             θ = 1, no prior               θ = 1, TVA as prior
Rank   2007         2012           2007           2012           2007           2012
1      Beijing 15   Hainan 21      Beijing 30     Yunnan 24      Guangdong 30   Yunnan 24
2      Beijing 10   Hainan 12      Guangdong 30   Zhejiang 24    Beijing 30     Guangdong 30
3      Beijing 14   Hainan 06      Yunnan 24      Guangdong 24   Guangdong 19   Guangdong 24
4      Beijing 16   Hainan 17      Guangdong 19   Jiangsu 17     Jiangsu 30     Zhejiang 24
5      Beijing 12   Hainan 14      Guangdong 18   Hebei 24       Shandong 06    Jiangsu 30
6      Beijing 13   Hainan 16      Guangdong 17   Shaanxi 24     Zhejiang 30    Jiangsu 17
7      Beijing 17   Hainan 09      Jiangsu 30     Shanghai 24    Guangdong 18   Shandong 06
8      Jiangsu 17   Hainan 13      Guangdong 08   Qinghai 24     Shanghai 30    Jiangsu 12
9      Jiangsu 18   Tianjin 12     Shandong 06    Sichuan 24     Jiangsu 12     Shandong 12
10     Jiangsu 22   Hebei 06       Shanghai 30    Xinjiang 24    Shandong 30    Guangdong 19

Without considering the weight (θ = 0), most of the top 10 province-sectors are from Beijing in 2007 and from Hainan in 2012. Beijing's top ranking in 2007 may be explained by its special function as the nation's capital with an advantage in access to resources nationwide. The preparation for the 2008 Olympic Games in urban infrastructure, ecological environment, electronic technology and other aspects had a huge pull effect on the economic development of Beijing, especially in the manufacturing sectors. Nonetheless, the sizes of the sectors in Beijing are smaller than those in eastern coastal provinces such as Guangdong or Jiangsu. The top ranking of Beijing without weight is, therefore, not consistent with wide perceptions. The top ranking of eight sectors in Hainan in 2012 is even more puzzling. As an island with a small population, Hainan is known to be relatively less developed compared with other provinces in China. The unweighted PR scores are not satisfactory for measuring centrality in the context of MRIONs.

With PR scores fully based on weights instead of counts (θ = 1), the top 10 province-sectors changed dramatically. Province-wise, most of the top 10 province-sectors in 2007 are from the eastern coast (Guangdong, Jiangsu, Shandong, and Shanghai). These provinces are more developed than others and consistently make large contributions to the national GDP. In 2012, however, some sectors from much less developed provinces (Shaanxi, Qinghai, and Xinjiang) joined those from traditionally developed provinces in the top 10. This may be a result of these less developed provinces in northwest China benefiting from the China Western Development policy.
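The roles of γ, θ and the prior in the PR scores can be illustrated with a short power-iteration sketch. The transition rule below, a θ-weighted mixture of weight-based and count-based transition probabilities with a prior-proportional teleportation term, is an assumed form chosen for illustration; it is not claimed to reproduce the exact definition of Section 3. Vertices without outgoing edges are assumed to redistribute their score according to the prior, and self-loops are assumed to have been removed. The function name and arguments are hypothetical.

```python
import numpy as np

def weighted_pagerank(W, gamma=0.85, theta=1.0, prior=None,
                      tol=1e-12, max_iter=1000):
    """PageRank for a weighted, directed network with an optional prior."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    A = (W > 0).astype(float)
    s_out, d_out = W.sum(axis=1), A.sum(axis=1)

    beta = np.ones(n) if prior is None else np.asarray(prior, dtype=float)
    beta = beta / beta.sum()              # normalized prior (e.g. TVA shares)

    # Row-stochastic transition matrix mixing weights (theta) and counts.
    P = np.zeros((n, n))
    has_out = d_out > 0
    P[has_out] = (theta * W[has_out] / s_out[has_out, None]
                  + (1.0 - theta) * A[has_out] / d_out[has_out, None])
    P[~has_out] = beta                    # assumption for dangling vertices

    pr = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new = gamma * (pr @ P) + (1.0 - gamma) * beta
        converged = np.abs(new - pr).sum() < tol
        pr = new
        if converged:
            break
    return pr
```

With `theta=0` only the presence of links matters, so a province-sector with many tiny inbound links can rank highly; with `theta=1` the scores follow transaction volumes, and passing TVA as `prior` shifts score toward large province-sectors.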
Sector-wise, under θ = 1 without prior information, “other services” (30) appeared most often in the top 10 in 2007, but “construction” (24) became dominant in 2012. Note that “other services” cover some essential services like financial services and information technology, both of which are critical to modern economic development. It is evident that these services provided unprecedented support to the growth of many other sectors in China in the early 2000s. The 2008 global financial crisis struck many such services. Much of the four-trillion CNY stimulus program funded projects like railway, highway, bridge, and aviation construction. Further, one of the aims of the China Western Development program was to strengthen the infrastructure construction in the participating provinces.

Utilization of TVA as prior information led to noticeable changes in both lists of top 10 PR province-sectors. The changed results are more consistent with intuition. For 2007, Guangdong became less dominant than otherwise, albeit still with the highest frequency in the top 10. Most of the provinces came from the eastern/southern coast (except for Beijing). In contrast to the diversity in provinces, “other services” (30) appeared to be the most influential sector, as it occupied six positions of the top ten. The impact of the TVA prior is more notable in 2012 than in 2007. Sectors from western provinces like Shaanxi, Qinghai and Xinjiang were gone, while sectors from the coastal provinces emerged in the top 10 list. The “construction” (24) sector is less dominant, but remains the most frequent in the top 10. Other leading province-sectors are “other services” (30) from Guangdong and Jiangsu and “chemical industry” (12) from Jiangsu and Shandong.
The updated results with the TVA prior make more sense because TVA contains information about the self-loops, which were otherwise discarded but are useful in assessing centrality.

It is worth special attention that “construction” (24) of Yunnan is ranked first in 2012 in spite of the inclusion of prior information. Although a developing inland province, Yunnan is one of the largest tourist provinces in China. The fast development of tourism in Yunnan has boosted the development of infrastructure construction such as transportation facilities and hotel accommodations. Located in the geographical center of Asia, connecting southeast Asia with China and inland with coastal regions, Yunnan has been crucial to the China Western Development. With favorable domestic policies and economic cooperation with Southeast Asian countries, Yunnan's GDP grew at a rate consistently higher than the national average during this period, for which the construction sector played an important pulling role (Su, 2014).

Table 7: The province-sectors with top 10 PR scores (γ = 0.85, θ = 1) of the backbones of MRIONs in 2007 and 2012 at significance level α = { − , − }. TVA is used as prior information. The blue province-sectors with * do not appear in the corresponding columns for α = 1 (original networks).

Rank    α = 1    α = 10 −    α = 10 −

The PR score provides another opportunity to demonstrate the effectiveness of the backbone. Table 7 summarizes the province-sectors with top 10 PR scores in the backbones of MRIONs with significance level α = { − , − }. No drastic change in the lists for either 2007 or 2012 is observed. Only two new province-sectors, “clothing, leather, fur” (08) and “transport equipment” (17) from Guangdong, ranked among the top 10 in both of the backbones but not in the original 2007 MRION. In fact, they were ranked respectively at 13 and 14 in the original MRION, with a small difference in the magnitude of PR score from the bottom of the top 10.
For 2012, the province-sectors in the top 10 list remained the same for both backbones, in spite of some changes in order. The traditionally strong sectors “construction” (24) and “other services” (30) in Guangdong, “transport equipment” (17) in Jiangsu and “construction” (24) in Zhejiang surpassed “construction” (24) in Yunnan. This was because “construction” (24) in Yunnan is more connected by insignificant edges in the MRION of 2012, but these edges are filtered out in the backbones. Between the two lists for the backbones at different significance levels, we only observe small dispersion in province-sector orders. For instance, in the two lists of 2012, the province-sectors at rank 3 and 4 and those at rank 9 and 10 were respectively switched, with the rest remaining identical. This comparison, again, shows that the backbone is capable of capturing the centrality measures of the vertices in the MRIONs.

MRIOTs provide a natural arena for network analyses to study the regional and sectoral structure of an economy. Our study of MRIONs of China in 2007 and 2012 is the first network analysis of its kind for the Chinese economy. All three types of strength distributions (in, out, and total) were found to be skewed to the left on the log scale, where the degree of skewness of each increases over time. For each MRION, the positive assortativity coefficients suggest assortative mixing across the province-sectors, especially intra-province-sectors. As indicated by close-to-one clustering coefficients, the province-sectors tend to cluster together, and the tendency increased with inter-province transactions. Province-based community structures were detected. There were communities containing multiple provinces in 2007 but none in 2012, suggesting increased regional fragmentation. The most essential province-sectors in the Chinese economy were found through a new class of weighted PR measures. Province-wise, Guangdong, Jiangsu and many eastern coastal provinces contain most sectors in the top list. Sector-wise, “construction” (24) and “other services” (30) appear to be dominant.

Our study suggests a few methodological caveats for network analysis of MRIONs. First, it is critical to account for edge weight when summarizing MRIONs as these networks are weighted; otherwise, misleading inference may be made. For example, unweighted assortativity coefficients suggest almost no assortative (or disassortative) mixing in the MRIONs, whereas the weighted counterparts indicate a moderately positive assortative mixing. When weights are not accounted for, the classical PR algorithm produced a top 10 list in 2012 containing 8 sectors from Hainan, which is not consistent with the facts. Rather, the weighted PR algorithm (with or without using TVA as prior information) provided a much more reasonable result. Second, precise interpretation of the network measures is crucial. For instance, the clustering coefficient is a relative measure. A large local clustering coefficient of a vertex only tells how likely its two neighbors are to be connected; it says nothing about how many neighbors it has. For instance, the highest local clustering coefficient in 2012 belongs to “construction” (24) from Inner Mongolia according to the computation. This specific province-sector receives a high value because it has fewer neighbors compared to the rest in the nation, which leads to a higher proportion. Lastly, when ranking the vertices in a network, we recommend making full use of the available vertex-specific auxiliary information, which helps lead to more intuitive results. These caveats may be applicable to networks beyond MRIONs.

Acknowledgments

Tao Wang and Shiying Xiao were supported by the National Bureau of Statistics of China (2018lz33).

A Province codes

Table 8 summarizes the codes and the names of the 30 provinces used in the chord graph in Figure 1.

B Four types of local clustering coeﬃcients

In practice, the formulations of the four kinds of local clustering coefficients are respectively given by

$$C^{\mathrm{in}}_i = \frac{\left[W^\top \left(A + A^\top\right) A\right]_{ii}}{2\, s^{(in)}_i \left(d^{(in)}_i - 1\right)}, \tag{4}$$

$$C^{\mathrm{out}}_i = \frac{\left[W \left(A + A^\top\right) A^\top\right]_{ii}}{2\, s^{(out)}_i \left(d^{(out)}_i - 1\right)}, \tag{5}$$

$$C^{\mathrm{mid}}_i = \frac{\left(W^\top A A^\top + W A^\top A\right)_{ii}}{\left(s^{(in)}_i d^{(out)}_i + s^{(out)}_i d^{(in)}_i\right) - (AW + WA)_{ii}}, \tag{6}$$

$$C^{\mathrm{cyc}}_i = \frac{\left[W A^2 + W^\top \left(A^\top\right)^2\right]_{ii}}{\left(s^{(in)}_i d^{(out)}_i + s^{(out)}_i d^{(in)}_i\right) - (AW + WA)_{ii}}. \tag{7}$$

Table 8: Codes of the provinces in the MRIONs

Code  Province         Code  Province
01    Beijing          16    Henan
02    Tianjin          17    Hubei
03    Hebei            18    Hunan
04    Shanxi           19    Guangdong
05    Inner Mongolia   20    Guangxi
06    Liaoning         21    Hainan
07    Jilin            22    Chongqing
08    Heilongjiang     23    Sichuan
09    Shanghai         24    Guizhou
10    Jiangsu          25    Yunnan
11    Zhejiang         26    Shaanxi
12    Anhui            27    Gansu
13    Fujian           28    Qinghai
14    Jiangxi          29    Ningxia
15    Shandong         30    Xinjiang

Figure 8: Four types of triangles proposed in Fagiolo (2007), and reused in Clemente and Grassi (2018). Panels: Cycle, Middleman, In, Out, each drawn on vertices i, j, k.

References

Amador, J. and S. Cabral (2017). Networks of value-added trade. The World Economy 40 (7), 1291–1313.

Barabási, A.-L. and R. Albert (1999). Emergence of scaling in random networks. Science 286 (5439), 509–512.

Barrat, A., M. Barthélemy, R. Pastor-Satorras, and A. Vespignani (2004). The architecture of complex weighted networks. Proceedings of the National Academy of Sciences of the United States of America 101 (11), 3747–3752.

Bergstrand, J. H. (1985). The gravity equation in international trade: Some microeconomic foundations and empirical evidence. The Review of Economics and Statistics 67 (3), 474–481.

Berkhin, P. (2005). A survey on PageRank computing. Internet Mathematics 2 (1), 73–120.

Bonacich, P. (1987). Power and centrality: A family of measures. American Journal of Sociology 92 (5), 1170–1182.

Brin, S. and L. Page (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems 30 (1), 107–117.

Cerina, F., Z. Zhu, A. Chessa, and M. Riccaboni (2015). World input-output network. PLoS ONE 10 (7), 1–21.

Chen, D., S. Khan, X. Yu, and Z. Zhang (2013). Government intervention and investment comovement: Chinese evidence. Journal of Business Finance & Accounting 40 (3–4), 564–587.

Chen, S., G. H. Jefferson, and J. Zhang (2011). Structural change, productivity growth and industrial transformation in China. China Economic Review 22 (1), 133–150.

Chow, G. C. (2003). Impact of joining the WTO on China's economic, legal and political institutions. Pacific Economic Review 8 (2), 105–115.

Clauset, A., M. E. J. Newman, and C. Moore (2004). Finding community structure in very large networks. Physical Review E 70 (6), 066111.

Clauset, A., C. R. Shalizi, and M. E. J. Newman (2009). Power-law distributions in empirical data. SIAM Review 51 (4), 661–703.

Clemente, G. P. and R. Grassi (2018). Directed clustering in weighted networks: A new perspective. Chaos, Solitons & Fractals 107, 26–38.

Defourny, J. and E. Thorbecke (1984). Structural path analysis and multiplier decomposition within a social accounting matrix framework. The Economic Journal 94 (373), 111–136.

del Río-Chanona, R. M., J. Grujić, and H. Jeldtoft Jensen (2017). Trends of the world input and output network of global trade. PLoS ONE 12 (1), 1–14.

Ding, Y. (2011). Applying weighted PageRank to author citation networks. Journal of the American Society for Information Science and Technology 62 (2), 236–245.

Erdős, P. and A. Rényi (1959). On random graphs. I. Publicationes Mathematicae Debrecen 6, 290–297.

Fagiolo, G. (2007). Clustering in complex directed networks. Physical Review E 76, 026107.

Fan, S., R. Kanbur, and X. Zhang (2011). China's regional disparities: Experience and policy. Review of Development Finance 1 (1), 47–56.

Fan, S., X. Zhang, and S. Robinson (2003). Structural change and economic growth in China. Review of Development Economics 7 (3), 360–377.

Feser, E. J. and E. M. Bergman (2000). National industry cluster templates: A framework for applied regional cluster analysis. Regional Studies 34 (1), 1–19.

Foster, J. G., D. V. Foster, P. Grassberger, and M. Paczuski (2010). Edge direction and the structure of networks. Proceedings of the National Academy of Sciences of the United States of America 107 (24), 10815–10820.

Foti, N. J., J. M. Hughes, and D. N. Rockmore (2011). Nonparametric sparsification of complex multiscale networks. PLoS ONE 6 (2), 1–10.

Fujikawa, K. and C. Milana (2002). Input-output decomposition analysis of sectoral price gaps between Japan and China. Economic Systems Research 14 (1), 59–79.

Gabaix, X. (1999). Zipf's law for cities: An explanation. The Quarterly Journal of Economics 114 (3), 739–767.

Ghalmane, Z., C. Cherifi, H. Cherifi, and M. El Hassouni (2020). Extracting backbones in weighted modular complex networks. Scientific Reports 10 (1), 15539.

Girvan, M. and M. E. J. Newman (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America 99 (12), 7821–7826.

Goldenberg, A., A. X. Zheng, S. E. Fienberg, and E. M. Airoldi (2010). A survey of statistical network models. Foundations and Trends in Machine Learning 2 (2), 129–233.

Grindrod, P. (2002). Range-dependent random graphs and their application to modeling large small-world proteome datasets. Physical Review E 66, 066702.

Handcock, M. S., A. E. Raftery, and J. M. Tantrum (2007). Model-based clustering for social networks. Journal of the Royal Statistical Society: Series A (Statistics in Society) 170, 301–354.

He, C. and J. Wang (2012). Regional and sectoral differences in the spatial restructuring of Chinese manufacturing industries during the post-WTO period. GeoJournal 77 (3), 361–381.

He, C., Y. Yan, and D. Rigby (2018). Regional industrial evolution in China. Papers in Regional Science 97 (2), 173–198.

International Monetary Fund (2020). GDP, current prices [data file]. Available online: .

Ji, J., Z. Zou, and Y. Tian (2019). Energy and economic impacts of China's 2016 economic investment plan for transport infrastructure construction: An input-output path analysis. Journal of Cleaner Production 238, 117761.

Jia, J., G. Ma, C. Qin, and L. Wang (2020). Place-based policies, state-led industrialisation, and regional development: Evidence from China's Great Western Development Programme. European Economic Review 123, 103398.

Jiang, M., P. Behrens, T. Wang, Z. Tang, Y. Yu, D. Chen, L. Liu, Z. Ren, W. Zhou, S. Zhu, C. He, A. Tukker, and B. Zhu (2019). Provincial and sector-level material footprints in China. Proceedings of the National Academy of Sciences of the United States of America 116 (52), 26484–26490.

Kaplow, L. (2008). Pareto principle and competing principles. In S. N. Durlauf and L. E. Blume (Eds.), The New Palgrave Dictionary of Economics, pp. 4807–4812. London, UK: Palgrave Macmillan.

Lee, B.-S., J. Peng, G. Li, and J. He (2012). Regional economic disparity, financial disparity, and national economic growth: Evidence from China. Review of Development Economics 16 (2), 342–358.

Leonidov, A. and E. Serebryannikova (2019). Dynamical topology of highly aggregated input–output networks. Physica A: Statistical Mechanics and its Applications 518, 234–252.

Leontief, W. and A. Strout (1963). Multiregional input-output analysis. In T. Barna (Ed.), Structural Interdependence and Economic Development: Proceedings of an International Conference on Input-Output Techniques, Geneva, September 1961, pp. 119–150. London, UK: Palgrave Macmillan UK.

Li, H. and K. E. Haynes (2011). Economic structure and regional disparity in China: Beyond the Kuznets transition. International Regional Science Review 34 (2), 157–190.

Liang, S., Y. Wang, T. Zhang, and Z. Yang (2017). Structural analysis of material flows in China based on physical and monetary input-output models. Journal of Cleaner Production 158, 209–217.

Lin, J. Y. (2011). China and the global economy. China Economic Journal 4 (1), 1–14.

Liu, L., Y.-C. T. Shih, R. L. Strawderman, D. Zhang, B. A. Johnson, and H. Chai (2019). Statistical analysis of zero-inflated nonnegative continuous data: A review. Statistical Science 34 (2), 253–279.

Liu, W., J. Chen, Z. Tang, H. Liu, D. Han, and F. Li (2012). Theory and Practice of Compiling China 30-Province Inter-Regional Input-Output Table of 2007. Beijing, China: China Statistics Press.

Liu, W., Z. Tang, J. Chen, and P. Yang (2014). The 2010 China Multi-Regional Input-Output Table of 30 Provincial Units. Beijing, China: China Statistics Press.

Liu, W., Z. Tang, and M. Han (2018).

The 2012 China Multi-Regional Input-Output Tableof 31 Provincial Units . Beijing, China: China Statistics Press.Meng, B., Y. Fang, J. Guo, and Y. Zhang (2017). Measuring China’s domestic productionnetworks through trade in value-added perspectives.

Economic Systems Research 29 (1),48–65.Mi, Z., J. Meng, H. Zheng, Y. Shan, and D. Guan (2018). A multi-regional input-output tablemapping China’s economic outputs and interdependencies in 2012.

Scientiﬁc Data 5 (1),180155.Moses, L. N. (1955). The stability of interregional trading patterns and input-output analysis.

The American Economic Review 45 (5), 803–826.Newman, M. E. J. (2001). Scientiﬁc collaboration netowrks. ii. Shortest paths, weightednetworks, and centrality.

Physical Review E 64 (1), 016132.Newman, M. E. J. (2002). Assortative mixing in networks.

Physical Review Letters 89 (20),208701.Newman, M. E. J. (2003). Mixing patterns in networks.

Physical Review E 67 (2), 026126.Newman, M. E. J. (2006). Modularity and community structure in networks.

Proceedings ofthe National Academy of Sciences of the United States of America 103 (23), 8577–8582.Newman, M. E. J. (2010).

Networks: An Introduction . New York, NY, USA: OxfordUniversity Press. 38ewman, M. E. J., D. J. Watts, and S. H. Strogatz (2002). Random graph models ofsocial networks.

Proceedings of the National Academy of Sciences of the United States ofAmerica 99 (suppl 1), 2566–2572.Ng, A. Y., M. I. Jordan, and Y. Weiss (2001). On spectral clustering: Analysis and analgorithm. In T. G. Dietterich, S. Becker, and Z. Ghahramani (Eds.),

Proceedings of the14th International Conference on Neural Information Processing Systems: Natural andSynthetic , Cambridge, MA, USA, pp. 849–856. The MIT Press.Noldus, R. and P. Van Mieghem (2015). Assortativity in complex networks.

Journal ofComplex Networks 3 (4), 507–542.Onnela, J.-P., J. Saram¨aki, J. Kert´esz, and K. Kaski (2005). Intensity and coherence ofmotifs in weighted complex networks.

Physical Review E 71 , 065103.Opsahl, T. and P. Panzarasa (2009). Clustering in weighted networks.

Social Networks 31 (2),155–163.Ouyang, G., D. K. Dey, and P. Zhang (2020). Clique-based method for social networkclustering.

Journal of Classiﬁcation 37 (1), 254–274.Ouyang, M. and Y. Peng (2015). The treatment-eﬀect estimation: A case study of the 2008economic stimulus package of China.

Journal of Econometrics 188 (2), 545–557.Page, L., S. Brin, R. Motwani, and T. Winograd (1998). The PageRank citation ranking:Bringing order to the web. In P. H. Enslow and A. Ellis (Eds.),

Proceedings of the 7thInternational World Wide Web Conference , pp. 161–172. Elsevier.Pennock, D. M., G. W. Flake, S. Lawrence, E. J. Glover, and C. L. Giles (2002). Winnersdon’t take all: Characterizing the competition for links on the web.

Proceedings of theNational Academy of Sciences of the United States of America 99 (8), 5207–5211.39oncet, S. (2005). A fragmented China: Measure and determinants of Chinese domesticmarket disintegration.

Review of International Economics 13 (3), 409–430.Sargento, A. L. M. (2007). Empirical examination of the gravity model in two diﬀerentcontexts: Estimation and explanation.

Review of Regional Research 27 (2), 107–127.Serrano, M. ´A., M. Bogu˜n´a, and A. Vespignani (2009). Extracting the multiscale backboneof complex weighted networks.

Proceedings of the National Academy of Sciences of theUnited States of America 106 (16), 6483–6488.Snijders, T. A. B. and K. Nowicki (1997). Estimation and prediction for stochastic block-models for graphs with latent block structure.

Journal of Classiﬁcation 14 (1), 75–100.Su, X. (2014). Multi-scalar regionalization, network connections and the development ofYunnan province, China.

Regional Studies 48 (1), 91–104.The World Bank (2020a). Exports of goods and services (current US $ ) [data ﬁle]. Availableonline: https://data.worldbank.org/indicator/NE.EXP.GNFS.CD .The World Bank (2020b). GDP (current US $ ) [data ﬁle]. Available online: https://data.worldbank.org/indicator/NY.GDP.MKTP.CD .The World Bank (2020c). GDP growth (annual %) [data ﬁle]. Available online: https://data.worldbank.org/indicator/NY.GDP.MKTP.KD.ZG .The World Bank (2020d). Imports of goods and services (current US $ ) [data ﬁle]. Availableonline: https://data.worldbank.org/indicator/NE.IMP.GNFS.CD .The World Bank (2020e). Inﬂation, GDP deﬂator (annual %) - China [data ﬁle]. Availableonline: https://data.worldbank.org/indicator/NY.GDP.DEFL.KD.ZG?locations=CN .Timmer, M. P., E. Dietzenbacher, B. Los, R. Stehrer, and G. J. de Vries (2015). An illus-trated user guide to the World Input–Output Database: The case of global automotiveproduction. Review of International Economics 23 (3), 575–605.40ukker, A. and E. Dietzenbacher (2013). Global multiregional input-output framework: Anintroduction and outlook.

Economic Systems Research 25 (1), 1–19.van der Hofstad, R. and N. Litvak (2014). Degree-degree dependencies in random graphswith heavy-tailed degrees.

Internet Mathematics 10 (3–4), 287–334.Watts, D. J. and S. H. Strogatz (1998). Collective dynamics of ‘small-world’ networks.

Nature 393 (6684), 440–442.Xing, W. and A. A. Ghorbani (2004). Weighted PageRank algorithm. In

Proceedings of the2nd Annual Conference on Communication Networks and Services Research , Piscataway,NJ, USA, pp. 305–314. IEEE.Xu, M. and S. Liang (2019). Input-output networks oﬀer new insights of economic structure.

Physica A: Statistical Mechanics and its Applications 527 , 121178.Young, A. (2000). The razor’s edge: Distortions and incremental reform in the People’sRepublic of China.

The Quarterly Journal of Economics 115 (4), 1091–1135.Yuan, C., S. Liu, and N. Xie (2010). The impact on Chinese economic growth and energyconsumption of the Global Financial Crisis: An input–output analysis.

Energy 35 (4),1805–1812.Yuan, Y., J. Yan, and P. Zhang (2021). Assortativity coeﬃcients for weighted and directednetworks. ArXiv preprint, arXiv:2101.05389.Zhang, B. and S. Horvath (2005). A general framework for weighted gene co-expression net-work analysis.

Statistical Applications in Genetics and Molecular Biology 4 (1), Article17.Zhang, X. and J. Zhu (2013). Skeleton of weighted social network.