Collective Learning in China's Regional Economic Development

Abstract

Industrial development is the process by which economies learn how to produce new products and services. But how do economies learn? And who do they learn from? The literature on economic geography and economic development has emphasized two learning channels: inter-industry learning, which involves learning from related industries; and inter-regional learning, which involves learning from neighboring regions. Here we use 25 years of data describing the evolution of China's economy between 1990 and 2015 (a period when China multiplied its GDP per capita by a factor of ten) to explore how Chinese provinces diversified their economies. First, we show that the probability that a province will develop a new industry increases with the number of related industries that are already present in that province, a fact that is suggestive of inter-industry learning. Also, we show that the probability that a province will develop an industry increases with the number of neighboring provinces that are developed in that industry, a fact suggestive of inter-regional learning. Moreover, we find that the combination of these two channels exhibits diminishing returns, meaning that the contribution of either of these learning channels is redundant when the other one is present. Finally, we address endogeneity concerns by using the introduction of high-speed rail as an instrument to isolate the effects of inter-regional learning. Our differences-in-differences (DID) analysis reveals that the introduction of high-speed rail increased the industrial similarity of the pairs of provinces it connected. Also, industries in provinces that were connected by rail increased their productivity when they were connected by rail to other provinces where that industry was already present. These findings suggest that inter-regional and inter-industry learning played a role in China's great economic expansion.


Jian Gao a,b,c, Bogang Jun b, Alex "Sandy" Pentland b, Tao Zhou a,c, César A. Hidalgo b,∗

a CompleX Lab, University of Electronic Science and Technology of China, Chengdu 611731, China
b MIT Media Lab, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
c Big Data Research Center, University of Electronic Science and Technology of China, Chengdu 611731, China

Keywords: Collective Learning, Economic Development, Industrial Structure, Economic Complexity, Product Space

∗ Email address: [email protected]

Preprint submitted to arXiv March 7, 2017

1. Introduction

Between 1990 and 2015 China experienced one of the fastest episodes of economic growth in recorded history. China's overall GDP grew by a factor of 30, from less than USD 400 billion in 1990 to more than USD 10 trillion in 2015. The per capita economic growth of China was also outstanding. China's GDP per capita, adjusted by purchasing power parity (PPP) and at constant prices, increased by nearly a factor of 10, from USD 1,516 in 1990 to more than USD 13,400 in 2015. For comparison, in the same period global GDP grew only by a factor of three (from USD 22.5 trillion to USD 73.4 trillion) and global GDP per capita, also at PPP and constant prices, grew by less than a factor of two (from USD 8,876 to USD 14,602).

The pace and scale of China's great economic expansion have no historical precedent (Song et al., 2011; Zhu, 2012; Eichengreen et al., 2012; Felipe et al., 2013). If China's GDP per capita, at PPP and constant prices, continued growing at the same pace, it would surpass USD 130,000 per capita by the year 2040. But China is unlikely to repeat this success in the next 25 years. This suggests that China's great expansion was probably a distinct event in economic history, and one from which many countries could learn.

But what explains China's remarkable economic success? One theory is that China's great expansion relied on the export of products that were unusually sophisticated for China's level of income (Rodrik, 2006; Hidalgo and Hausmann, 2009; Hausmann et al., 2014; Hidalgo, 2015). During this period China exported products like electronics and other advanced manufactures that were at that time being produced mostly in countries with an income per capita much larger than that of China (Lin, 2012).
By succeeding in the export of these sophisticated products, China was able to penetrate markets that could support higher wages and, consequently, higher incomes.

Evidence in support of this theory is shown in the work of Rodrik (2006), who estimated the level of sophistication of Chinese exports by calculating the average income per capita of the countries exporting the same products as China. Rodrik (2006) showed that even as early as 1992, when China's GDP per capita at PPP and constant prices was just USD 1,844, it exported products associated with an average level of income of roughly USD 13,500, which corresponds to China's level of income in 2015. Rodrik (2006) argued that this unusually high level of export sophistication fueled China's great economic expansion.

Further evidence supporting the idea that the sophistication of China's exports is a factor explaining China's rapid economic expansion is contained in the literature on economic complexity (Hidalgo and Hausmann, 2009; Tacchella et al., 2012; Hausmann et al., 2014; Hidalgo, 2015), which has focused on developing measures of a country's export sophistication that avoid the circularity of using income data. The consensus of this literature is also that countries with a relatively high level of economic complexity (countries that export a diverse set of non-ubiquitous goods) grow, on average, faster than countries with a similar level of income but lower levels of economic complexity.

If China's economy expanded because it succeeded in the export of sophisticated products, then the question is: how did China learn to produce products of increasing levels of sophistication? Here the literature provides two answers. One is that economies learn by leveraging the capabilities embodied in related industries. That is, economies that are good at producing shirts would have an easier time learning how to produce pants, coats, and socks. The other idea is that economies learn from neighboring regions.
That is, the probability that a province would succeed at making shirts depends on having neighboring regions that have already developed the capacity to produce shirts.

The view that economic development is a collective learning process is found repeatedly in the work of development economists, evolutionary economists, economic geographers, and in the literature on economic clusters.

Evolutionary economists, going back to the seminal work of Nelson and Winter (1982), have pushed the idea that economies learn by accumulating capabilities in networks of individuals and firms. In this strand of literature, capabilities are explicit and tacit knowledge (Polanyi, 1958; Collins, 2010) that firms embody in routines and procedures that make the learning process deeply path dependent. The ability of a firm to accumulate these capabilities depends, among other factors, on the institutional environment where the firm is located (Saxenian, 1996), the levels of trust in the population (Fukuyama, 1995), the firm's organizational structure (Powell, 1990), the dynamic capacity of a firm to learn (Teece and Pisano, 1994), the social networks where the economy is embedded (Granovetter, 1985), and of course, on the existence of related firms and neighboring regions that have already accumulated the right capabilities.

It is not surprising, therefore, that much work has gone into understanding the channels that facilitate the ability of economies to learn. In broad strokes, this literature has focused on two learning channels: inter-industry learning, which has been studied extensively and focuses on how firms learn from related industries that are already in their region; and inter-regional learning, which has been much less studied and focuses on the learning that takes place across geographic boundaries.

Inter-industry learning has been studied at the international, regional, and firm level.
This literature has focused on testing how the existence of related industries increases the probability that an industry will enter a region, exit a region, or become more productive.

At the international level, Hidalgo et al. (2007) and Hausmann et al. (2014) have used export data to show that the probability that a country will develop comparative advantage in a new product depends strongly on the number of related products that it already exports. To establish this stylized fact, Hidalgo et al. (2007) introduced the idea of the product space, a network connecting products that countries are likely to export in tandem. Using this network representation it is easy to score each product that a country does not yet export based on the number of related products that the same country is already exporting. This score, called density, is a statistically good predictor of the probability that a country will develop comparative advantage in a specific product in the future.

At the regional level, researchers have used data on input-output relationships, labor flows, and the product portfolios of manufacturing plants to measure industrial relatedness (Boschma, 2017; Delgado et al., 2016; Boschma et al., 2012; Semitiel-Garcia and Noguera-Mendez, 2012; Boschma and Iammarino, 2009; Frenken et al., 2007). Neffke et al. (2011) used data on the product portfolios of manufacturing plants in Sweden to connect industries, and showed that the probability that an industry will enter a region increases, and the probability that it will exit decreases, with the number of related industries present in that region. Delgado et al. (2014) used data from the US Cluster Mapping Project to show that firms located in clusters of related industries tend to experience higher patenting and employment growth. Delgado et al.
(2010) also show that clusters tend to enhance entrepreneurship, since start-up industries located in clusters tend to grow faster.

At the firm level, Teece (1980) has argued that coherent multi-product enterprises (firms that produce a diverse portfolio of related products) are an efficient way to organize economic activity when the development of products requires re-utilizing proprietary knowhow and specialized and indivisible physical assets (Teece, 1982). More recently, empirical work like that of Neffke and Henning (2013) has leveraged labor-flow data to connect related industries, by arguing that industries that are more likely to exchange labor are related in terms of the skills that they require. Using their skill-relatedness metric, they found that firms are more likely to diversify their product portfolios to include the products that were being produced by related industries (Neffke and Henning, 2013), adding to the evidence that firm diversification is also path dependent and coherent (Teece et al., 1994). By studying empirically the effects of five different dimensions of agglomeration on the survival chances of new entrepreneurial firms in China, Howell et al. (2016) found that increasing local related variety has a stronger positive effect on new firm survival than other types of agglomeration.

The inter-regional learning literature, on the other hand, is sparser, and it focuses on how economies learn from neighboring regions instead of similar industries. At the international level, Bahar et al. (2014) showed that the probability that a country will start exporting a product increases significantly if that country shares a border with a neighbor that is already a successful exporter of that product, even after discounting the effects of product relatedness captured in the product space. At the regional level, Boschma et al. (2016) used data from the United States to show that regions are more likely to develop industries that are present in neighboring regions.
Acemoglu et al. (2015) studied the direct and spillover effects of local state capacity in Colombia, and found that spillover effects are sizable, accounting for about 50 percent of the quantitative impact of an expansion in local state capacity. At the firm level, Holmes (2011) studied the geographic expansion of Wal-Mart stores in the US, and found that new Wal-Mart stores tend to be located in close geographic proximity to regions where Wal-Mart already had a high density of stores.

Yet, one of the issues that limits this literature is the sparse causal evidence supporting both inter-industry learning and inter-regional learning. One effort in this direction is the work of Ellison et al. (2010), who explored data from US manufacturing industries to check the effect of the cost of moving goods, people, and ideas on the co-location of industries, i.e., inter-industry learning. To reduce concerns of reverse causality, Ellison et al. (2010) used data from UK industries and from US areas as instruments.

In this paper, we contribute to this expanding body of literature by studying the role of collective learning in the great economic expansion experienced by China between 1990 and 2015. First, we show that the probability that a new industry will grow in a province increases with the number of related industries present in it, a fact supporting theories of inter-industry learning. Next, we show that the probability that a new industry will grow in a province also increases with the number of neighboring provinces in which that industry is already present, a fact that supports inter-regional learning theories. Moreover, we find that both learning channels work together but exhibit diminishing returns, meaning that when one learning channel is sufficiently active (inter-industry or inter-regional) the marginal contribution of the other one is reduced (the channels are substitutes).
Finally, we address endogeneity concerns by using the introduction of high-speed rail among Chinese provinces to isolate the effects of inter-regional learning. The introduction of high-speed rail is an instrument that affects the travel time between provinces, but not the similarity among industries. Our differences-in-differences (DID) results show that, after the introduction of high-speed rail, the pairs of provinces connected by rail became more similar in terms of their industrial structure. Also, our results show a significant increase in productivity for industries located in provinces that became connected by high-speed rail to other provinces where that industry was present. Together, these results add to the evidence that China's economic expansion benefited from inter-industry and inter-regional learning.

2. Data

We use data from China's stock market extracted from the RESSET Financial Research Database, which is provided by Beijing Gildata RESSET Data Tech Co., Ltd., a leading provider of economic and financial data in China. Our data set covers 1990-2015, a period when China achieved rapid economic development. This data set provides basic registration and financial information on firms publicly listed in Chinese stock exchanges, such as listing date, delisting date, registered address, industry category, yearly revenue, and number of employees. Although the numbers of newly listed and delisted firms fluctuate from year to year, the overall number of firms increases almost linearly with time (see Figure S1). The registered addresses of firms cover 31 provinces in China. All listed firms in our data set are classified at two levels: 18 categories at the sector level and 70 subcategories at the sub-sector level. The classification is based on the "Guidelines for the Industry Classification of Listed Companies" issued by the China Securities Regulatory Commission (CSRC) in 2011 (see online Appendix for details). CSRC category and CSRC subcategory codes as well as their associated industry names are shown in Figure S2. Moreover, to measure the productivity of the industries in a province, we use the total revenue of firms divided by the total number of employees in that industry in that province.

2.2. Distance and macroeconomic indicators

To study inter-regional learning, we use geographic distance as a measure of physical proximity between regions. The geographic distance ($D_{i,j}$) between provinces $i$ and $j$ is defined as the distance between the capital cities of the two provinces (in China the capital city is likely to be the largest city in a province).
We also collect macroeconomic data at the province level, including Gross Domestic Product per capita (GDP per capita), resident population, total value of imports and exports, urban area, and total area, from the China Statistical Yearbook, which is published by the National Bureau of Statistics of China. As an urbanization metric, we use the share of urban area in a province. These macroeconomic indicators cover the 1990-2015 period and are available for the 31 provinces in China. Brief descriptions and summary statistics of these distance and macroeconomic indicators can be found in Table S1.
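The productivity measure described in this section (total revenue divided by total employees, per industry and province) can be sketched as a simple aggregation. The snippet below is a minimal illustration, not the authors' code: the record layout, the field names, and the toy figures are all assumptions made for the example.

```python
from collections import defaultdict


def productivity(firm_records):
    """Revenue per employee for each (province, industry) pair.

    firm_records is an iterable of dicts with the hypothetical keys
    "province", "industry", "revenue", and "employees".
    """
    revenue = defaultdict(float)
    employees = defaultdict(int)
    for rec in firm_records:
        key = (rec["province"], rec["industry"])
        revenue[key] += rec["revenue"]
        employees[key] += rec["employees"]
    # Pairs with no recorded employees are skipped to avoid dividing by zero.
    return {k: revenue[k] / employees[k] for k in revenue if employees[k] > 0}


# Toy records; provinces are real names, but codes and figures are made up.
records = [
    {"province": "Zhejiang", "industry": "38", "revenue": 900.0, "employees": 300},
    {"province": "Zhejiang", "industry": "38", "revenue": 300.0, "employees": 100},
    {"province": "Hebei", "industry": "26", "revenue": 500.0, "employees": 250},
]
result = productivity(records)
```

Summing revenue and employees before dividing gives the ratio of totals described in the text, rather than an average of firm-level ratios.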

3. Results

We organize our results into four sections. First, we explore inter-industry learning by constructing a network of related industries, or industry space, and explore how the probability that an industry will emerge in a province increases with the number of related industries already present in it. Next, we explore inter-regional learning by using geographic data to study how the probability that an industry will emerge in a province increases with the presence of that industry in neighboring provinces. Then, we combine both geographic and industrial similarity data to study the interaction between inter-industry and inter-regional learning. Our results show that learning exhibits diminishing returns, meaning that when one learning channel (inter-regional or inter-industry) is sufficiently active, the other channel does not contribute as much (they are substitutes). Finally, we use the introduction of high-speed rail between provinces as an instrument to gather evidence in support of the inter-regional learning hypothesis. Our differences-in-differences (DID) estimate shows that the provinces connected by high-speed rail experienced a significant increase in their industrial similarity, and also that industries located in provinces that were connected by rail increased their productivity when these rail connections linked them to other provinces where the same industry was present.

3.1. Inter-industry learning

We explore how the probability that an industry will appear in a province is affected by the number of related industries already present in it. We first construct a network of industries, or industry space, and then use this network to see whether the probability that an industry will emerge in a province increases with the number of related industries that are already present in it.

First, we connect provinces and industries by building a "province-industry" bipartite network, where the weight of link $x_{i,\alpha}$ is the number of firms in province $i$ that operate in industry $\alpha$ (see Figure S3 for illustration). Going forward, we use Greek letters for indices indicating industries and Roman letters for indices indicating provinces.

Next, we estimate the proximity $\phi_{\alpha,\beta}$ between industries $\alpha$ and $\beta$ by calculating the cosine similarity between $x_{i,\alpha}$ and $x_{i,\beta}$ across all provinces. Following Hidalgo et al. (2007), we assume the co-location of industries to be an imperfect proxy of their similarity, since pairs of industries that tend to co-locate are more likely to require similar capabilities (whether these are human capital, institutional factors, logistic facilities, or geographic resources) than pairs of industries that do not tend to be co-located. Formally, we let $x_{i,\alpha,t}$ and $x_{i,\beta,t}$ be the number of firms in province $i$ that respectively operate in industries $\alpha$ and $\beta$ at year $t$. Then, the proximity $\phi_{\alpha,\beta,t}$ is given by:

$$\phi_{\alpha,\beta,t} = \frac{\sum_i x_{i,\alpha,t}\, x_{i,\beta,t}}{\sqrt{\sum_i (x_{i,\alpha,t})^2}\, \sqrt{\sum_i (x_{i,\beta,t})^2}}. \tag{1}$$

Figure 1 shows China's industry space for the year 2015 (see Figure S4 and online Appendix for details on the visualization methods used). We note that China's industry space exhibits both a core-periphery and a dumbbell structure, with a tightly knit core of manufacturing industries (on the left), and another tightly knit core of service and information related activities (on the right).
This dumbbell structure is also visible when looking at the hierarchically clustered matrix of industrial proximities (see Figure S5). In agreement with previous findings, which used data on products instead of industries (Hidalgo et al., 2007), we find extractive and agricultural activities to occupy the periphery of the industry space.

Figure 1: Network representation of China's industry space in 2015. Nodes (circles) represent industries. Links connect industries that are likely to locate in the same province. Nodes are classified into 70 subcategories and colored according to 18 sectors (Agriculture, Mining, Manufacturing, Electricity, Construction, Retail, Transport, Catering, IT, Finance, Estate, Leasing, Scientific Research, Utility Management, Resident Services, Health Work, Entertainment, Diversified Industries). The size of each node is proportional to the number of firms in that industry. The color and weight of links correspond to the proximity value ($\phi$) between two industries (legend bins: > 0.90, > 0.85, > 0.80, < 0.80).

Next, we look at how the structure of the industry space shapes the economic diversification paths of Chinese provinces using three methods: a network visualization, a graphical method, and a multivariate statistical model.

First, we define an industry to be present in a province if that province has revealed comparative advantage in that industry. We define the revealed comparative advantage $RCA_{i,\alpha,t}$ for province $i$ in industry $\alpha$ at year $t$ following Balassa (1965). That is, we use the ratio between the observed number of firms operating in industry $\alpha$ in province $i$ and the expected number of firms of that industry in that province. Formally, the revealed comparative advantage $RCA_{i,\alpha,t}$ is given by:

$$RCA_{i,\alpha,t} = \frac{x_{i,\alpha,t}}{\sum_\alpha x_{i,\alpha,t}} \Bigg/ \frac{\sum_i x_{i,\alpha,t}}{\sum_\alpha \sum_i x_{i,\alpha,t}}, \tag{2}$$

where $x_{i,\alpha,t}$ is the number of firms in province $i$ that operate in industry $\alpha$ at year $t$. We say industry $\alpha$ is present in province $i$ at year $t$ if $RCA_{i,\alpha,t} \geq 1$.
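To make Eqs. (1) and (2) concrete, the following minimal sketch computes the revealed comparative advantage matrix and the cosine proximity between two industries for a single year. It is an illustration only: the function names, the treatment of empty rows or columns, and the toy firm counts are assumptions, not part of the paper.

```python
import math


def rca(x):
    """Revealed comparative advantage, Eq. (2), for one year of data.

    x[i][a] is the number of firms in province i operating in industry a.
    """
    province_totals = [sum(row) for row in x]
    industry_totals = [sum(col) for col in zip(*x)]
    grand_total = sum(province_totals)
    out = []
    for i, row in enumerate(x):
        out_row = []
        for a, firms in enumerate(row):
            if province_totals[i] == 0 or industry_totals[a] == 0:
                out_row.append(0.0)  # assumption: empty rows/columns get RCA 0
                continue
            share_in_province = firms / province_totals[i]
            share_overall = industry_totals[a] / grand_total
            out_row.append(share_in_province / share_overall)
        out.append(out_row)
    return out


def proximity(x, a, b):
    """Cosine similarity between the columns of industries a and b, Eq. (1)."""
    col_a = [row[a] for row in x]
    col_b = [row[b] for row in x]
    dot = sum(p * q for p, q in zip(col_a, col_b))
    norm = math.sqrt(sum(p * p for p in col_a)) * math.sqrt(sum(q * q for q in col_b))
    return dot / norm if norm > 0 else 0.0


# Toy data: 3 provinces (rows) x 3 industries (columns); counts are made up.
firms = [
    [4, 2, 0],
    [2, 1, 1],
    [0, 0, 6],
]
rca_matrix = rca(firms)
# An industry counts as present in a province when RCA >= 1.
present = [[value >= 1 for value in row] for row in rca_matrix]
```

In this toy matrix the first two industries are spread across provinces in the same proportions, so their proximity is 1, while the third industry, concentrated in the last province, is only weakly related to them.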

Figure 2: Evolution of China's provincial industrial structure between 1992 and 2015. The four illustrated provinces are Beijing, Hebei, Shanghai, and Zhejiang. Black circles indicate industries in which a province has revealed comparative advantage ($RCA \geq 1$).

[...] of economic development.

Next, we formalize this observation by constructing an indicator for each industry and province, counting the number of related industries that are already present in that province (i.e., $RCA \geq 1$). [...] density (Hidalgo et al., 2007; Boschma et al., 2013, 2016). Here, to avoid confusion with a similar indicator we will introduce later for neighboring provinces, we call this estimator the density of active related industries ($\omega$).

Figure 3: Inter-industry learning. (A) Distribution of the density of active related industries for each pair of provinces and industries (horizontal axis: density of active related industries; vertical axis: frequency). The pink distribution focuses only on pairs of provinces and industries that developed revealed comparative advantage in the next five years. The blue distribution is for the pairs of industries and provinces that did not develop revealed comparative advantage. The mean of the pink distribution is significantly larger than that of the blue distribution (ANOVA p-value = [...]). (B) Probability that a new industry will appear in a province in five years as a function of the density of active related industries ($\omega$). Bars indicate average values and error bars indicate standard errors. Results show averages for 2001-2015 using five-year intervals. In all calculations, densities were calculated for the base year.

Formally, the density of active related industries ($\omega_{i,\alpha,t}$) for industry $\alpha$ in province $i$ at year $t$ is given by:

$$\omega_{i,\alpha,t} = \frac{\sum_\beta \phi_{\alpha,\beta,t}\, U_{i,\beta,t}}{\sum_\beta \phi_{\alpha,\beta,t}}, \tag{3}$$

where $U_{i,\beta,t}$ takes the value of 1 if province $i$ has revealed comparative advantage in industry $\beta$ at year $t$ (i.e., $RCA_{i,\beta,t} \geq 1$) and 0 otherwise. Density simply tells us, for each industry, the proximity-weighted fraction of related industries that are already present in that province.

Next, we look at the probability that industry $\alpha$ would appear in province $i$ as a function of the density of active related industries in that province. To reduce noise, we follow Bahar et al. (2014) and restrict the appearance of new industries to two conditions: a backward condition, which requires an industry to have RCA below 1 during the two years prior to the beginning of the period; and a forward condition, which requires an industry to sustain RCA above 1 for the two years after the end of the period.

Figure 3A shows the frequency of densities of active related industries for pairs of industries and provinces that developed revealed comparative advantage (in pink) and that did not develop revealed comparative advantage (in blue) in a five-year period. The distributions show that, on average, the density of an industry in the provinces that developed revealed comparative advantage in that industry five years later is significantly larger (ANOVA p-value = [...]) than in those that did not (see Figure S6 for an additional robustness check).

Figure 3B looks at the probability that a province will develop revealed comparative advantage in an industry as a function of the density of active related industries in that province five years earlier. The increasing and convex relationship shows that the probability that an industry will develop revealed comparative advantage in a province increases strongly with the density of active related industries. To reduce noise, we use a fixed industry space ($\phi_{\alpha,\beta}$) in year 2015 in Eq.
(3), but we note that our results are robust (see Figure S7) when we use a time-varying industryspace ( φ α,β, t ), where the industrial proximity is calculated using data only from previous years.Finally, we use a multivariate probit model to estimate how the probability that a province will develop revealedcomparative advantage, or keep revealed comparative advantage in an industry, changes with the density of activerelated industries. We separate our dataset into two sets: one set containing all province-industry pairs that did not haverevealed comparative advantage (that could potentially be developed), and another set containing all pairs of provincesand industries that had comparative advantage (and that could lose it). Then, we set up two probit regressions, oneexplaining the probability that a province without RCA in an industry will develop RCA in that industry in the nextﬁve years, and the other explaining the probability that a province with RCA in an industry will keep RCA in thatindustry. In both of these regressions we control for the number of provinces with revealed comparative advantagein that industry and the number of industries with revealed comparative advantage in that province. Our empiricalspeciﬁcation is: U i ,α, t + = β + β ω i ,α, t + β M α, t + β N i , t + µ t + ε i ,α, t , (4)where U i ,α, t + ( U i ,α, t ) takes the value of 1 if RCA i ,β, t + ≥ RCA i ,β, t ≥

1) and 0 otherwise, ω i ,α, t is the density of activerelated industries for industry α in province i at year t , M α, t = P i U i ,α, t is the number of provinces where that industryhas revealed comparative advantage, N i , t = P α U i ,α, t is the number of industries with revealed comparative advantagein that province, and ε i ,α, t is the error term. The regression equation includes the year-ﬁxed e ﬀ ects, µ t , to control forany time-varying characteristics of provinces and industries.The regression coe ﬃ cient β captures the impact of the density of active related industries in the probability ofdeveloping revealed comparative advantage in a new industry (see columns (1)-(3) of Table 1) and in the probability12 able 1: Probit regressions for inter-industry learning.Independent Variables Probit ModelDeveloping RCA in a Five-year Period Keeping RCA in a Five-year Period(1) (2) (3) (4) (5) (6)Density of Active Related Industries 3.8844*** 4.2084*** 11.510*** -0.5753** -1.7392*** 15.266***(0.1622) (0.1661) (0.3826) (0.2266) (0.2665) (1.1658)Number of Active Provinces in Industry 0.0559*** 0.0624*** -0.0740*** -0.0834***(0.0028) (0.0029) (0.0059) (0.0062)Number of Active Industries in Province -0.1348*** -0.3101***(0.0063) (0.0193)Observations 25713 25713 25713 6837 6837 6837Pseudo R Notes : Probit regressions modeling the probability of developing a new industry, or keeping an industry, in a Chinese province, as a function of thedensity of active related industries in a province, the number of provinces active in an industry, and the number of industries active in a province.Data is for the 2001-2015 period. Probit regressions include year-ﬁxed e ﬀ ects. Signiﬁcant level: ∗ p < . ∗ ∗ p < .

05, and ∗ ∗ ∗ p < . of keeping revealed comparative advantage in an industry (see columns (4)-(6) of Table 1). In all speciﬁcations weﬁnd the density of active related industries to be a strong, positive, and signiﬁcant predictor of both, the probability ofdeveloping a new industry and keeping an industry in a Chinese province. In all cases, by controlling for the numberof active industries in a province and the number of provinces that are active in an industry we show that our ﬁndingsare not just a reﬂection of the industrial diversity of a province or the ubiquity of an industry. Next we explore our data in search for evidence in support of inter-regional learning. Once again, we divide ouranalysis into three sections, a data visualization (for illustrative purposes), a graphical method, and a multivariatestatistical model.Figure 4 shows the spatial evolution of the presence of industries in Chinese provinces using data on the revealedcomparative advantage of four industries (Chemical Products Manufacturing, Pharmaceuticals, Electric MachineryManufacturing, and Wholesale) in each province between 1992 and 2015 (see Figure S8 for an equivalent chart usingthe number of ﬁrms). The saturation of the color indicates the natural logarithm of the revealed comparative advantageof that province in that industry ( ln ( RCA +

Figure 4: Evolution of revealed comparative advantage of provinces in China between 1992 and 2015. The four illustrated industries are Chemical Products Manufacturing (26), Pharmaceuticals (27), Electric Machinery Manufacturing (38), and Wholesale (51); the label keys correspond to Figure S2. The saturation of the color indicates the value of ln(RCA + 1).

To test whether neighboring provinces have a more similar industrial structure, we measure the industrial similarity of a pair of provinces using the cosine similarity of the vectors summarizing the revealed comparative advantage of industries in each province. Formally, let $y_{i,\alpha,t} = \ln(RCA_{i,\alpha,t} + 1)$ and $y_{j,\alpha,t} = \ln(RCA_{j,\alpha,t} + 1)$. The industrial similarity $\phi_{i,j,t}$ between provinces $i$ and $j$ at year $t$ is given by:

$$\phi_{i,j,t} = \frac{\sum_\alpha y_{i,\alpha,t}\, y_{j,\alpha,t}}{\sqrt{\sum_\alpha (y_{i,\alpha,t})^2}\, \sqrt{\sum_\alpha (y_{j,\alpha,t})^2}}. \tag{5}$$
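As an illustration of Eq. (5), the following minimal sketch computes the cosine similarity of the ln(RCA + 1) vectors of two provinces. The function name, the zero-vector convention, and the RCA values are assumptions made for this example; they are not taken from the paper's data or code.

```python
import math


def industrial_similarity(rca_i, rca_j):
    """Cosine similarity between the ln(RCA + 1) vectors of two provinces."""
    if len(rca_i) != len(rca_j):
        raise ValueError("both provinces must cover the same industries")
    y_i = [math.log(r + 1.0) for r in rca_i]
    y_j = [math.log(r + 1.0) for r in rca_j]
    dot = sum(a * b for a, b in zip(y_i, y_j))
    norm_i = math.sqrt(sum(a * a for a in y_i))
    norm_j = math.sqrt(sum(b * b for b in y_j))
    if norm_i == 0.0 or norm_j == 0.0:
        # Assumption: similarity is set to 0 when a province has RCA = 0 everywhere.
        return 0.0
    return dot / (norm_i * norm_j)


# Hypothetical RCA values for four industries (not from the paper's data).
province_a = [2.0, 0.5, 0.0, 1.2]
province_b = [1.8, 0.4, 0.1, 1.0]
province_c = [0.0, 0.1, 3.0, 0.2]

print(industrial_similarity(province_a, province_b))  # similar structures
print(industrial_similarity(province_a, province_c))  # different structures
```

Because ln(RCA + 1) is never negative, the similarity computed this way always lies between 0 and 1.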


Figure 5: (A) Distribution of industrial similarity between pairs of neighboring provinces (in pink) and non-neighboring provinces (in blue). The red and blue curves are, respectively, normal fits for the distributions of neighboring and non-neighboring province pairs. (B) Industrial similarity between all pairs of provinces as a function of their geographic distance (km). Bars correspond to the average industrial similarity ($\phi$) of pairs of provinces at that distance and error bars correspond to standard errors. The blue dashed line represents a linear fit of the unbinned data. Pearson's correlation between industrial similarity and geographic distance is r = -0.32.

Figure 5A shows the distribution of the industrial similarities ($\phi_{i,j}$) in 2015 for both pairs of neighboring provinces (in pink) and pairs of non-neighboring provinces (in blue). We find that the industrial similarity of neighboring provinces is significantly larger than that of non-neighboring provinces (ANOVA). Figure 5B shows the industrial similarity ($\phi_{i,j}$) as a function of geographic distance ($D_{i,j}$). Once again, we see that pairs of provinces in close physical proximity tend to be more similar than distant pairs of provinces (see Figure S9 for equivalent charts using other distance and travel time measures).

Next, we formalize these observations by constructing an indicator, for each province, of the number of neighboring provinces that have developed revealed comparative advantage in each industry. We call this estimator the density of active neighboring provinces ($\Omega$). For province $i$ in industry $\alpha$ at year $t$, the density of active neighboring provinces $\Omega_{i,\alpha,t}$ is given by:

$$\Omega_{i,\alpha,t} = \frac{\sum_j U_{j,\alpha,t} / D_{i,j}}{\sum_j 1 / D_{i,j}}, \tag{6}$$

where $D_{i,j}$ is the geographic distance between provinces $i$ and $j$, and the binary variable $U_{j,\alpha,t}$ takes the value of 1 if $RCA_{j,\alpha,t} \geq 1$ and 0 otherwise.
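The density in Eq. (6) can be sketched as an inverse-distance-weighted share of active provinces. The sketch below is a minimal illustration under stated assumptions: the sum runs over all provinces other than the focal one, distances are stored in a symmetric lookup keyed by province pairs, and the province names, RCA values, and distances are hypothetical.

```python
def is_active(rca):
    """U = 1 if the province has RCA >= 1 in the industry, else 0."""
    return 1 if rca >= 1.0 else 0


def neighbor_density(province, rca_by_province, distance_km):
    """Inverse-distance-weighted share of other provinces active in one industry."""
    numerator = 0.0
    denominator = 0.0
    for other, rca in rca_by_province.items():
        if other == province:
            continue  # assumption: the focal province is excluded from the sum
        weight = 1.0 / distance_km[frozenset((province, other))]
        numerator += is_active(rca) * weight
        denominator += weight
    return numerator / denominator


# Hypothetical RCA values for one industry and hypothetical distances in km.
rca = {"A": 0.3, "B": 1.4, "C": 0.6}
dist = {
    frozenset(("A", "B")): 100.0,
    frozenset(("A", "C")): 400.0,
    frozenset(("B", "C")): 350.0,
}

for name in rca:
    print(name, round(neighbor_density(name, rca, dist), 3))
```

In this toy case province A has one active neighbor nearby (B) and one inactive neighbor farther away (C), so its density is much closer to 1 than a simple count of active neighbors (one out of two) would suggest.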


Figure 6: Inter-regional learning. (A) Distribution of the density of active neighboring provinces for each pair of provinces and industries. The pink distribution focuses only on pairs of provinces and industries that developed revealed comparative advantage in the next five years. The blue distribution is for the pairs of industries and provinces that did not develop revealed comparative advantage. The mean of the pink distribution is significantly larger than that of the blue distribution (ANOVA). (B) Probability of a province developing comparative advantage in an industry as a function of the density of active neighboring provinces five years earlier. Bars indicate average values and error bars indicate standard errors. Results show averages for 2001-2015 using five-year intervals.

Once again, we use the density estimator ($\Omega$) to explore whether the presence of an industry in neighboring provinces increases the probability that this industry will appear in a province in the future. To perform this analysis, we estimate the density of active neighboring provinces ($\Omega$) for each province and industry in a base year and look at the new industries that appear in that province five years later. To reduce noise, we follow Bahar et al. (2014) and restrict the presence of new industries to two conditions: a backward condition, requiring an industry to have an RCA below 1 for two years before the beginning of the period; and a forward condition, requiring an industry to be present with RCA above 1 for two years after the end of the period.

Figure 6A compares the distribution of densities ($\Omega$) for industry-province pairs that developed revealed comparative advantage in an industry in a five-year period (in pink) and those that did not (in blue). We find that the average density of the province-industry pairs that developed revealed comparative advantage in a five-year period is significantly larger than that of the province-industry pairs that did not (ANOVA).

Figure 6B shows the probability that a province will develop revealed comparative advantage in an industry as a function of the density of active neighboring provinces ($\Omega$). Once again, we find an increasing and convex relationship, showing that the probability that a province will develop revealed comparative advantage in an industry increases strongly with the fraction of active neighboring provinces in that industry. These results are robust (see Figure S10) when we use other distance metrics in Eq. (6).

Finally, we use a multivariate probit model to estimate how the probability that a province will develop revealed comparative advantage, or keep revealed comparative advantage in an industry, is affected by the number of active neighboring provinces. We use this model to control for the number of industries in which that province already has revealed comparative advantage, and the number of provinces that already have comparative advantage in that industry. We estimate the following empirical specification:

$$U_{i,\alpha,t+5} = \beta_0 + \beta_1 \Omega_{i,\alpha,t} + \beta_2 N_{i,t} + \beta_3 M_{\alpha,t} + \mu_t + \varepsilon_{i,\alpha,t}, \tag{7}$$

where $\Omega_{i,\alpha,t}$ is the density of active neighboring provinces for industry $\alpha$ and province $i$ at year $t$, and all other variables are defined as in Eq. (4).

Table 2 presents the results of our probit regressions. Once again, we divide our dataset into two sets: one containing all pairs of provinces and industries that do not have revealed comparative advantage (which we use to predict the ones that will develop RCA), and the other with all province-industry pairs with revealed comparative advantage (which we use to predict the ones that can sustain RCA).

Table 2: Probit regressions for inter-regional learning. Columns (1)-(3): developing RCA in a five-year period. Columns (4)-(6): keeping RCA in a five-year period.

| Independent variables | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Density of Active Neighboring Provinces | 1.5393*** (0.0781) | 1.5621*** (0.0782) | 1.6969*** (0.2116) | -1.4079*** (0.1317) | -1.7160*** (0.1332) | 0.5836* (0.3311) |
| Number of Active Industries in Province | | 0.0404*** (0.0025) | 0.0402*** (0.0025) | | -0.0555*** (0.0049) | -0.0660*** (0.0051) |
| Number of Active Provinces in Industry | | | -0.0053 (0.0075) | | | -0.1045*** (0.0132) |
| Observations | 25713 | 25713 | 25713 | 6837 | 6837 | 6837 |
| Pseudo R² | | | | | | |

Notes: Probit regressions modeling the probability of developing a new industry, or keeping an industry, in a Chinese province, as a function of the density of active neighboring provinces in an industry, the number of industries active in a province, and the number of provinces active in an industry. Data are for the 2001-2015 period. Probit regressions include year fixed effects. Significance levels: * p < 0.1, ** p < 0.05, and *** p < 0.01.

Columns (1)-(3) of Table 2 show that the density of active neighboring provinces is a positive and significant predictor of the industries that a province will develop in the future, suggesting that provinces are more likely to develop an industry when they have neighbors that are competitive in that industry. The effect of active neighboring provinces on sustaining RCA in an industry, however, is not as clear (see columns (4)-(6) of Table 2). The bivariate effect is negative, and it becomes positive, though only marginally significant, once both controls are included. We interpret this as evidence of a tension between competition and learning, since an active neighboring province is a source of learning when a province does not have an industry, but it is also a source of competition when that province has developed that industry. In all cases, by controlling for the number of active industries in a province and the number of provinces that are active in an industry, we show that our findings are not just a reflection of the industrial diversity of a province or the ubiquity of an industry.

Figure 7: Joint probability of a province developing revealed comparative advantage in a new industry in a five-year period given the density of active neighboring provinces ($\Omega$) on the horizontal axis and the density of active related industries ($\omega$) on the vertical axis.

In the previous two sections we provided evidence supporting inter-regional and inter-industry learning in China's economic development. But do inter-regional and inter-industry learning work together? Or are they substitutes? In this section we combine both channels using graphical statistical methods and multivariate statistical models.

First, we calculate the joint probability that a new industry will emerge in a province as a function of both the density of active neighboring provinces ($\Omega$) and the density of active related industries ($\omega$). All filters and definitions are equivalent to those used in the previous two sections.
In agreement with our previous results, in Figure 7 we find that the probability that an industry will appear in a province in a five-year period increases with both the density of active neighboring provinces ($\Omega$, horizontal axis) and the density of active related industries ($\omega$, vertical axis). The result is robust when using other density measures (see Figure S11).

To explore the interaction between these two learning channels we use a probit model where the dependent variable $J_{i,\alpha,t+5}$ indicates whether province $i$ developed comparative advantage in industry $\alpha$. Once again we consider a backward and a forward condition to reduce noise. Formally, $J_{i,\alpha,t+5} = 1$ if $U_{i,\alpha,t} = U_{i,\alpha,t-1} = U_{i,\alpha,t-2} = 0$ and $U_{i,\alpha,t+5} = U_{i,\alpha,t+6} = U_{i,\alpha,t+7} = 1$. The empirical specification is given by

$$J_{i,\alpha,t+5} = \beta_0 + \beta_1 \Omega_{i,\alpha,t} + \beta_2 \omega_{i,\alpha,t} + \beta_3 \Omega_{i,\alpha,t}\, \omega_{i,\alpha,t} + \mu_t + \varepsilon_{i,\alpha,t}, \tag{8}$$

where $\Omega_{i,\alpha,t}$ is the density of active neighboring provinces, $\omega_{i,\alpha,t}$ is the density of active related industries, $\Omega_{i,\alpha,t}\, \omega_{i,\alpha,t}$ is the interaction term of the two densities, $\mu_t$ are the year fixed effects, and $\varepsilon_{i,\alpha,t}$ is the error term.

Table 3 presents the results of the probit regressions (see Table S3 for summary statistics of regression variables). Column (1) shows the basic regression considering the density of active neighboring provinces ($\Omega$) and the density of active related industries ($\omega$). We find both effects are jointly significant. Column (2) adds an interaction term between the two densities ($\Omega\omega$). Once we add the interaction term, the individual coefficients for both densities ($\Omega$ and $\omega$) increase, while the interaction term is negative and significant. This indicates the presence of diminishing returns, meaning that the partial effect of each learning channel is reduced when the second channel is present. That is, when one learning channel is sufficiently active (inter-industry or inter-regional), the marginal contribution of the other one is reduced. Overall, we find that inter-industry learning has a slightly stronger effect in activating new industries, as suggested by its larger regression coefficient.

To check the robustness of our results we consider alternative definitions for both the density of active neighboring provinces ($\Omega$) and the density of active related industries ($\omega$). In columns (3) and (4) of Table 3 we repeat the exercise using simply the ratio of active neighboring provinces and the ratio of active related industries as independent variables (see online Appendix for details). This is equivalent to calculating both densities ($\Omega$ and $\omega$) using simple proportions instead of weighted averages.
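To make the diminishing-returns reading of the interaction term concrete, the short sketch below evaluates the slope of the probit index in Eq. (8) with respect to each density, using the three coefficients reported in column (2) of Table 3. It is only an illustration: the function names are ours, and the slopes refer to the latent index rather than to probabilities. The marginal effect on the probability equals this slope multiplied by a positive normal density term, so it has the same sign, but its size depends on the intercept and year fixed effects, which are not reported.

```python
# Coefficients from column (2) of Table 3.
BETA_NEIGHBORS = 4.3405      # density of active neighboring provinces (Omega)
BETA_RELATED = 6.2435        # density of active related industries (omega)
BETA_INTERACTION = -11.8437  # interaction term (Omega * omega)


def index_slope_neighbors(related_density):
    """Slope of the probit index with respect to Omega at a given omega."""
    return BETA_NEIGHBORS + BETA_INTERACTION * related_density


def index_slope_related(neighbor_density):
    """Slope of the probit index with respect to omega at a given Omega."""
    return BETA_RELATED + BETA_INTERACTION * neighbor_density


for density in (0.0, 0.1, 0.2, 0.3):
    print(
        f"other density = {density:.1f}: "
        f"slope wrt Omega = {index_slope_neighbors(density):.4f}, "
        f"slope wrt omega = {index_slope_related(density):.4f}"
    )
```

Each slope falls linearly as the other density rises, which is the sense in which one channel contributes less when the other is already active.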
Once again, we find both effects are significant and there are diminishing returns to the addition of an alternative learning channel. Column (5) uses just the number of active neighboring provinces and the number of active related industries, instead of densities or ratios. We confirm the same results, although the explanatory power of this model is smaller than the one involving densities, meaning that the use of weighted averages to calculate densities contributes relevant information. Finally, column (6) presents a negative control: a model using the number of neighboring provinces and the number of related industries, no matter whether these are active or not. In this case, the model loses almost all its explanatory power and the effects are small, meaning that our results come from having active neighboring provinces and active related industries, and not from just having many neighboring provinces or many related industries.

Table 3: Interaction between inter-industry learning and inter-regional learning. Probit models for new industries in a five-year period, using both densities and their alternative definitions.

| Independent variables | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Density of Active Neighboring Provinces | 1.3092*** (0.0807) | 4.3405*** (0.2421) | | | | |
| Density of Active Related Industries | 3.7163*** (0.1713) | 6.2435*** (0.2616) | | | | |
| Interaction Term 1 | | -11.8437*** (0.9136) | | | | |
| Ratio of Active Neighboring Provinces | | | 0.5474*** (0.0499) | 0.6643*** (0.0678) | | |
| Ratio of Active Related Industries | | | 0.7802*** (0.0374) | 0.8502*** (0.0472) | | |
| Interaction Term 2 | | | | -0.3701** (0.1572) | | |
| Number of Active Neighboring Provinces | | | | | 0.1739*** (0.0150) | |
| Number of Active Related Industries | | | | | 0.2093*** (0.0125) | |
| Interaction Term 3 | | | | | -0.0414*** (0.0065) | |
| Number of Neighboring Provinces | | | | | | 0.0049 (0.0103) |
| Number of Related Industries | | | | | | 0.0178** (0.0090) |
| Interaction Term 4 | | | | | | 0.0010 (0.0019) |
| Observations | 25713 | 25713 | 25713 | 25713 | 25713 | 25713 |
| Pseudo R² | | | | | | |

Notes: The regressions consider the effects of both inter-regional learning and inter-industry learning. Data are for the 2001-2015 period. The probit regressions include year fixed effects. Significance levels: * p < 0.1, ** p < 0.05, and *** p < 0.01.

Finally, we study how the introduction of high-speed rail affected inter-regional learning using a differences-in-differences (DID) analysis. The introduction of high-speed rail is an adequate instrument because it reduces the barriers to inter-regional learning but should not affect inter-industry learning. In this section we check the effects of inter-regional learning first in terms of industrial similarity (measured by looking at the set of industries present in a province), and second, in terms of productivity (by looking at the increase in productivity of industries in provinces connected by high-speed rail).

During China's great economic expansion commercial train service was improved through several "speed-up" campaigns. These took the speed of trains from an average of only 48 km/h (in the 1990s) to more than 300 km/h in the best cases (Jiao et al., 2014).
By 2015, over 90 Chinese cities were connected by high-speed rail (Lin et al., 2015), and as of September 2016, China had the world's longest high-speed rail network, with over 20,000 km of track, more than the rest of the world's high-speed rail tracks combined (Cao et al., 2013).

The introduction of high-speed rail reduced travel time among provinces, encouraging face-to-face interactions and potentially promoting learning among provinces (Zheng and Kahn, 2013). Face-to-face interactions are considered important for learning, since they are an effective way to build trust and to share complex ideas, even in the era of online communication technologies (Storper and Venables, 2004). New transport links, or reductions in transportation costs, have been used in the past as instruments to test the effect of cost on social interactions. For instance, Catalini et al. (2016) used the introduction of Southwest Airlines, a discount airline in the U.S., to test whether reductions in ticket prices of direct flights between U.S. cities increased collaboration among scholars from the universities connected by these cheaper flights.

Similarly to Catalini et al. (2016), we address endogeneity concerns using the differences-in-differences (DID) method and the introduction of high-speed rail as an instrument. Because the introduction of high-speed rail does not affect the similarity and productivity of industries within a province, this instrument helps us isolate the effects of inter-regional learning from inter-industry learning.

The DID method requires two groups: treatment and control. In our DID analysis, pairs of provinces belong to the treatment group if they are connected by high-speed rail in 2015; otherwise they belong to the control group.
Although there were many rounds of "speed-up" campaigns, we consider only the period between 2004 and 2014, since this allows us to capture the construction of numerous railroads (and hence obtain a larger sample), instead of observing just a few. The introduction of high-speed rail was identified using the Google Maps API in 2015, considering the accessibility between capital cities of provinces through high-speed rail passenger trains (see online Appendix for details).

We justify the use of the DID method as an identification strategy using two observations. On the one hand, the construction of high-speed rail between provinces should be close to random (Qin, 2016), at least with respect to the dependent variables (industrial similarity and the productivity of firms in our case), and with respect to province-level characteristics such as the levels of economic development and urbanization (Bertrand et al., 2004; Besley and Case, 2000). This is because the construction of high-speed rail was driven by political reasons (and not by the aim of connecting provinces with similar industrial structures). For example, the "Go West" plan connected the coast with China's Far West. There is also the "Silk Road Economic Belt" plan (Albalate and Bel, 2012; Rolland, 2015), and plans to connect China with South East Asia (Garver, 2006). The construction of rail, therefore, can be seen as a quasi-experiment (Catalini et al., 2016; Qin, 2016). In fact, Qin (2016) pointed out that the introduction of high-speed rail in China can be treated as a quasi-natural experiment because most new high-speed rail services were implemented on existing railway lines instead of new railways.

On the other hand, the DID method is justified when the pre-trend of the dependent variable in the control and treatment groups is similar. Our data satisfy this condition prior to 2005 (see Figure 8A for industrial similarity).
To demonstrate this, we perform an event study by running the following ordinary least-squares (OLS) linear regression using data between 1997 and 2015 to predict the industrial similarity between provinces $i$ and $j$ in each year:

$$\phi_{i,j,t} = \beta_0 + \sum_k \beta_k \left( Treat_{i,j} \times \mathbb{1}\{t = k\} \right) + \varepsilon_{i,j}. \tag{9}$$

Here $Treat_{i,j}$ is a dummy variable denoting whether provinces $i$ and $j$ are connected by high-speed rail and $\mathbb{1}\{t = k\}$ is an event time indicator, which is equal to 1 for the year where the pair was connected by high-speed rail. In other words, Eq. (9) regresses the industrial similarity between pairs of provinces on whether there is high-speed rail connecting them. Larger regression coefficients ($\beta_k$) tell us that the industrial similarity of the pairs of provinces connected by high-speed rail increased with respect to those that remained unconnected.

The results of Eq. (9) are shown in Figure 8A. Before the introduction of high-speed rail (1997-2005), there is no temporal trend in $\beta_k$. After the introduction of high-speed rail (2005-2015), the effect of the treatment ($\beta_k$) begins to increase significantly, meaning that the treated provinces grew more similar after high-speed rail was introduced (see Figure S12 for an additional robustness check).
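A rough sketch of how the coefficients in Eq. (9) could be obtained is given below. It relies on one reading of the equation, which is an assumption on our part: the regression contains only an intercept and one Treat × year dummy per sample year. Under that reading the regressors are mutually exclusive dummies, so OLS reduces to group means: the intercept is the pooled mean similarity of control pairs and each $\beta_k$ is the mean similarity of treated pairs in year $k$ minus that intercept. The function name and the data are hypothetical.

```python
from collections import defaultdict


def event_study_coefficients(rows):
    """Return (beta_0, {year: beta_k}) from (year, treated, similarity) rows.

    With an intercept plus one Treat x year dummy per year, OLS fits cell means:
    beta_0 is the pooled control mean and beta_k is the treated mean in year k
    minus beta_0.
    """
    rows = list(rows)
    control = [value for _, treated, value in rows if not treated]
    beta_0 = sum(control) / len(control)
    treated_by_year = defaultdict(list)
    for year, treated, value in rows:
        if treated:
            treated_by_year[year].append(value)
    betas = {
        year: sum(values) / len(values) - beta_0
        for year, values in sorted(treated_by_year.items())
    }
    return beta_0, betas


# Hypothetical similarity values for province pairs (not from the paper's data).
sample = [
    (2004, False, 0.30), (2014, False, 0.32),
    (2004, True, 0.35), (2014, True, 0.42),
]
beta_0, betas = event_study_coefficients(sample)
print(beta_0, betas)
```

A flat sequence of $\beta_k$ before the rail link followed by a rising sequence afterwards is the pattern described for Figure 8A.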


Figure 8: Industrial similarity and the introduction of high-speed rail. (A) Event study results. The y-axis shows the regression coefficient ($\beta_k$ in Eq. (9)) as a function of the year, after regressing the industrial similarity of pairs of provinces that were eventually connected by high-speed rail against the entry of high-speed rail. Red lines are linear fits for 1997-2005 and 2005-2015. (B) Differences-in-differences (DID) results. The y-axis is the average industrial similarity of all pairs of provinces connected by high-speed rail (in red) or not connected by high-speed rail (in blue). The value of DID (in green) is 0.029, and it is statistically significant. Vertical dashed lines mark the years after speed-up campaigns, beside which the approximate average speeds of high-speed rail are shown.

Next, we validate these results using differences-in-differences and the following specification:

$$\phi_{i,j,t} = \beta_0 + \beta_1 (Treat_{i,j} \times After_t) + \beta_2 Treat_{i,j} + \beta_3 After_t + A X' + \varepsilon_{i,j}. \tag{10}$$

Here, $\phi_{i,j,t}$ is the industrial similarity between provinces $i$ and $j$ at year $t$, and $\varepsilon_{i,j}$ is the error term. $Treat_{i,j} \times After_t$ is the DID term, where the dummy $Treat_{i,j}$ denotes whether provinces $i$ and $j$ are affected by the introduction of high-speed rail and the dummy $After_t$ denotes whether year $t$ falls after high-speed rail entry. The vector $X$ denotes other control variables, which include gravity considerations, such as differences in population, GDP per capita, urbanization, and trade between the provinces in each pair.

Figure 8B summarizes the results of the DID analysis studying the effect of high-speed rail on industrial similarity. The DID (in green) between the treatment group (in red) and the expected trend from the control group (dashed black line) is 0.029, indicating that pairs of provinces became more industrially similar after the introduction of high-speed rail.
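The arithmetic behind the DID term in Eq. (10) can be illustrated without controls. In the sketch below, the three coefficients come from column (1) of Table 4, while the intercept is not reported in the table, so the value used for it is a hypothetical placeholder (the DID does not depend on it). The function and variable names are ours.

```python
def did_estimate(treat_before, treat_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    return (treat_after - treat_before) - (control_after - control_before)


# Column (1) of Table 4 (industrial similarity, no controls).
B_DID = 0.0290    # High-speed Rail Entry (Treat x After)
B_TREAT = 0.0637  # Treatment Group
B_AFTER = 0.0498  # After Entry
B0 = 0.25         # hypothetical intercept, not reported in Table 4

# Group means implied by Eq. (10) without the control vector X.
control_before = B0
control_after = B0 + B_AFTER
treat_before = B0 + B_TREAT
treat_after = B0 + B_TREAT + B_AFTER + B_DID

# Counterfactual: the treated group following the control group's trend.
counterfactual_after = treat_before + (control_after - control_before)

did = did_estimate(treat_before, treat_after, control_before, control_after)
print(round(did, 4), round(treat_after - counterfactual_after, 4))
```

Both printed quantities recover the interaction coefficient, which is why the gap between the treated group and its counterfactual trend in Figure 8B can be read directly as the DID.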
The first three columns of Table 4 present the results of the DID regressions while controlling for differences in the level of population, GDP per capita, urbanization, and trade among these pairs of provinces (see Table S4 for summary statistics of covariates). The regression coefficient ($\beta_1$) of the interaction term ($Treat_{i,j} \times After_t$) is positive and significant, and it is robust to controls (see Table S5). These results suggest that the introduction of high-speed rail had an effect on the increase of industrial similarity experienced by pairs of Chinese provinces.

Table 4: DID regressions (OLS) considering the effect of high-speed rail entry on the industrial similarity and the productivity of industries. Columns (1)-(3): industrial similarity. Columns (4)-(6): productivity.

| Independent variables | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| High-speed Rail Entry | 0.0290* (0.0152) | 0.0266* (0.0150) | 0.0268* (0.0152) | 98713*** (27649) | 107343*** (27211) | 105636*** (26011) |
| Treatment Group | 0.0637*** (0.0107) | 0.0565*** (0.0110) | 0.0588*** (0.0108) | 39135** (16240) | 30463* (17033) | 26796 (17379) |
| After Entry | 0.0498*** (0.0091) | 0.0466*** (0.0091) | 0.0506*** (0.0090) | 364939*** (17603) | 376791*** (17362) | 361501*** (16524) |
| Observations | 930 | 930 | 930 | 930 | 930 | 930 |
| Robust R² | | | | | | |

| Control variables | Industrial Similarity | Productivity |
|---|---|---|
| Δ Population (log) | -0.0204*** (0.0049) | -6881 (8767) |
| Δ GDP per capita (log) | -0.0207** (0.0081) | 109114*** (17389) |
| Δ Urbanization | 0.0160*** (0.0127) | 213686*** (33900) |
| Δ Trade (log) | -0.0068*** (0.0024) | 20877*** (4615) |

Notes: Data are for the years 2004 (before high-speed rail entry) and 2014 (after high-speed rail entry). Significance levels: * p < 0.1, ** p < 0.05, and *** p < 0.01.

Second, we examine the effect of high-speed rail on inter-regional learning by measuring the productivity of industries.
One may worry that the level of productivity is likely to rely on the industrial structure of provinces. However, the correlation coefficients between productivity and industrial similarity are neither high nor consistent over time, allowing us to explore the effect of high-speed rail on productivity as a separate observation (see Figure S13).

As before, we measure the productivity density of active neighboring provinces as the average productivity of neighboring provinces weighted by distance. The productivity density of an industry in a province ($\zeta_{i,\alpha}$) tells us whether industry $\alpha$ in province $i$ is surrounded by provinces that are active and productive in that industry:

$$\zeta_{i,\alpha,t} = \frac{\sum_j \bar{p}_{i,j,\alpha,t} / D_{i,j}}{\sum_j 1 / D_{i,j}}. \tag{11}$$

Here $\bar{p}_{i,j,\alpha,t}$ is the average productivity of provinces $i$ and $j$ in industry $\alpha$ at year $t$, and $D_{i,j}$ is the geographic distance between provinces $i$ and $j$. The productivity $\bar{p}$ of industry $\alpha$ in province $i$ is its labor productivity, measured as revenue per worker, i.e., the total revenue of industry $\alpha$ in province $i$ divided by the total number of employees working in that industry in that province.

Figure 9: (A) Productivity of provinces as a function of the density of neighboring province productivity five years before. Bars indicate average values and error bars indicate standard errors. Results show averages for 2005-2014 using five-year intervals. (B) Average productivity of province pairs connected with high-speed rail (treatment, in red) and without high-speed rail (control, in blue). The differences-in-differences (DID, in green) is CNY 98,713 (~USD 15k). Vertical dashed lines mark the years after speed-up campaigns, beside which the approximate average speeds of high-speed rail are shown.

We use this density estimator ($\zeta$) to explore whether industries tend to be more productive when they are in provinces that are surrounded by neighbors that are productive in that industry. Figure 9A shows that the average productivity of an industry in a province increases with the productivity density of neighboring provinces. Once again, we find an increasing and convex relationship.

Finally, we analyze the effects of high-speed rail on the average productivity of the industries in a province using differences-in-differences. As before, we check the pre-trend of average productivity (see Figure S14A), and find there is no pre-trend (supporting the use of DID). For this DID analysis we modify Eq. (10) by replacing the industrial similarity $\phi_{i,j}$ with the average productivity $\bar{p}_{i,j}$ of the pair of provinces $i$ and $j$. Figure 9B shows a graphical summary of the DID analysis using average productivity.
The DID (in green) between the treatment group (in red) and the control group (in blue) is CNY 98,713 (~USD 15k), meaning that revenue per worker in province pairs linked by high-speed rail increased, on average, by CNY 98,713 more than in province pairs not connected by rail (see Figure S14B).

Finally, we present our differences-in-differences analysis for productivity and the introduction of high-speed rail in the last three columns of Table 4. Here we find that the interaction term (High-speed Rail Entry) is positive and significant, and it is robust to controlling for differences in population, GDP per capita, urbanization, and trade. These results support the idea that the introduction of high-speed rail promoted learning, since the productivity of industries increased in the provinces that were connected by rail to provinces with productive firms in that industry.

4. Conclusion and Discussion

In this paper we explored the expansion of the Chinese economy between 1990 and 2015 by looking at the industrial diversification of Chinese provinces. First, we explored inter-industry learning by constructing the industry space, and showed that the probability that an industry will emerge in a province increases with the number of related industries already present in it. Next, we explored inter-regional learning and used geographic data to show that the probability that an industry will emerge in a province increases with the presence of the same industry in neighboring provinces. Then, we combined both results to study whether inter-industry and inter-regional learning reinforce each other, and found that the combination of the two learning channels exhibits diminishing returns, meaning that when one learning channel (inter-regional or inter-industry) is sufficiently active, the other channel does not contribute as much. That also implies that inter-regional learning and inter-industry learning are substitutes, or that learning is constrained by the absence of a single learning opportunity.

Moreover, we used the introduction of high-speed rail between provinces as an instrument to address endogeneity concerns and provide evidence in support of the inter-regional learning hypothesis. We studied how the introduction of high-speed rail affected the industrial similarity of the provinces connected by rail and the productivity of firms using differences-in-differences (DID). First, we showed that the introduction of high-speed rail significantly increased the industrial similarity of the pairs of provinces connected by rail. Second, we compared the average productivity of industries that were present in pairs of neighboring provinces that became connected by rail with that of pairs of neighboring provinces where these industries were present but that did not become connected by rail. We found that the average productivity of pairs of neighboring provinces that were connected by rail increased in the presence of a productive neighbor in that industry. These results provide evidence in support of inter-regional learning theories.

While encouraging, our results should be interpreted in the light of their limitations. For instance, the observed presence of new industries is limited to those with revealed comparative advantage in a province, instead of industries with a large absolute number of firms. That means industries without revealed comparative advantage are considered absent in our context, which can be a potential limitation. Also, our data and geographic resolution are limited. On the one hand, our data capture firms listed in China's two major stock markets (Shanghai and Shenzhen), which represent only a small fraction of all Chinese firms. Therefore, the sample is biased towards larger firms, since larger firms are more likely to be publicly listed. Moreover, firms that are not listed, or that are listed outside of China, are not included even though they are located and operating in China. On the other hand, the use of provinces is also not ideal. Chinese provinces are relatively large administrative units, some of which concentrate more than 100 million people. Increasing the spatial resolution of this analysis would be an important improvement.

While we provide evidence in support of collective learning at the macro level, we do not provide a micro-channel for that learning. Is this learning the result of spin-off companies? Migrant workers? Supply and demand externalities? Labor market pooling? Or other channels? These micro-level explanations are important, but fall outside the scope of this paper.

Nevertheless, the evidence presented here helps expand the body of literature supporting the idea that economic development is a learning rather than an accumulation process, and that learning is deeply path dependent, as it is affected by the presence of related industries and the industrial development of neighbors. This should be good news for developing countries looking to modernize their planning and economic development efforts. We hope this paper helps stimulate the study of collective learning in economic development, and also that it helps inspire new research to identify specific learning channels.

Acknowledgments

We acknowledge the support from the MIT Media Lab Consortia and from the Masdar Institute of Technology. We also thank Haixing Dai, Mary Kaltenberg, Yiding Liu, Zhihai Rong, H. Eugene Stanley, Dan Yang, the Human Dynamics group meeting at the MIT Media Lab, and the B4 event by Research Center for Social Complexity (CICS) at Universidad del Desarrollo (UDD) for helpful comments. Jian Gao acknowledges the China Scholarship Council for partial financial support. This work was supported by the Center for Complex Engineering Systems (CCES) at King Abdulaziz City for Science and Technology (KACST) and the Massachusetts Institute of Technology (MIT).

Collective Learning in China's Regional Economic Development (Online Appendix)

Jian Gao, Bogang Jun, Alex "Sandy" Pentland, Tao Zhou, César A. Hidalgo

1. China's firm data

We use firm data from China's stock market extracted from the RESSET Financial Research Database, which is provided by Beijing Gildata RESSET Data Tech Co., Ltd. (http://…).

Figure S1: Number of firms which are listed, delisted and subsisting in each year. (A) Number of newly listed firms. (B) Number of newly delisted firms. (C) Number of subsisting firms, i.e., the cumulative number of listed firms that have not yet been delisted.

Figure S2: The codes of industries after the aggregation of firms into two levels: the sectoral and the sub-sectoral level. The inside layer corresponds to the sectoral level, while the outside layer shows the sub-sectoral level. Not all sub-sectors are listed. Figure labels (center: CSRC), by sector:

A: Agriculture. Animal husbandry; Forestry; Agriculture; Fishery.
B: Mining. Exploitation auxiliary activities; Oil exploitation; 08: Ferrous metal ore mining; Coal mining and dressing; Non-ferrous metal ore mining.
C: Manufacturing. Non-metallic mineral products; Chemical products manufacturing; Chemical fiber manufacturing; Pharmaceutical industry; Ferrous metal smelting; Printing and reproduction; Paper making; Non-ferrous metal smelting; Stationery manufacturing; Furniture manufacturing; Textile industry; Transportation equipment manufacturing; Wood and straw product; General equipment manufacturing; Rubber and plastic products; Instrument and meter manufacturing; Electronic equipment manufacturing; Petroleum and nuclear fuel processing; Leathers and footwear; Automobile manufacturing; Alcohol manufacturing; Other manufacturing industries; Textile garment; Agricultural food processing; Special-purpose equipment manufacturing; Metal product; Electric machinery manufacturing; Food manufacturing.
D: Electricity. Gas production and supply; Water production and supply; Electric power and heat production.
E: Construction. Architectural decoration industries; Civil engineering construction.
F: Retail. Wholesale industry; Retail industry.
G: Transport. Loading/unloading handling agency; Waterway transport; Railway transportation; Storage industry; Road transport; Air transport.
H: Catering. Accommodation industry; Catering industry.
I: IT. Internet and related services; Television transmission services; Information technology services.
J: Finance. Other financial industries; 66: Monetary and financial services; Insurance industry; Capital market services.
K: Estate. 70: Real estate.
L: Leasing. 72: Commercial service.
M: Scientific. Professional technical service.
N: Utility. Public facility management; Ecological protection.
O: Resident. Other service industries.
Q: Health. 83: Health.
R: Entertainment. Industry of culture and arts; Press and publishing; Television recording production.
S: Diversified. 90: Diversified industries.

2. Distance, travel time, and macroeconomic indicators

Regarding inter-regional learning, we use three distance metrics: geographic, driving, and neighboring distance (see Table S1). We define the geographic distance ($D_{i,j}$) between provinces $i$ and $j$ as the geodesic distance between the capital cities of the two provinces. The driving distance ($V_{i,j}$) is the shortest route between the capital cities of the two provinces, according to the Google Maps API in 2015. The neighboring distance ($B_{i,j}$) is defined as the least number of provinces that one has to cross in order to reach another province. For example, the neighboring distance between Beijing and Shandong is two ($B_{i,j} = 2$).

Table S1: Summary statistics of related economic indicators.

A. Province level

| Variable | Description | Unit | Obs | Min | Max | Mean | Std. Dev. |
|---|---|---|---|---|---|---|---|
| Population | Resident population at year-end | 10k person | 31 | 3.… × 10^… | … × 10^… | … × 10^… | … × 10^… |
| GDP per capita | Per capita gross domestic product | 1 CNY / person | 31 | 2.… × 10^… | … × 10^… | … × 10^… | … × 10^… |
| Urban Area | Total urban area in a region | 1 sq.km | 31 | 3.… × 10^… | … × 10^… | … × 10^… | … × 10^… |
| Land Area | Total land area in a region | 1 sq.km | 31 | 6.… × 10^… | … × 10^… | … × 10^… | … × 10^… |
| Trade | Total value of imports & exports | 1k USD | 31 | 6.… × 10^… | … × 10^… | … × 10^… | … × 10^… |

B. Province-pair level

| Variable | Description | Unit | Obs | Min | Max | Mean | Std. Dev. |
|---|---|---|---|---|---|---|---|
| Geographical Distance | Between two capital cities | 1 km | 465 | 114 | 3559 | 1369.4 | 723.0 |
| Driving Distance | Between two capital cities | 1 km | 465 | 139 | 4883 | 1740.9 | 962.3 |
| Neighboring Distance | Number of regions crossed | / | 465 | 1 | 6 | 2.9 | 1.3 |
| Transit Time | Shortest travel time by transit | 1 h | 465 | 0.6 | 71 | 19.8 | 14.2 |
| Normal-train Time | Shortest travel time by normal train | 1 h | 465 | 1.6 | 71 | 25.5 | 14.3 |
| Driving Time | Shortest travel time by driving | 1 h | 465 | 1.9 | 59 | 19.4 | 11.5 |
| ∆ Population (log) | Difference in resident population | / | 465 | 0.0071 | 3.5182 | 0.9330 | 0.7650 |
| ∆ GDP per capita (log) | Difference in GDP per capita | / | 465 | 0.0000 | 1.3815 | 0.4502 | 0.3316 |
| ∆ Urbanization | Difference in urban area / land area | / | 465 | 0.0053 | 262.31 | 7.4503 | 20.973 |
| ∆ Trade (log) | Difference in imports & exports | / | 465 | 0.0058 | 7.6024 | 1.9055 | 1.4401 |

Notes: The summary statistics of macroeconomic data, distance metrics and travel time measures are for 2014, 2015 and 2015, respectively. ∆ Population (log), ∆ GDP per capita (log), ∆ Urbanization, and ∆ Trade (log) are for 2014.
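The neighboring distance defined above can be read as a shortest-path length on a graph whose nodes are provinces and whose links are shared borders. The following is a minimal sketch under that reading, not the authors' implementation; the function name `neighboring_distance` and the small `BORDERS` map are hypothetical, and the map covers only the few provinces needed for the Beijing-Shandong example.

```python
from collections import deque


def neighboring_distance(adjacency, source, target):
    """Fewest border crossings from source to target (breadth-first search)."""
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        province, hops = queue.popleft()
        for neighbor in adjacency.get(province, ()):
            if neighbor == target:
                return hops + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None  # target cannot be reached in the given border map


# Illustrative, partial border map (assumption: only these provinces are needed).
BORDERS = {
    "Beijing": ["Tianjin", "Hebei"],
    "Tianjin": ["Beijing", "Hebei"],
    "Hebei": ["Beijing", "Tianjin", "Shandong"],
    "Shandong": ["Hebei"],
}

print(neighboring_distance(BORDERS, "Beijing", "Shandong"))
```

With this partial map, Beijing reaches Shandong through Hebei, which matches the value of two quoted in the text.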

3. Representation of industry space

We build a "province-industry" bipartite network $G = \{P, I, E\}$ to connect provinces and industries (see Figure S3), where $P$ is the set of provinces, $I$ is the set of industries at the sub-sectoral level, and $E$ is the set of links. The weight of link $x_{i,\alpha}$ is the number of firms in province $i$ that operate in industry $\alpha$. In the following, $i$ and $\alpha$ indicate province-related and industry-related indices, respectively.

Figure S3: The bipartite network of "province-industry" pairs. $P$ and $I$ represent provinces and industries at the sub-sectoral level, respectively. The weight of link $x_{i,\alpha}$ corresponds to the number of firms in province $i$ that belong to industry $\alpha$.

To visualize the network of industries, we build an industry space using a proximity matrix $\Phi$, which captures the similarity between each pair of industries at the sub-sectoral level. There are three steps in building the industry space: (i) First, we build a maximum spanning network, as shown in Figure S4A. We calculate the maximum spanning tree so that all nodes become reachable in the network with the minimum number of links. This network includes 69 links, which ensure connectivity while maximizing the total proximity. (ii) Then, we build a maximum weighted network, depicted in Figure S4B, using only links whose weight exceeds a threshold $\phi'$. We set $\phi' = 0.81$, under which the network includes 116 links and provides a distinguishable final visualization. (iii) Last, we combine these two networks, the maximum spanning network and the maximum weighted network, to build a superposed network (Figure S4C). After the last step, the network includes 145 links and 70 nodes, which represent 70 industries at the sub-sectoral level.

Figure S4: How to construct the industry space. (A) The first step: building a maximum spanning network. (B) The second step: building a maximum weighted network with $\phi > 0.81$. (C) The last step: building a superposed network by combining the maximum spanning network and the maximum weighted network. (D) Layout of the industry space, using the ForceAtlas2 algorithm in Gephi. (E) The final outcome: the industry space. The color of nodes corresponds to the 18 industries at the sectoral level. The size of nodes is proportional to the number of listed firms in that industry. The color and weight of links are associated with the $\phi$ value between two industries.

To improve the network visualization, we use the ForceAtlas2 algorithm of Gephi (http://gephi.github.io) to lay out the superposed network. ForceAtlas2 is a force-directed layout, which places each node taking the other nodes into account and helps to avoid overlapping links and to untangle dense clusters. Figure S4D shows the layout of the industry space. After preparing the skeleton, we adjust the size of nodes according to the number of firms in that industry at the sub-sectoral level, and color each node according to the industries at the sectoral level. Likewise, we adjust the thickness and color of links according to the proximity. Finally, the industry space is depicted in Figure S4E. The data of 2015 is used for the visualization of the industry space.

Figure S5: (A) Hierarchically clustered matrix based on the original proximity matrix ($\Phi$). The colors indicate the value of proximity. (B) The distribution of the proximity in matrix $\Phi$. The proximity matrix is calculated based on data for 2015.

Regarding the proximity, Figure S5A represents the proximity matrix $\Phi$ as a hierarchically clustered matrix. The matrix shows two big modules and some small modules, supporting the existence of two dense cores in the industry space. Figure S5B describes the distribution of the proximity values in matrix $\Phi$. We can see that the values of proximity follow a normal-like distribution with an average value around 0.5.
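The three construction steps (maximum spanning tree, thresholded network, superposition) can be sketched as follows. This is a minimal illustration that assumes a symmetric proximity matrix and uses Kruskal's algorithm for the spanning tree; the function names and the 4-industry matrix `PHI_TOY` are hypothetical and are not taken from the paper's data.

```python
import numpy as np


def maximum_spanning_tree(phi):
    """Kruskal's algorithm on a symmetric proximity matrix.

    Returns a set of undirected links (a, b) with a < b.
    """
    n = phi.shape[0]
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Visit candidate links from the highest proximity to the lowest.
    candidates = sorted(
        ((phi[a, b], a, b) for a in range(n) for b in range(a + 1, n)),
        reverse=True,
    )
    tree = set()
    for _, a, b in candidates:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the link joins two separate components
            parent[root_a] = root_b
            tree.add((a, b))
    return tree


def threshold_network(phi, threshold):
    """All links whose proximity exceeds the threshold."""
    n = phi.shape[0]
    return {(a, b) for a in range(n) for b in range(a + 1, n) if phi[a, b] > threshold}


def industry_space_skeleton(phi, threshold=0.81):
    """Superpose the maximum spanning tree and the thresholded network."""
    return maximum_spanning_tree(phi) | threshold_network(phi, threshold)


# Hypothetical 4-industry proximity matrix, for illustration only.
PHI_TOY = np.array([
    [1.00, 0.90, 0.83, 0.10],
    [0.90, 1.00, 0.85, 0.30],
    [0.83, 0.85, 1.00, 0.40],
    [0.10, 0.30, 0.40, 1.00],
])

print(sorted(industry_space_skeleton(PHI_TOY)))
```

In the toy matrix the tree keeps industry 3 connected through its strongest link even though that link is below the threshold, while the threshold step adds the strong link (0, 2) that the tree skipped. This is the same division of labor as in steps (i) and (ii).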

4. Robustness check of inter-industry learning

To check the robustness of inter-industry learning, here we also explore the relationship between the density of active related industries and the presence of new industries in provinces. Figure S6A presents the relationship between the number of industries in which provinces have a revealed comparative advantage and the number of new industries in which provinces develop a comparative advantage five years later. Using China's stock market data, we count the number of industries in year 2001 and check whether new industries emerge five years in the future by looking at year 2006, and repeat this pattern over 2001 to 2015. More specifically, we check the pairs of years (2002, 2007), (2003, 2008), ..., (2010, 2015).

Figure S6: (A) Relationship between active industries at time $t$ and new active industries at time $t + 5$. (B) Relationship between the average density in industries with RCA < 1 at time $t$ and new active industries at time $t + 5$. Points are labeled with province codes (see Table S2).

In Figure S6A, each dot represents a province. Its value on the horizontal axis corresponds to the average number of current industries with RCA > 1 at time $t$ across the time pairs in the period 2001-2015. Its value on the vertical axis corresponds to the average number of new industries with RCA > 1 at time $t + 5$, that is, industries that had no revealed comparative advantage (i.e., with RCA < 1) at the beginning ($t$) but developed a revealed comparative advantage (i.e., with RCA > 1) five years later ($t + 5$). […] a time-varying industrial proximity ($\phi_{\alpha,\beta,t}$) when calculating the density of related industries in Eq. (3) in the main text. Again, we use China's stock market data with five-year intervals. Figure S7A shows the distribution of related industry densities for pairs of industries and provinces that developed revealed comparative advantage (in pink) and that did not develop revealed comparative advantage (in blue) within five years. We find that the average related industry density for the pairs of industries and provinces that developed revealed comparative advantage is significantly larger (ANOVA p-value < 0.01).

Table S2: Abbreviations of province names in China.

| ID | Province Name | Code | ID | Province Name | Code | ID | Province Name | Code |
|---|---|---|---|---|---|---|---|---|
| 1 | Beijing | BJ | 12 | Anhui | AH | 23 | Sichuan | SC |
| 2 | Tianjin | TJ | 13 | Fujian | FJ | 24 | Guizhou | GZ |
| 3 | Hebei | HE | 14 | Jiangxi | JX | 25 | Yunnan | YN |
| 4 | Shanxi | SX | 15 | Shandong | SD | 26 | Tibet | XZ |
| 5 | Inner Mongolia | NM | 16 | Henan | HA | 27 | Shaanxi | SN |
| 6 | Liaoning | LN | 17 | Hubei | HB | 28 | Gansu | GS |
| 7 | Jilin | JL | 18 | Hunan | HN | 29 | Qinghai | QH |
| 8 | Heilongjiang | HL | 19 | Guangdong | GD | 30 | Ningxia | NX |
| 9 | Shanghai | SH | 20 | Guangxi | GX | 31 | Xinjiang | XJ |
| 10 | Jiangsu | JS | 21 | Hainan | HI | | | |
| 11 | Zhejiang | ZJ | 22 | Chongqing | CQ | | | |

Figure S7: (A) Distribution of the density of active related industries for each pair of provinces and industries. The pink distribution focuses only on pairs of provinces and industries that developed revealed comparative advantage in the next five years. The blue distribution is for the pairs of industries and provinces that did not develop revealed comparative advantage. The mean of the pink distribution is significantly larger than that of the blue distribution (ANOVA p-value < 0.01). (B) Probability that a new industry will appear in a province as a function of the density of active related industries ($\omega$). Bars indicate average values and error bars indicate standard errors. Results show averages for 2001-2015 using five-year intervals. The density $\omega$ in Eq. (3) in the main text is calculated using a time-varying industrial proximity ($\phi_{\alpha,\beta,t}$).

5. Robustness check of inter-regional learning

To check the robustness of the observations in Figure 4 in the main text, we additionally show the spatial evolution of the presence of industries in Chinese provinces using the number of firms of that industry in that province. Figure S8, for instance, presents the results for four industries: Chemical Products Manufacturing Industry, Pharmaceutical Industry, Electric Machinery Manufacturing Industry, and Wholesale Industry (the keys of labels correspond to Figure S2). The saturation of the color indicates the number of firms. In this figure, the provinces that have a large number of firms in an industry tend to be neighbors of provinces that already had a large number of firms in that industry, supporting our main finding.

Figure S8: Evolution of the presence of industries in China between 1992 and 2015. The four illustrated industries are Chemical Products Manufacturing Industry (26: Chemical), Pharmaceutical Industry (27: Pharmaceutical), Electric Machinery Manufacturing Industry (38: Elec. Machinery), and Wholesale Industry (51: Wholesale); the keys of labels correspond to Figure S2. The saturation of the color indicates the number of firms.

Further, we present additional evidence on the negative correlation between geographic distance and industrial similarity. More specifically, in Figure S9 we show that industrial similarity is negatively correlated with transit time (A), normal-train time (B), driving time (C), and driving distance (D). We confirm that shorter travel time or shorter distance between two regions corresponds to a more similar industrial structure between the two.

Figure S9: Relationship between industrial similarity and (A) transit time, (B) normal-train time, (C) driving time, and (D) driving distance. Bar charts with error bars correspond to average values with standard errors in bins. Blue dashed lines are linear fits of the corresponding bar charts. Panel annotations: transit time (h), Pearson's r = -0.41; normal-train time (h), Pearson's r = -0.36; driving time (h), Pearson's r = -0.38; driving distance (km), Pearson's r = -0.36.

To test the robustness of the results on inter-regional learning depicted in Figure 6 in the main text, we develop an alternative index to measure the density of active neighboring provinces. That is, we use the neighboring distance $B_{i,j}$ to replace the geographic distance $D_{i,j}$ when calculating the density of active neighboring provinces. Formally,

$$\Omega_{i,\alpha,t} = \sum_{j} \frac{U_{j,\alpha,t}}{B_{i,j}} \Bigg/ \sum_{j} \frac{1}{B_{i,j}}. \qquad \text{(S1)}$$

Figure S10A shows the distribution of densities ($\Omega$) for industry-province pairs that developed revealed comparative advantage in an industry in a five-year period (in pink) and those that did not (in blue). We find that the average density of active neighboring provinces for the pairs of industries and provinces that developed revealed comparative advantage is significantly larger (ANOVA p-value = … × 10^−…). Figure S10B shows the increasing and convex relationship between the probability that a province will develop revealed comparative advantage in an industry and the density of active neighboring provinces in that industry five years earlier.

Figure S10: (A) Distribution of the density of active neighboring provinces for each pair of provinces and industries. The pink distribution focuses only on pairs of provinces and industries that developed revealed comparative advantage in the next five years. The blue distribution is for the pairs of industries and provinces that did not develop revealed comparative advantage. The mean of the pink distribution is significantly larger than that of the blue distribution (ANOVA p-value = … × 10^−…). (B) Probability of a province developing comparative advantage in an industry as a function of the density of active neighboring provinces five years earlier. Bars indicate average values and error bars indicate standard errors. Results show averages for 2001-2015 using five-year intervals. The density $\Omega_{i,\alpha,t}$ in Eq. (S1) is weighted by the neighboring distance $B_{i,j}$.
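The density in Eq. (S1) can be sketched numerically as below. The sketch makes two assumptions that are not spelled out in this appendix: that $U_{j,\alpha,t}$ is the indicator RCA > 1 computed with the standard Balassa index on firm counts, and that the sum over $j$ excludes province $i$ itself (where $B_{i,i} = 0$). The function names and the toy inputs are hypothetical.

```python
import numpy as np


def rca(firm_counts):
    """Balassa index for a provinces x industries matrix of firm counts."""
    x = np.asarray(firm_counts, dtype=float)
    province_share = x / x.sum(axis=1, keepdims=True)  # industry share within each province
    national_share = x.sum(axis=0) / x.sum()           # industry share in the whole country
    return province_share / national_share


def neighbor_density(active, distance):
    """Inverse-distance-weighted share of active other provinces, as in Eq. (S1).

    active:   provinces x industries 0/1 matrix (U)
    distance: provinces x provinces matrix (B or D); the diagonal is ignored
    """
    d = np.asarray(distance, dtype=float)
    weights = np.zeros_like(d)
    off_diagonal = ~np.eye(d.shape[0], dtype=bool)
    weights[off_diagonal] = 1.0 / d[off_diagonal]      # assumption: j != i only
    return (weights @ active) / weights.sum(axis=1, keepdims=True)


# Toy inputs: 3 provinces x 2 industries, and neighboring distances B.
counts = [[4, 1], [1, 4], [2, 2]]
B = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

U = (rca(counts) > 1).astype(float)
omega = neighbor_density(U, B)
print(omega)
```

Passing the geographic distance matrix $D$ instead of $B$ gives the distance-weighted variant that the text says Eq. (S1) replaces, so the two indices differ only in the weight matrix.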

6. Robustness check of inter-regional and inter-industrial learning

To check the robustness of our results in Figure 7 in the main text, we use an alternative index (ratio) to measure the density of active neighboring provinces ($\Omega$) and the density of related industries ($\omega$). For provinces, the ratio is the proportion of active neighboring provinces, and for industries, the ratio is the proportion of active related industries according to the illustrated industry space in 2015. Figure S11A shows the joint probability that a new industry is present, given the ratio of active neighboring provinces and the ratio of active related industries. We can see that both ratios have significant effects on the presence of new industries. For each single effect, the ratio of active related industries (see Figure S11B) and the ratio of active neighboring provinces (see Figure S11C), the increasing and convex relationship shows that the probability that an industry will develop revealed comparative advantage in a province increases strongly with the ratio. These findings support the robustness of our main results.

Figure S11: (A) Joint probability of a province developing revealed comparative advantage in a new industry in a five-year period, given the ratio of active neighboring provinces on the horizontal axis and the ratio of active related industries on the vertical axis. (B) and (C) are the corresponding marginal probability distributions of new industries present, given the ratio of active related industries and the ratio of active neighboring provinces, respectively.

Table S3 presents the summary statistics of the regression variables used in the econometric analysis that considers both inter-regional and inter-industry learning. Four different groups of metrics are included in the multivariable regressions: density, ratio, number, and active number. Here, the density of active related industries, the number of related industries, and the number of active related industries are all based on the illustrated industry space in 2015.
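As a small illustration of the ratio index, the sketch below computes the proportion of active neighbors (or active related industries) for one province-industry pair. The interaction term is taken here to be the product of the two ratios, which is an assumption suggested by the ranges of the regression variables rather than something stated in this appendix. The function name, the neighbor list, and the related-industry list are hypothetical.

```python
def active_ratio(candidates, active):
    """Fraction of candidates that are in the active set (0.0 if there are none)."""
    candidates = list(candidates)
    if not candidates:
        return 0.0
    return sum(1 for c in candidates if c in active) / len(candidates)


# Hypothetical inputs for one province-industry pair.
neighbors = ["Beijing", "Tianjin", "Shandong", "Henan"]   # neighbors of the province
active_provinces = {"Beijing", "Shandong"}                # neighbors with RCA > 1 in the industry
related = ["26", "13", "14", "15"]                        # industries linked in the industry space
active_industries = {"26"}                                # related industries with RCA > 1 locally

ratio_neighbors = active_ratio(neighbors, active_provinces)
ratio_related = active_ratio(related, active_industries)
interaction = ratio_neighbors * ratio_related             # assumption: product of the two ratios
print(ratio_neighbors, ratio_related, interaction)
```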

7. Robustness check of causal evidence for inter-regional learning

Figure S12A shows the e ﬀ ect of high-speed rail entry on industrial similarity by comparing province pairs with(in pink) and without (in blue) high-speed rail lines. We ﬁnd that the average value of industrial similarity betweenprovince pairs that connected by high-speed rail is signiﬁcant larger (ANOVA p-value = . × − ). Figure S12Bshows the timing of high-speed rail entry in China and its e ﬀ ect on the industrial similarity of province pairs. We can12 able S3: Summary statistics of regression variables in the analysis of the emergence of new industries.Variable Observations Min Max Mean Std. Dev.Density of Active Neighboring Provinces 25713 0 0.7127 0.1894 0.1551Density of Active Related Industries 25713 0.0109 0.5939 0.2283 0.0866Interaction Term 1 25713 0 0.3022 0.0453 0.0441Ratio of Active Neighboring Provinces 25713 0 1 0.1816 0.2358Ratio of Active Related Industries 25713 0 1 0.1949 0.2884Interaction Term 2 25713 0 1 0.0457 0.1063Number of Active Neighboring Provinces 25713 0 7 0.7935 1.0441Number of Active Related Industries 25713 0 9 0.8448 1.2864Interaction Term 3 25713 0 45 1.0833 2.9363Number of Neighboring Provinces 25713 1 8 4.4463 1.8048Number of Related Industries 25713 1 15 3.8851 3.3691Interaction Term 4 25713 1 120 17.271 17.712 Speed-Up Campaigns A k m / h 300 k m / h120 k m / h A v a r age I ndu s t r i a l S i m il a r i t y Year

High-speed RailNo High-speed Rail B F r equen cy Industrial Similarity

High-speed RailNo High-speed Rail

Figure S12: (A) Density distributions of industrial similarity for province pairs with (in red) or without (in blue) high-speed rail. Red and bluecurves are normal ﬁts of the bar charts. The mean of the pink distribution is signiﬁcantly larger than that of the blue distribution (ANOVAp-value = . × − ). (B) Average industry similarity between province pairs with (in red) or without (in blue) high-speed rail. see that the average industrial similarity increases remarkably after train speed-up in 2005, 2008 and 2012 (the yearsafter “speed-up” campaigns are used for illustration), suggesting the positive and signiﬁcant e ﬀ ect of high-speed railon inter-regional learning. The average industrial similarity for province pairs with and without high-speed rail from2004 to 2015 supports the robustness of these ﬁndings.To provide additional evidence supporting inter-regional learning, we do di ﬀ erences-in-di ﬀ erences (DID) analysisagain but considering the productivity. First, we check the relationship between the industrial similarity and produc-tivity for all pairs of provinces. One possible concern of our analysis is that the productivity of industries is directlya ﬀ ected by their industrial structure since some industries may have higher productivity than others, and the industrialstructure that one province has may contribute to its productivity. In that case, the analysis of industrial similarity and13

[Figure S13 plots. (A) y-axis: Average Productivity; x-axis: Industry Categories, in the order K, F, L, E, D, O, S, J, N, M, C, G, R, B, A, I, H, Q. (B) y-axis: Correlation Coefficient, from -1.0 to 1.0; x-axis: Year.]

productivity between pairs of provinces would not be independent. We find that even though different industries have different productivity (see Figure S13A for an illustration for 2014), the correlation between industrial similarity and productivity for all pairs of provinces is relatively small (see Figure S13B). Industrial similarity and productivity are therefore, to some extent, independent of each other, and it is valid to consider both measures at the same time.

Figure S13: (A) Average productivity of industries in descending order in 2014. The keys of industry categories correspond to Figure S2. (B) Correlation coefficient between industrial similarity and productivity for all pairs of provinces for the 2000-2014 period.

For the DID analysis of productivity, our data again satisfy the condition for the DID method: the pre-trend of the dependent variable in the control and treatment groups is similar prior to 2005 (see Figure S14A). To demonstrate this, we run an event study using the following ordinary least-squares (OLS) linear regression model, with data between 2000 and 2015, to predict the average productivity of provinces $i$ and $j$ in each year:

$$\bar{p}_{i,j,\alpha,t} = \beta_0 + \sum_{k} \beta_k \left( Treat_{i,j} \ast \mathbb{1}\{t = k\} \right) + \varepsilon_{i,j}. \tag{S2}$$

Here $Treat_{i,j}$ is a dummy variable denoting whether provinces $i$ and $j$ are affected by high-speed rail entry, and $\mathbb{1}\{t = k\}$ is an event time indicator, which is equal to 1 for the year in which we consider the effect of high-speed rail entry. In other words, Eq. (S2) regresses the average productivity of pairs of provinces on whether there is high-speed rail connecting them. A larger regression coefficient ($\beta_k$) corresponds to higher productivity for pairs of provinces connected by high-speed rail than for pairs that are not.
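As an illustration of how a model of the form of Eq. (S2) can be estimated, the following minimal sketch fits it by OLS on a toy panel. The data, the three-year window and the helper names (`design_row`, `solve`, `ols`) are assumptions made for illustration only; they are not the data or the code used in this study. The sketch follows the equation as written, with an intercept and one treatment-by-year indicator per year, and without year fixed effects or a baseline normalization.

```python
# Toy panel of (treated, year, productivity) observations. All numbers are made up.
rows = [
    (0, 2004, 11.0), (0, 2004, 13.0),
    (0, 2005, 11.0), (0, 2005, 13.0),
    (0, 2006, 11.0), (0, 2006, 13.0),
    (1, 2004, 13.0), (1, 2004, 15.0),
    (1, 2005, 13.0), (1, 2005, 15.0),
    (1, 2006, 19.0), (1, 2006, 21.0),
]
years = sorted({year for _, year, _ in rows})


def design_row(treated, year):
    """Intercept followed by one Treat * 1{t = k} indicator per year k."""
    return [1.0] + [1.0 if treated == 1 and year == k else 0.0 for k in years]


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(xs, ys):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(p)]
    return solve(xtx, xty)


xs = [design_row(treated, year) for treated, year, _ in rows]
ys = [value for _, _, value in rows]
beta = ols(xs, ys)
beta_0 = beta[0]                       # intercept
beta_k = dict(zip(years, beta[1:]))    # one coefficient per year k

for k in years:
    print(k, round(beta_k[k], 4))
```

Because the regressors are mutually exclusive indicators, the intercept equals the mean productivity of the untreated pairs and each $\beta_k$ equals the mean of the treated pairs in year $k$ minus that intercept. In the toy data the coefficients are flat before the last year and jump afterwards, which is the pattern an event study plot is meant to reveal.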

[Figure S14 plots. (A) y-axis: Regression Coefficient; x-axis: Year. (B) y-axis: Average Productivity; x-axis: Year. Both panels mark the speed-up campaigns. Legend: High-speed Rail, No High-speed Rail.]

Figure S14: (A) Event study results. The y-axis shows the regression coefficient ($\beta_k$ in Eq. (S2)) as a function of the year, after regressing the average productivity of pairs of provinces that were eventually connected by high-speed rail against the entry of high-speed rail. Red lines are linear fits for 2000-2005 and 2005-2015, respectively. (B) Average productivity of province pairs with (in red) or without (in blue) high-speed rail between 2004 and 2014.

Table S4: Summary statistics of variables in the differences-in-differences (DID) analysis. Mean values of industrial similarity, average productivity, and differences in population (log), GDP per capita (log), urbanization and trade (log) between province pairs before and after the entry of high-speed rail are shown.

| Independent Variables | Before: Control | Before: Treatment | After: Control | After: Treatment | DID |
|---|---|---|---|---|---|
| Industrial Similarity | 0.2496 | 0.3133 | 0.2994 | 0.3921 | 0.0290 |
| Productivity | 5 . × | . × | . × | . × | . × |
| Δ Population (log) | 1.1394 | 0.7314 | 1.0953 | 0.6514 | -0.0358 |
| Δ GDP pc (log) | 0.5717 | 0.6255 | 0.4603 | 0.4327 | -0.0814 |
| Δ Urbanization | 0.1066 | 0.2115 | 0.1098 | 0.2151 | 0.0005 |
| Δ Trade (log) | 2.0719 | 1.5899 | 2.2047 | 1.3863 | -0.3365 |
| Observations | 295 | 170 | 295 | 170 | 930 |
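The DID column of Table S4 can be read as the change in the treatment group minus the change in the control group. The short sketch below recomputes it from the four group means in each row. The dictionary keys and the function name `did` are illustrative choices, not names from the study, and the productivity row is left out because its digits are not legible in this text.

```python
# Group means copied from Table S4, ordered as
# (before control, before treatment, after control, after treatment).
means = {
    "industrial_similarity": (0.2496, 0.3133, 0.2994, 0.3921),
    "delta_population_log": (1.1394, 0.7314, 1.0953, 0.6514),
    "delta_gdp_pc_log": (0.5717, 0.6255, 0.4603, 0.4327),
    "delta_urbanization": (0.1066, 0.2115, 0.1098, 0.2151),
    "delta_trade_log": (2.0719, 1.5899, 2.2047, 1.3863),
}

# DID column of Table S4, used here only for comparison.
reported = {
    "industrial_similarity": 0.0290,
    "delta_population_log": -0.0358,
    "delta_gdp_pc_log": -0.0814,
    "delta_urbanization": 0.0005,
    "delta_trade_log": -0.3365,
}


def did(before_control, before_treatment, after_control, after_treatment):
    """Change in the treatment group minus change in the control group."""
    return (after_treatment - before_treatment) - (after_control - before_control)


estimates = {name: did(*cells) for name, cells in means.items()}

for name, value in estimates.items():
    print(f"{name}: computed {value:+.4f}, reported {reported[name]:+.4f}")
```

For industrial similarity this gives (0.3921 - 0.3133) - (0.2994 - 0.2496) = 0.0788 - 0.0498 = 0.0290, which matches the table. The other rows agree with the reported values to within 0.0001, a gap that is presumably due to the group means being rounded to four decimals.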

Figure S14A illustrates the event study regression coefficients ($\beta_k$) for productivity, taking 2005 as the baseline year. Before high-speed rail entry (1997-2005), the effect of the treatment is stable, as there is no significant trend in $\beta_k$. After high-speed rail entry (2005-2015), however, the effect of the treatment ($\beta_k$) begins to increase significantly, meaning that the treated province pairs became more productive only after the introduction of high-speed rail. Figure S14B presents the average productivity of province pairs that are connected by high-speed rail and of those that are not. Province pairs connected by high-speed rail have significantly larger productivity over the whole period considered.

Table S4 shows the summary statistics of the variables used in the differences-in-differences (DID) analysis. In our DID design, province pairs belong to the treatment group if they are connected by high-speed rail in 2014, and otherwise belong to the control group. In the DID regressions, control variables include gravity considerations: the differences between the two provinces in population, GDP per capita, urbanization (defined as the share of urban area over the entire area of a province), and trade (defined as the total exports and imports of each province).

Table S5: DID regressions (OLS model) considering the effect of high-speed rail entry on industrial similarity and on the productivity of industries, controlling for the geographic distance between provinces. Values in parentheses are given as in the original table.

| Independent Variables | Industrial Similarity (1) | Industrial Similarity (2) | Industrial Similarity (3) | Productivity (4) | Productivity (5) | Productivity (6) |
|---|---|---|---|---|---|---|
| High-speed Rail Entry | 0.0290* (0.0151) | 0.0270* (0.0150) | 0.0274* (0.0151) | 98713*** (27644) | 107032*** (27237) | 105193*** (26044) |
| Treatment Group | 0.0504*** (0.0112) | 0.0455*** (0.0114) | 0.0473*** (0.0113) | 52294*** (16948) | 40611** (17528) | 34883* (17923) |
| After Entry | 0.0498*** (0.0089) | 0.0470*** (0.0089) | 0.0504*** (0.0089) | 364939*** (17498) | 376375*** (17261) | 361673*** (16491) |
| Distance (log) | -0.0297*** (0.0069) | -0.0261*** (0.0069) | -0.0278*** (0.0070) | 29265** (11579) | 24052** (11401) | 19635* (10927) |
| Observations | 930 | 930 | 930 | 930 | 930 | 930 |

Each control variable has one coefficient per dependent variable; the column among (1)-(3) and (4)-(6) to which each belongs is not preserved in this text.

| Control Variable | Industrial Similarity | Productivity |
|---|---|---|
| Δ Population (log) | -0.0182*** (0.0048) | -8902 (8774) |
| Δ GDP per capita (log) | -0.0176** (0.0081) | 106183*** (17249) |
| Δ Urbanization | 0.0146 (0.0130) | 214723*** (33949) |
| Δ Trade (log) | -0.0049** (0.0024) | 19564*** (4635) |

Robust R: × × ×

Notes: Data are for the year 2004 (before high-speed rail entry) and 2014 (after high-speed rail entry). Significance level: * p < . , ** p < . , *** p < . 

To check the robustness of our results in Table 4 in the main text, we additionally control for the geographic distance between provinces in the DID analysis to reduce sampling bias. As shown in Table S5, we find that the estimates of the e ﬀﬀ