Coherent diversification in corporate technological portfolios
Emanuele Pugliese, Lorenzo Napolitano, Andrea Zaccaria, Luciano Pietronero
Emanuele Pugliese, Lorenzo Napolitano, Andrea Zaccaria*, and Luciano Pietronero
Department of Physics, Sapienza University of Rome, Rome, Italy
Institute for Complex Systems, Consiglio Nazionale delle Ricerche, Rome, Italy
* [email protected]
July 10, 2017
Abstract
We study the relationship between firms’ performance and their technological portfolios using tools borrowed from complexity science. In particular, we ask whether the accumulation of knowledge and capabilities related to a coherent set of technologies leads firms to experience advantages in terms of productive efficiency. To this end, we analyzed both the balance sheets and the patenting activity of about 70 thousand firms that have filed at least one patent over the period 2004-2013. From this database it is possible to define a measure of the firms’ coherent diversification, based on the network of technological fields, and relate it to the firms’ performance in terms of labor productivity. Such a measure favors companies with a diversification structure comprising blocks of closely related fields over firms with the same breadth of scope, but a more scattered diversification structure. We find that the coherent diversification of firms is quantitatively related to their economic performance and captures relevant information about their productive structure. In particular, we show on a statistical basis that a naive definition of technological diversification can explain labor productivity only as a proxy of size and coherent diversification. This approach can be used to investigate possible synergies within firms and to recommend viable partners for mergers and acquisitions.
Introduction
The aim of this paper is to shed light on the interactions and positive synergies that take place between corporate R&D activities in different fields as reflected by the composition of the patent portfolios of a large sample of firms. In particular, we show that benefits for patenting companies accrue not so much from the number of technologies in which they perform R&D, but rather from the average size of the coherent blocks of knowledge stock in which their research activities concentrate. As we will see, such benefits can be measured in terms of productive efficiency. This counter-intuitive finding is nevertheless consistent with a representation of production in which coherent knowledge blocks map to distinct, internally consistent product lines. In order to illustrate the relevance of knowledge blocks, we define a measure of coherent diversification that weighs the fields of technology contained in corporate technological portfolios based on their coherence with respect to the firm’s global knowledge base. As a result, given the same breadth of scope, this measure distinguishes companies with a diversification structure comprising blocks of closely related fields from companies with a more scattered technological portfolio. The idea is that a coherent diversification structure, being a direct reflection of production lines, leads to better economic performance.
While even a simpler theory of the firm would be able to capture the relevance of the technology scope for the productivity of the firm (Penrose, 1959), a more refined capabilities-based approach is necessary to argue that potential returns to scope are related to the nature and complementarity of the pursued technologies, and not simply to their number. Capabilities were originally defined as intangible assets of the firm relating to the necessary know-how enabling the effective development of production and other internal organizational processes (Dosi et al., 2008).
A capability-based model of the firm can be seen as a network connecting specific technological or organizational capabilities to one or more products, thus creating heterogeneous and non-trivial interactions between specific technological fields. Starting with Teece et al. (1994), many studies have tried to exploit data on firms and products to understand the possible synergies between different products. A different perspective on the same concept has been championed in recent years by the literature on economic complexity, which has modeled capabilities as an invisible layer linking economic agents to the outcome of their activities (Hausmann and Hidalgo, 2011) and has also successfully extended the notion outside the corporate domain by applying it to nations and geographical regions in general (Tacchella et al., 2012; Zaccaria et al., 2014). In a way, this work lies in between the traditional and the complexity view by modeling capabilities as a hidden layer while interpreting them as mediators between firms and their productive efforts. Unlike both of the above approaches, however, our analysis focuses on the production of technological innovation instead of production in the traditional sense, thus applying a notion that is also close in spirit to the technological competencies proposed by Granstrand et al. (1997).
The following sections lead up to the main exercise and the discussion of the results. We first briefly describe the data we employ for the study; we then review the relevant literature on corporate diversification while presenting an overview of prominent diversification measures; finally, we turn to the linked concepts of relatedness and corporate coherence, which are useful stepping stones to make the link between our proposal and the existing literature explicit while highlighting its original contribution to the field.
Our aim is to investigate the relation between the structure of the technological portfolios of firms and their performance and efficiency. To this end, we rely on AMADEUS, a commercial database maintained by Bureau van Dijk Electronic Publishing (BvD), which specializes in providing financial, administrative, and balance sheet information about (almost exclusively private) companies around the world. In particular, AMADEUS focuses on European enterprises. The database is comprehensive, though not exhaustive, accounting for over 20 million companies sourced from several providers using a multitude of data typically collected by public institutions (Ribeiro et al., 2010). The AMADEUS sample has a straightforward connection with the patent database PATSTAT, described in the following section, because both index patent applications with the same patent identifier. Joining the two datasets yields detailed information about almost 70 thousand firms that have filed at least one patent over the period 2004-2013 covered by our AMADEUS edition and that report the balance sheet information we require concerning firm size and productivity. A drawback of AMADEUS is that it covers a set of firms that is geographically constrained to Europe. As a consequence, our results draw on evidence based on companies with at least one subsidiary in Europe, which implies that only relatively large non-European firms are included. Section 2.2 explains the strategy we adopt to balance the sample and limit geographical biases.
Following an established tradition in the economic literature on innovation (Griliches, 1990; Hall et al., 2001; Strumsky et al., 2012), we proxy innovative activity with patents, a rich and growing source of information, which over the past years has benefited from the cumulative data collection efforts of scholars as well as public agencies. For the present analysis we concentrate on information concerning the set of technological fields to which inventions pertain; each field is represented by a standard code defined within the International Patent Classification (IPC), an internationally recognized hierarchical classification system maintained and constantly updated by the World Intellectual Property Organization (WIPO) (Joo and Kim, 2010). Apart from the obvious practical advantages of relying on standardized definitions, decomposing patents into their constituent technologies allows us to consider inventions as the product of a successful recombination of variously related preexisting technologies and knowledge. The heart of a patent application is its claims, i.e. the part of the patent document that describes the novel aspects of the invention with respect to the relevant prior art, justifies the request for protection and, implicitly, delimits the scope of the pursued monopoly rights. Claims undergo individual examination by patent office officials and, if approved, are assigned one or more IPC codes relating to the technologies touched upon by the corresponding claim.
The main source of patent (and technology) data for this paper is the EPO Worldwide Patent Statistical Database (PATSTAT), maintained by the European Patent Office (EPO), which aggregates data from national and regional patent offices and presents the information in a clean and organized fashion. For example, multiple patent application documents can often refer to the same invention.
In these cases, PATSTAT collects groups of related documents into so-called patent families, which represent sets of patents filed in more than one country to protect a single invention, by recognizing the link between the first application (the priority) and later ones filed at other patent offices (Martinez, 2010).
As we mentioned in section 2.1, the geographical coverage of AMADEUS is limited to companies with at least one European subsidiary, and this could potentially over-represent Europe relative to other areas. In order to mitigate the issue, we take into account only the subset of PATSTAT families that also satisfy the more restrictive criteria defining triadic patent families (Dernis and Khan, 2004), i.e. families including an application filed at the EPO, one filed at the Japanese Patent Office (JPO), and one granted by the United States Patent and Trademark Office (USPTO). These criteria select globally protected inventions and thus ensure that only relatively large enterprises with global operations are included in the sample. We thus gain, in exchange for some reduction in the number of observations, a more balanced sample that excludes relatively small European firms patenting only in national offices, which would otherwise be overrepresented.
The starting point of the analysis consists in decomposing the patent families with applications in a given year into the set of associated IPC codes and attributing the codes to patenting firms. We then assign each active family a weight equal to one and split it equally among all of the unique technology-company pairs. Every pair maps to an element of a binary matrix $M$, the value of which is computed based on the sum of the shares of active patent families attributed to the corresponding combination of technology and applicant firm (this procedure is fully described in the supplementary material).
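As an illustration, the weighting step just described can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the input format (one pair of applicant firms and IPC codes per family), the function names `family_shares` and `binarize`, and the threshold used to obtain binary entries are hypothetical, since the actual procedure is the one detailed in the supplementary material.

```python
from collections import defaultdict


def family_shares(families):
    """Split a unit weight per patent family over its (firm, technology) pairs.

    `families` is an iterable of (firms, ipc_codes) tuples; this input layout
    is an assumption made for the example, not the PATSTAT schema.
    """
    weights = defaultdict(float)
    for firms, codes in families:
        # Unique technology-company pairs attached to this family.
        pairs = {(firm, code) for firm in firms for code in codes}
        if not pairs:
            continue
        share = 1.0 / len(pairs)  # each family carries a total weight of one
        for pair in pairs:
            weights[pair] += share
    return dict(weights)


def binarize(weights, threshold=0.0):
    """Keep a 1 for every pair whose summed share exceeds `threshold`.

    The threshold value is a placeholder (assumption): the text only states
    that the binary entries are derived from the summed shares.
    """
    return {pair: 1 for pair, w in weights.items() if w > threshold}


# Toy data with hypothetical firms "A", "B" and IPC-like codes.
toy_families = [({"A"}, {"H01", "G06"}), ({"A", "B"}, {"G06"})]
toy_weights = family_shares(toy_families)
toy_M = binarize(toy_weights)
```

By construction the summed shares add up to the number of families, which offers a simple sanity check on any real implementation.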
To summarize, $M$ defines the technological portfolio embedded in the patents filed by all active firms in a target year and thus allows us to look into the structure of such portfolios, which we then relate to corporate efficiency.
This section is devoted to presenting some prominent diversification and relatedness measures proposed in the literature that are relevant to the present inquiry.
Though the empirical exercises contained in the present paper are concerned with technologies and not with products, for historical reasons the discussion of previously existing measures will always start from contributions addressing corporate productive scope and then extend to the literature concerning technological scope. In fact, technology has come to prominence more recently in the literature than production, so that the roots and initial inspiration of most studies can be traced back to the latter stream. This is not to say that we consider productive scope relevant only for historical reasons; on the contrary, we argue that, though the technological and productive dimensions are very different, they are strongly interconnected and they complement each other to drive the dynamics of firms. The aim of the present section is to put into perspective the salient features of the methodology we propose while highlighting its relation to the existing contributions. We do not attempt an exhaustive review of the literature, as it would be beyond the scope of the paper; rather, our aim is to provide a concise overview of some of the indices of diversification that have been proposed over time and to use it as the starting point to trace the evolution in the literature that has later led scholars to concentrate also on the close, albeit distinct, topics of relatedness and coherence, which are the building blocks of the approach we propose in this paper.
Firm diversification and its implications have interested scholars at least since the work of Penrose, who noted that the “firm is not confined to ’given’ products, but the kind of activity it moves into is usually related in some way to its existing resources”, because there are “pools of unused productive services [. . . which,] together with the changing knowledge of management, create a productive opportunity which is unique for each firm.” (Penrose, 1960). That essay presented strong evidence from case studies corroborating the intuition that the role played by diversification in shaping corporate opportunities is substantial. Interest in the topic was subsequently kept alive by several scholars (e.g. Gort (1962); Rumelt (1974)), who expanded upon the issue posed by Penrose and reframed the general problem in quantitative terms, extending the analysis to larger samples of firms from different industries. Early quantitative studies concerning diversification concentrated mainly on the productive scope of companies, which was measured by accounting for the material inputs or outputs of manufacturing firms and by grouping activities together based on official industrial classifications, e.g. Standard Industrial Classification (SIC) codes (Berry, 1971; Gort, 1962), or on categories developed autonomously by the author based on prior research (Rumelt, 1974, 1982). For example, Gort built his measure of diversification based on the share of company payrolls going to individual manufacturing activities, while Berry (1971) proposed a measure that summarized the spectrum of activities of large manufacturing firms based in the USA by measuring the distribution of their output across 4-digit SIC industries and summarizing the vector of output shares with a concentration index inspired by Herfindahl (1950).
See e.g. Knecht (2013) for a comprehensive review of the diversification measures adopted in the economics and management literature, their implications, and their theoretical foundations, and for a review of the measures of coherence.
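To make the idea of a Herfindahl-inspired diversification index concrete, the sketch below uses the form in which Berry's index is commonly stated, namely one minus the sum of squared output shares. The function names and the toy figures are ours and purely illustrative; the exact specification used in the original study is not reproduced here.

```python
def herfindahl(outputs):
    """Sum of squared shares of a vector of non-negative outputs."""
    total = sum(outputs)
    if total <= 0:
        raise ValueError("total output must be positive")
    return sum((x / total) ** 2 for x in outputs)


def berry_index(outputs):
    """Diversification as one minus the Herfindahl concentration.

    Assumed form: 0 for a single-industry firm, approaching 1 as output
    is spread evenly over many industries.
    """
    return 1.0 - herfindahl(outputs)


# Hypothetical output (in arbitrary units) of one firm across SIC industries.
single_line = berry_index([10])
two_equal_lines = berry_index([5, 5])
```

A firm active in a single industry scores zero, while a firm that spreads its output evenly over n industries scores 1 - 1/n, so the index rewards breadth but is blind to how related the industries are.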
In addition to the wealth of theoretical contributions and industry classification attempts spurred by the widespread interest that surrounded corporate product diversification (see e.g. Montgomery, 1994, for an interesting discussion), much empirical work was also devoted to understanding the relation between firm performance and the number of business activities and markets entered (e.g. Miller, 2004; Palepu, 1985; Palich et al., 2000). However, products are not the only area in which companies show evidence of diversification, and it did not escape scholarly attention that the definition of the corporate technological scope is as strategic for businesses as the decision concerning the number of markets to enter and product lines to bring to market. This has been thought to be especially true since the last decades of the twentieth century, which have seen rising complexity in products and production processes (e.g. Cohen et al., 2000; Rycroft and Kash, 1999), increasing specialization in knowledge production (Pavitt, 1998) and an accelerated pace of innovation in many industries. This has made “diversity particularly across technologies [. . . ] no longer a choice” (Fai, 2004).
One of the first attempts to implicitly account for relatedness alongside diversification was made by Rumelt (1974) and later refined in an attempt to establish a link between corporate strategy and profitability (Rumelt, 1982). This implied a strong change in perspective with respect to earlier papers, which concentrated on measuring industrial diversification only by the observed breadth in scope of business activities. This shift materialized in a growing interest in new tools to explore corporate and industrial evolution based on measuring the distance between the activities into which firms diversify. Rumelt employed a different approach from his predecessors and abandoned official industry classification codes as a means to define the set of activities engaged in by firms. Instead, he focused on a categorical classification (not an index) of diversification strategies that he elaborated himself based on the historical observation of a sample of large US industrial firms. In particular, the author started by assessing the share of revenues due to single product lines; he then established the degree of relatedness (“absence or existence of shared facilities, common selling groups, and other tangible evidence of attempts to exploit common factors of production”) between business units, and finally assigned firms to different categories according to a composite index accounting simultaneously for product diversification and the contribution to the company’s revenues of the largest group of closely related products. According to Rumelt (1982), the importance of addressing relatedness stemmed from the need to test the hypothesis, formulated based on anecdotal evidence from US manufacturing, that among diversified firms “the highest levels of profitability were exhibited by those having a strategy of diversifying primarily into those areas that drew on some common core skill or resource.”
Teece et al. (1994) built on the intuition behind Rumelt (1974) and embraced the view that the implications of scope for the evolution of companies and industrial structure can be better understood by including in the analysis an assessment of the overall coherence of corporate activities. This approach reflected the more general assumption that the strategic motives behind diversification should be accounted for in order to build a taxonomy of corporate types which, in turn, could be usefully incorporated in a theory of their evolution. To this end, they relied on a much larger sample than their predecessor by resorting to census data about US manufacturing plants. The much larger size of the data forced them to forgo reliance on “adjudged relatedness” and instead employ the standard 4-digit SIC classification as the basis for the classification of the activities of individual plants. This implied that relatedness needed to be measured instead of assumed or deduced from external information. In particular, Teece et al. (1994) based the measurement on the survivor principle, i.e. the assumption that economic competition eventually drives inefficient organizational forms out of the market, thus promoting the co-occurrence of activities that are well integrated with one another through the reliance on complementary technological capabilities. The authors argued that, by virtue of the survivor principle, the data should reveal efficient combinations of activities occurring with a significantly higher frequency than one would expect as a consequence of sheer randomness. Teece et al. (1994) operationalized the above reasoning by first summarizing the activity portfolios of firms in the binary matrix $C \in \{0,1\}^{F \times P}$, where non-zero values correspond to the industries in which firms operate, and then using $C$ to derive the matrix of co-occurrences $J^{P} \in \mathbb{N}^{P \times P}$, such that
$$J^{P}_{pp'} = \sum_{f} C_{fp} C_{fp'}. \qquad (1)$$
Subsequently, they derived the significant combinations of activities through a statistic ($\tau$) based on a standard t-test comparing the values of the cells of $J^{P}$ to their expected value under the null hypothesis of random diversification. Note in passing that we have added the superscript $P$ to the above notation to highlight the fact that the measures refer to products. The statistic $\tau$, which tells “the degree to which the observed linkage between the two industries exceeds that which would be expected were the assignments of industries to companies simply random”, allowed the authors to define a measure of coherence between activities for individual multi-activity firms called weighted average relatedness
$$WAR_{p} = \frac{\sum_{p' \neq p} \tau_{pp'} e_{p'}}{\sum_{p' \neq p} e_{p'}}, \qquad (2)$$
which evaluates the relatedness of each activity $p$ of a production plant to all its other activities $p' \neq p$ as the average significance of their co-occurrences weighted by the share $e_{p'}$ of employees working in activity $p'$. A connected measure of relatedness presented in the same paper is the weighted average relatedness to neighbors (WARN), which measures the strength of association between activities similarly to WAR, but focuses only on the links between closest neighbors defined by computing the maximum spanning tree associated to $J^{P}$. Teece et al. have shown that, as firm scope increases, WAR tends to fall (meaning that the average distance between all the activities of a multi-activity firm grows with diversification) while WARN tends to rise (which indicates that the link between more highly related activities grows stronger). The authors thus conclude that coherence is important, since the results about WAR and WARN jointly suggest that “if firms grow more diverse, they add activities that relate to some portion of existing activities” (Teece et al., 1994).
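The chain from co-occurrences to weighted average relatedness can be illustrated with a short Python sketch. It assumes that the null expectation and variance of each co-occurrence count are those of a hypergeometric distribution, which is our reading of the random-diversification benchmark and not a detail spelled out above; the function names and the toy matrix are likewise hypothetical.

```python
from math import sqrt


def cooccurrences(C):
    """J[p][q] = sum over firms of C[f][p] * C[f][q], for a binary matrix C."""
    P = len(C[0])
    return [[sum(row[p] * row[q] for row in C) for q in range(P)]
            for p in range(P)]


def tau_statistic(C):
    """Standardized excess of observed over expected co-occurrences.

    Assumption: hypergeometric null with mean n_p * n_q / K and the matching
    variance, where K is the number of firms and n_p the number of firms
    active in p. Diagonal entries are left at zero.
    """
    K = len(C)
    if K < 2:
        raise ValueError("need at least two firms")
    J = cooccurrences(C)
    P = len(J)
    n = [J[p][p] for p in range(P)]  # diagonal counts firms active in p
    tau = [[0.0] * P for _ in range(P)]
    for p in range(P):
        for q in range(P):
            if p == q:
                continue
            mu = n[p] * n[q] / K
            var = mu * (1 - n[p] / K) * (K - n[q]) / (K - 1)
            if var > 0:
                tau[p][q] = (J[p][q] - mu) / sqrt(var)
    return tau


def war(tau, employment, p):
    """Employment-weighted average of tau between p and the firm's other activities.

    `employment` maps each activity of one firm to its employees; raw counts
    and shares give the same result because the weights are normalized.
    """
    others = [q for q in employment if q != p]
    denom = sum(employment[q] for q in others)
    if denom == 0:
        return 0.0
    return sum(tau[p][q] * employment[q] for q in others) / denom


# Toy data: 4 hypothetical firms (rows) and 3 activities (columns).
toy_C = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
toy_tau = tau_statistic(toy_C)
toy_war = war(toy_tau, {0: 10, 1: 30, 2: 10}, 0)
```

In the toy matrix, activities 0 and 1 appear together more often than chance would suggest and receive a positive $\tau$, whereas activities 0 and 2 receive a negative one, so the WAR of activity 0 depends on how the firm's employment is split between the two.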
As is the case for diversification, corporate coherence originally found application in the product domain, but it has been shown to be extremely meaningful in studying firm performance and evolution from a technological viewpoint (Engelsman and van Raan, 1994; Joo and Kim, 2010; Leten et al., 2007; Piscitello, 2000; Rigby, 2015). Of course, the meaning of coherence from the viewpoint of technological innovation is different from productive coherence. Nevertheless, the concepts are also complementary in understanding firm evolution, so it is not surprising that scholars interested in technological coherence have borrowed from the toolbox of their predecessors, who had previously addressed products. In a well-known study, Breschi et al. (2003) took up the contribution of Teece et al. (1994) and built on the methodology proposed therein to investigate whether firms tend to diversify their innovative efforts in a coherent fashion by patenting in technological fields that share a common knowledge base with the technological fields in which they innovated in the past. In analogy with Teece et al. (1994), Breschi et al. (2003) analyzed the technological diversification of firms employing a matrix of co-occurrences
$$J_{tt'} = \sum_{f} M_{ft} M_{ft'} \qquad (3)$$
akin to matrix $J^{P}$ of equation 1 and rejected the null hypothesis of random diversification through the statistic $\tau$, which compares the number of observed co-occurrences between technologies with the expectation under a hypergeometric distribution. Notice that matrix $J$ above is computed from the binary matrix $M$, which is similar to matrix $C$ of Teece et al. (1994) with the difference that it associates non-zero values to the (firm, technology code) pair corresponding to position $M_{ft}$ whenever firm $f$ holds a patent in field $t$.
In another interesting paper, Nesta and Saviotti (2006) studied corporate knowledge coherence in the US pharmaceutical industry and showed that both the scope and the coherence of the knowledge base “contribute positively and significantly to the firm’s innovative performance”, as measured by the number of patents it produces weighted by the number of citations received. The authors further built on the work of Breschi et al. (2003) and adapted it to define a measure of knowledge coherence internal to the firm. In particular, they employed the technological counterpart of WAR defined in equation 2, where the relatedness between technological fields is now weighted by the share $p_{t}$ of patents per field $t$ owned by the firm. This allowed them to define firm knowledge coherence as
$$COH = \sum_{t} \left( \frac{p_{t}}{\sum_{t'} p_{t'}} \, WAR_{t} \right). \qquad (4)$$
Regressing COH against the R&D output of the firms in their data sample (measured, as above, by the number of owned patents weighted by the number of citations), they found that both knowledge coherence and knowledge scope have a positive impact on the dependent variable.
Figure 1: The role of intangible capabilities. (a) Capabilities mediate between countries and their export baskets. (b) Since capabilities are unobservable, their role must be inferred from the bipartite network connecting countries to products.
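Equation 4 can be read as a patent-share-weighted average of the field-level WAR values of a firm. The sketch below illustrates this reading under the assumption that a relatedness table $\tau$ between technological fields is already available; the field labels, the $\tau$ values and the function names are invented for the example and carry no empirical meaning.

```python
def war_t(tau, patents, t):
    """Patent-weighted average relatedness of field t to the firm's other fields."""
    others = [s for s in patents if s != t]
    denom = sum(patents[s] for s in others)
    if denom == 0:
        return 0.0
    return sum(tau[t][s] * patents[s] for s in others) / denom


def knowledge_coherence(tau, patents):
    """COH: average of WAR_t over fields, weighted by the firm's patent shares.

    `patents` maps each field in which the firm is active to its patent count.
    """
    total = sum(patents.values())
    if total == 0:
        raise ValueError("the firm holds no patents")
    return sum(patents[t] / total * war_t(tau, patents, t) for t in patents)


# Hypothetical relatedness values between three fields "a", "b", "c".
toy_tau = {
    "a": {"b": 2.0, "c": -1.0},
    "b": {"a": 2.0, "c": 0.0},
    "c": {"a": -1.0, "b": 0.0},
}
coherent_firm = knowledge_coherence(toy_tau, {"a": 2, "b": 2})
scattered_firm = knowledge_coherence(toy_tau, {"a": 2, "c": 2})
```

Both toy firms hold the same number of patents in the same number of fields, yet the first combines two strongly related fields and scores higher than the second, which is the distinction the coherence measure is meant to capture.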
The intuition behind equation 1 is also central in the literature on economic complexity, which in recent years has sought to explain the composition and evolution of the export baskets of nations engaging in international trade (e.g. the product space of Hidalgo et al. (2007) and the taxonomy network of Zaccaria et al. (2014)) as well as to elaborate reliable predictions of the future growth trajectories of national economies (Hidalgo et al., 2007; Tacchella et al., 2012). The assumption underlying all the above contributions is that the patterns of competitive advantage revealed by national export baskets are the result of intangible country-level capabilities (see figure 1), which must be possessed and combined effectively in order to acquire the necessary strength to thrive in global competition. These approaches aim to reconstruct a network linking products based on the similarity of the sets of capabilities required to produce them efficiently. Consequently, if a nation alone has a revealed competitive advantage in exporting a given good, one can infer that the nation in question possesses the needed combination of capabilities. To this end, one can define a binary matrix in which the generic element $M_{cp}$ takes value one if country $c$ has a revealed comparative advantage (Balassa, 1965) in exporting product $p$. The measure of proximity $\phi$, which is key in defining the product space of Hidalgo et al. (2007), reads
$$\phi_{pp'} = \min\left( \frac{\sum_{c} M_{cp} M_{cp'}}{u_{p}}, \frac{\sum_{c} M_{cp} M_{cp'}}{u_{p'}} \right), \qquad (5)$$
where $u_{p} \equiv \sum_{c} M_{cp}$ is the ubiquity of product $p$, i.e. the number of countries which export it. The proximity represents the empirical counterpart of the conditional probability of exporting a product, given the export of another product.
Similarly, the formula for the taxonomy network proposed by Zaccaria et al. (2014) can be represented in terms of $M$ by defining the matrix $B \in \mathbb{R}^{P \times P}$
$$B_{pp'} = \frac{1}{\max(u_{p}, u_{p'})} \sum_{c} \frac{M_{cp} M_{cp'}}{d_{c}}, \qquad (6)$$
where $d_{c} \equiv \sum_{p} M_{cp}$ is the diversification of country $c$, i.e. the number of products it exports. From a probabilistic point of view, here the frequency of a product’s occurrence is not only conditioned on the presence of another product but also evaluated with respect to a random binomial case, which would have a frequency equal to $d_{c}/P$ (the constant factor $P$ is usually neglected). Equation 6 can also be interpreted, following Zhou et al. (2007), as the probability of going from one product to another by performing a random walk on the tripartite product-country-product network.
It is worth noting that, within the general framework of relatedness and corporate coherence, the motives guiding the authors of different contributions have been quite heterogeneous. On the one hand, the prevalent aim of some studies has been to define a taxonomy of corporate diversification strategies and organizational structures from which to deduce the dynamic properties of industries (Bottazzi and Pirino, 2010; Teece et al., 1994). On the other hand, a related stream of articles has focused on the nexus between differentiation, coherence, and performance (Breschi et al., 2003; Nesta and Saviotti, 2006; Rumelt, 1982) with the aim of inferring the implications of diversification strategies for performance at the micro level. The present paper adds to the broad literature on corporate coherence in technological portfolios by proposing a firm-level measure of coherent diversification that is inspired by the capabilities-based approach proposed by Teece et al. (1994) and uses the formulation introduced by Zaccaria et al. (2014).
The contribution of this paper lies at the intersection between the literature on corporate coherence (especially Nesta and Saviotti, 2006) and the contributions to the economic complexity literature.
In particular, our aim is to transpose the definition of relatedness proposed by Zac-caria et al. (2014) at the firm level and apply their proposed measure (equation 6) to corporatepatent portfolios in order to uncover the structure of the underlying technology space . This will12 igure 2: Corporate technological portfolios conceal information about feasibleoutput baskets.
Modeling technological portfolios to get a glimpse of the structure underlying product baskets is an operationally similar task to the one undertaken by scholars who aim to understand the relevance of intangible capabilities from the composition of the output mix produced by agents (see figure 2). Notice however that the two tasks are conceptually different, as suggested by the fact that in figure 1 capabilities are the actual mediators between economic agents and their output, while here products are the hidden layer.

serve as a stepping stone to elaborate a measure of the coherent technological diversification of the firms in our dataset, which we can use to examine the relation it bears with performance.

Finally, the analysis we propose also follows in the footsteps of Teece et al. (1994) in that we analyze the relevance of relatedness in firm diversification by building on the premise that firm-level technological capabilities are the drivers of successful diversification. At the same time, it differs from Teece et al. (1994) in two ways: i) it is centered on the analysis of corporate technology portfolios arising from capabilities related to the production of technological advancement, instead of focusing on the traditional notion of capabilities, which concern the know-how involved in the creation and development of products, and ii) it adopts the different formulation of co-occurrences used in Zaccaria et al. (2014).

As illustrated by figure 2 and its comparison to figure 1, modeling technological portfolios to get a glimpse of the structure underlying product baskets is an operationally similar task to the one undertaken by scholars who aim to understand the relevance of intangible capabilities from the composition of the output mix produced by agents. Conceptually, however, the two endeavors are quite different. Notice in fact that in figure 1 capabilities are the actual mediators between economic agents and their output. Figure 2 instead depicts products as the hidden layer, but it would be wrong to deduce from this that products mediate between agents and technological fields, because that would amount to assuming that production is instrumental to R&D, while the relation clearly must go in the opposite direction. Moreover, figure 2 illustrates our view of the possible coherent structure of technological portfolios. The company on the left produces computers and smartphones. Some of the underlying technologies are used only for a specific product, and we color them with the same color (orange or light blue). However, since the two products are highly related from a technological point of view, we can assume that many of the capabilities needed to manufacture one product will also be useful for the other. This situation will also characterize patenting activities, where technological fields will sometimes be shared between product lines (see the striped technologies in figure 2). In this view, the coherent company par excellence is the one in the middle of figure 2, which is specialized in a single product and thus needs only the technologies related to that business activity. The company on the right, on the contrary, produces unrelated products; this will result in an incoherent technological portfolio. In what follows, we will test whether the performance of a company is related not only to the diversification but also to the coherence of its technological capabilities. To this end we will introduce a formula to represent such features of the technological portfolios and quantify the so-called coherent diversification.

Figure 3: Basic matrices. (a) Matrix M. The circles represent the firms and the triangles represent the technological fields comprising their technological portfolios. (b) Matrix B. The triangular nodes in the graph correspond to technological fields and are colored to highlight proximity to the more frequently co-occurring (thus more related) technologies they are connected to.

Figure 4: Minimum spanning tree of B. The nodes in the graph represent IPC subsections and are colored according to the section they belong to. Each node is connected to the technological field which is linked to it with the highest weight.

The basic data we need to define coherent diversification in corporate technological portfolios is the matrix M defined in section 2.2 and further discussed in section 3.2. This matrix represents a bipartite network linking companies to the technological fields in which they are active innovators. For this study we perform a yearly analysis and select for each period the triadic patent families in which the firms have a stake as owners. We point out that the specific results presented below refer to the data for 2011, the most recent year for which we trust the data coverage to be reasonably complete; the results are however robust and also hold for previous time periods. A stylized graphical representation of the bipartite companies-technologies network, whose adjacency matrix is M, is depicted in figure 3a.

In order to define coherent diversification, we first need a measure of technological relatedness. To this end, we use the matrix B of equation 6, which we redefine to account for firms and technologies, obtaining

$$B_{tt'} = \frac{1}{\max(u_t, u_{t'})} \sum_f \frac{M_{ft} M_{ft'}}{d_f}. \qquad (7)$$

Figure 5: Illustration of γ for a generic technology t and two firms (1 and 2), depicted respectively in the left and right panels. In both panels, the graph represents the binary B of figure 3b: the opaque triangles stand for technological fields in which the associated firm holds patents. Both firms are diversified in the same number of technological fields. However, those of firm 1 are connected within B, forming a unique block; on the contrary, those of firm 2 are scattered through the graph. As a consequence, technology t is highly coherent in firm 1 and not in firm 2.

The matrix B can be interpreted as the adjacency matrix of a monopartite network of technologies like the one represented in figure 3b. Each of the triangular nodes in the graph corresponds to a technological field and is colored to highlight its proximity to the more frequently co-occurring (thus more related) technologies to which it is linked. The figure shows that B embeds the notion that specific combinations of technologies concur to generate products, even though it is not possible to establish the correspondence between the technology and the production domains. Moreover, B depicts the technological space as a whole, but holds no information about the firms (the circles in figure 3a representing the matrix M) whose technological portfolios were used to compute it.

Figure 4 shows the minimum spanning tree defined from the empirical data used to compute B; by construction, each node represents a technological field and is connected to the field with which it shares the heaviest link. The nodes in the graph represent IPC subsections and are colored according to the section they belong to. The color pattern of the graph highlights a tendency of broad technological fields to connect with similar ones much less frequently than to relatively distant ones, which suggests that mixing of different fields is far from rare.

Figure 6: Illustration of Γ, which can be interpreted as a reweighting of the diversification structure of firms. Γ highlights the blocks of connected technologies. In principle, it has a correspondence with the corporate product basket; however, the information about the map connecting what a firm knows and what it produces remains hidden beneath the surface.

In order to combine the general structure of technology relatedness with firm-specific information, we first need to measure for each company the coherence between all of the technologies in which it holds patents.
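As a rough illustration of how equation 7 could be evaluated, the following sketch builds B from a small firms-by-technologies matrix with NumPy. It is not the authors' code. The function name `relatedness_matrix` and the example matrix are hypothetical, and reading u_t as the ubiquity of technology t (the column sum of M) is an assumption, since this section does not restate its definition.

```python
import numpy as np

def relatedness_matrix(M):
    """Sketch of equation 7 for a firms-by-technologies matrix M."""
    M = np.asarray(M, dtype=float)
    d = M.sum(axis=1)  # d_f: diversification of each firm (row sums)
    u = M.sum(axis=0)  # u_t: ASSUMED to be the ubiquity of each technology (column sums)
    if np.any(d == 0) or np.any(u == 0):
        raise ValueError("each firm and each technology needs at least one link")
    co = M.T @ (M / d[:, None])          # sum over firms of M_ft * M_ft' / d_f
    return co / np.maximum.outer(u, u)   # divide each entry by max(u_t, u_t')

# Hypothetical 2-firm, 3-technology example (not taken from the paper's data).
M_example = np.array([[1, 1, 0],
                      [0, 1, 1]])
B_example = relatedness_matrix(M_example)
```

In this made-up example the two outer technologies never co-occur in a firm, so their mutual entry in `B_example` is zero, while each is linked to the middle technology.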
Figure 5 qualitatively illustrates this measure for a generic technology t and two toy firms, 1 and 2, depicted respectively in the left and right panels. In both panels, the network structure connecting the triangles in the background represents a simplified (binary) illustration of B; the opaque triangles stand for technological fields contained in the patent portfolio of each firm, while the transparent triangles represent technological fields in which the same firm has not filed patents during the time period covered by M. Notice that both firms are equally diversified because they both have patents covering the same number (eight) of technological fields. The glaring difference between firm 1 and firm 2 resides in their diversification structures. In particular, the technological fields of the first company are all connected within B and form a unique block, while the technologies of the second are scattered in B. As a consequence, technology t, which is owned by both firms, has a high intra-firm coherence within firm 1, but attains a low score in firm 2. In reality, the linkages we measure at each step of the analysis between companies and technology fields are not binary but weighted, and we must take this into account in the analytical definition of coherence, which will depend on both technology t and the surrounding technological basket of firm f. We define the intra-firm coherence of technologies by the rectangular matrix $\gamma \in \mathbb{N}^{F \times T}$, where

$$\gamma_{ft} = \sum_{t'} B_{tt'} M_{ft'}, \qquad (8)$$

the analytical counterpart of figure 5.

Finally, it is possible to define an index of corporate technological coherence, which we call coherent diversification, by aggregating within each firm the information about the intra-firm coherence of all the technological fields in which it holds patents.
As schematically represented by figure 6, this can be interpreted as a reweighting of the diversification structure of firms, which highlights the connected technologies and in principle has a correspondence with the corporate product basket, though the information about the explicit map connecting what a firm knows and what it produces remains under the surface.

In formula, we define the firm-specific index of coherent diversification $\Gamma \in \mathbb{R}^{F}$ as

$$\Gamma_f = \frac{\sum_t M_{ft} \gamma_{ft}}{d_f}, \qquad (9)$$

where $d_f \equiv \sum_t M_{ft}$ is the diversification of firm f. In practice, Γ computes the average size of the coherent technology blocks comprising the technological portfolio of each company.

A simple example will help explain how this framework rewards diversification only if it defines a coherent portfolio. Suppose that company f owns two close technologies, such that $B_{tt'} = 1$ for all $t, t' \in \{1, 2\}$. Each technology then has intra-firm coherence $\gamma_{ft} = 2$, and straightforward calculations lead to $\Gamma_f = 1 + 1 = 2$: the coherent diversification is equal to the standard definition of diversification. On the contrary, if these two technologies are not related, $B_{tt'} = \delta_{tt'}$, and we will have $\Gamma_f = \frac{1}{2}(1 + 1) = 1$: in this case the lack of coherence averages out the diversification.

Figure 7: Toy example. A graphical representation of the matrices B from M relative to the toy example discussed in the text.

In order to further clarify the economic meaning of the coherent diversification and its relation with production lines, we present a simple calculation based on the illustrations shown in this section. Let us start from the situation depicted in figure 2.
We have three companies: the first one (company x) has two production lines (computers and smartphones) and its portfolio contains eight technologies, of which three are purely related to computers, three are necessary for smartphones, and two are useful for both products; the second company, y, is instead specialized in cars and controls three technologies related to this production line; finally, the third company, z, has two unrelated production lines, computers and cars, relying respectively on groups of three and two technologies. The associated M matrix is depicted at the top right of figure 7. In order to compute the coherence of these technological portfolios we need a measure of distance, B. In this example, we do not compute B from M as we will do for the real case; on the contrary, we suppose that the three companies live in a technological space defined by other companies that are not individually included in the example. In particular, we take the technological network depicted in figure 3b, whose adjacency matrix is represented in the top left of figure 7. The technologies related to cars are homogeneous (i.e., fully connected) and independent from the technologies used for other production lines (i.e., there are no off-diagonal elements connecting them to other technologies), forming a single unitary block. On the contrary, computer and smartphone technologies are homogeneous but mildly related through the two off-diagonal technologies (the fourth and the fifth in the first row of M). Note that here we have a binary matrix, but in general the elements of B can have any value.

Let us now compute the intra-firm coherence of technologies, that is, the enhancement technology t gets thanks to the fact of being in the portfolio of company f. Applying equation 8 we obtain the bottom matrix of figure 7. In this simple case, the matrix just counts the neighbors of a technology that are owned by the company.
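The counting interpretation can be checked numerically on a reduced version of the toy example. The sketch below is only illustrative: it encodes companies y and z, and it assumes a six-technology space in which three computer technologies and three car technologies each form a fully connected block with no links between the blocks. Company x is left out because its links depend on the matrices of figure 7. The function and variable names are hypothetical.

```python
import numpy as np

def intra_firm_coherence(M, B):
    """Equation 8: gamma[f, t] = sum over t' of B[t, t'] * M[f, t']."""
    return np.asarray(M, dtype=float) @ np.asarray(B, dtype=float).T

def coherent_diversification(M, B):
    """Equation 9: Gamma[f] = sum over t of M[f, t] * gamma[f, t], divided by d_f."""
    M = np.asarray(M, dtype=float)
    gamma = intra_firm_coherence(M, B)
    d = M.sum(axis=1)  # diversification d_f of each firm
    return (M * gamma).sum(axis=1) / d

# ASSUMED reduced layout: columns 0-2 are computer technologies, columns 3-5 are
# car technologies; each group is one fully connected block of a binary B.
B_toy = np.kron(np.eye(2), np.ones((3, 3)))
M_toy = np.array([[0, 0, 0, 1, 1, 1],    # company y: three car technologies
                  [1, 1, 1, 1, 1, 0]])   # company z: three computer, two car technologies
gamma_toy = intra_firm_coherence(M_toy, B_toy)
Gamma_toy = coherent_diversification(M_toy, B_toy)
```

Under these assumptions each entry of `gamma_toy` is the number of technologies the firm owns in the block of that column, and `Gamma_toy` evaluates to 3 for company y and 2.6 for company z. Passing a 2x2 matrix of ones or the 2x2 identity as B, with a single firm owning both technologies, gives 2 and 1 respectively, as in the two-technology example.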
Notice that the block of car technologies is more coherent in firm y than in firm z, since they own 3 and 2 technologies in that block respectively.

Finally, using equation 9, we can compute the coherent diversification of the three companies. For company y we obtain Γ = 3. In this simple case, the coherent diversification is simply the average number of technologies used for each production line. Such an interpretation is a zeroth-order approximation, which turns out to be exact only for independent and homogeneous production lines. Let us now consider company x. In this case, the enhancement due to the close technologies is stronger, as one can notice by looking at the first row of the γ matrix; averaging over the owned technologies, one obtains Γ = 3.5. Finally, company z has Γ = 2.6. This can be interpreted as a weighted average over the production lines: the first production line (computers) has three technologies, all with an intra-firm coherence equal to three, while the second production line (cars) can use only two technologies, and this implies a lower coherence, equal to two. In order to compute Γ we weigh the coherences with the relative number of technologies used for each product: $\frac{3}{5} \times 3 + \frac{2}{5} \times 2 = 2.6$.

We now test the measure of firm coherence Γ defined in section 4 by correlating it with an index of firm efficiency. In fact, if our hypothesis is correct that innovating in related technological fields is conducive to the development of an effective mix of firm-level capabilities, which is in turn reflected in production, then Γ should correlate with firm performance.

The first test is illustrated in figure 8, which plots the binned values of Γ against the intra-bin quantiles of labor productivity (measured as value added over employees) for the firms in our sample. The plot shows a clear positive association, providing preliminary evidence that our measure of coherent diversification of technological portfolios captures relevant information about the productive structure of the firms.

Figure 8: Coherent diversification vs. labor productivity. The graph plots the binned values of coherent diversification (Γ) of the firms in our sample against the intra-bin quantiles of labor productivity. The clear positive association between Γ and labor productivity suggests that coherent diversification of technological portfolios captures relevant information about the corporate productive structure.

As a further test of the ability of Γ to capture a relevant aspect of corporate productive efficiency, we regress labor productivity against it. The results of the least squares regressions, which are summarized in table 1, further confirm the intuition conveyed by figure 8. The coefficient associated with coherent diversification remains positive and significant in all regressions, even when we add firm size (measured by total assets) and diversification (i.e. the number of technology codes in the firm’s patent portfolio) as controls. Moreover, though simple diversification is statistically significant if used alone, it loses explanatory power when used in the same model as Γ. This is particularly interesting, because it suggests that the number of connected technologies within a company’s technological knowledge portfolio, whose relation is quantified by our measure of coherence, is more relevant than the raw number of technological fields in which the company innovates. In particular, the fact that the statistical significance of diversification (as measured by the number of technologies comprising firm technological portfolios) vanishes once coherent diversification is added to the set of regressors suggests that the former can be considered a proxy for the latter.
Our findings thus suggest that what firms know is relevant to what they produce and that the internal consistency of their knowledge stock is even more relevant than its sheer scope.

VARIABLES         (0)   (1)   (2)   (3)
Size              0.079*** (0.023)   0.079*** (0.008)   0.081*** (0.008)
Diversification   0.010 (0.045)   0.074*** (0.009)
Coherent Div.     0.136*** (0.045)   0.154*** (0.017)   0.200*** (0.016)
R²

Table 1: Regressions of labor productivity against coherent diversification, diversification, and size. Standard errors in parentheses.
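The regression design behind table 1 can be sketched with an ordinary least squares fit in NumPy. This is a minimal sketch, not the authors' estimation code: the helper `ols_coefficients` is hypothetical, the arrays are made-up numbers rather than the paper's data, and any transformation of the variables (for instance logarithms) is left out because this section does not specify one. The productivity values are constructed from size and coherent diversification only, so the fitted diversification coefficient is zero by construction.

```python
import numpy as np

def ols_coefficients(y, **regressors):
    """Ordinary least squares of y on an intercept plus the named regressors."""
    names = ["intercept", *regressors]
    X = np.column_stack([np.ones(len(y)), *regressors.values()])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return dict(zip(names, coef))

# Made-up numbers for illustration only; they are NOT the paper's data.
size = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
diversification = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
coherent_div = np.array([1.0, 1.0, 2.0, 3.0, 2.0])
productivity = 0.5 + 0.1 * size + 0.2 * coherent_div  # built without diversification

fit = ols_coefficients(productivity, size=size,
                       diversification=diversification,
                       coherent_div=coherent_div)
```

Dropping `coherent_div` from the call gives the specification in which diversification is used without Γ. Standard errors and significance levels, as reported in the table, would need a statistics package and are outside this sketch.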
The results shown in table 1 can be represented by means of a two-dimensional plot, in which we consider labor productivity as a function of both diversification and coherent diversification. In figure 9 we use these two variables to aggregate the firms into square areas colored based on their ranking in labor productivity. As expected, there is a strong correlation between coherent and standard diversification, which leads to the presence of white (empty) squares away from the main diagonal. More interestingly, coherent diversification has more explanatory power than standard diversification: on average, horizontal slices exhibit a stronger gradient in labor productivity than vertical slices.

We conclude this section by analyzing the role played by firm size. Similarly to the previous exhibit, in figure 10 we plot labor productivity as a function of size and coherent diversification. The two variables are clearly complementary: on average, large size or large coherent diversification are associated with larger labor productivity, and the same holds for linear combinations of the two. Obviously, this is true on average, and a large degree of heterogeneity is present. However, the comparison of figure 10 with figure 9 allows us to conclude that the effect of standard diversification on labor productivity depends on its correlation with size and coherent diversification, which thus represents a better framework to discuss the effects of the structure of corporate technological portfolios on firm performance.

Figure 9: Labor productivity as a function of Diversification and Coherent Diversification. A graphical representation of what we pointed out in table 1: diversification loses its explanatory power in favor of coherent diversification when both are considered. Notice that, given a fixed value of diversification, labor productivity tends to increase with coherent diversification (i.e., from left to right, considering horizontal slices), while the opposite does not hold.
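A grid of the kind shown in figures 9 and 10 could be produced along the following lines. This is a hedged sketch: the paper colors cells by the firms' ranking in labor productivity, whereas the hypothetical helper `binned_mean` below simply averages a third variable inside each cell, and the input arrays are invented for the example.

```python
import numpy as np

def binned_mean(x, y, z, bins=10):
    """Mean of z in each cell of a 2D grid over (x, y); empty cells become NaN."""
    total, xedges, yedges = np.histogram2d(x, y, bins=bins, weights=z)
    count, _, _ = np.histogram2d(x, y, bins=[xedges, yedges])
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count  # 0/0 in empty cells gives NaN
    return mean, xedges, yedges

# Hypothetical inputs: four firms in two well-separated groups.
div_example = np.array([0.0, 0.0, 1.0, 1.0])
coh_example = np.array([0.0, 0.0, 1.0, 1.0])
prod_example = np.array([1.0, 3.0, 5.0, 7.0])
grid, _, _ = binned_mean(div_example, coh_example, prod_example, bins=2)
```

Cells that contain no firm come out as NaN, which corresponds to the white squares away from the main diagonal.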
In this section we briefly discuss the relevance of data resolution for the results of the empirical analysis. Even though in this paper we have dealt with corporate patent portfolios, and have thus focused our attention on individual economic agents instead of geographical regions, there is still room for the scaling of the data to have an effect on the results. The most straightforward way to explore this effect would be to vary the coarseness of the technological classification employed to define M. Though this exercise might lead to interesting explorations, we would like to concentrate on a more subtle channel through which the effect can be transmitted, namely the geographical scale used to define B. Notice in fact that when B enters equation 8 its size must match the number of columns of M, i.e. it must represent relatedness at the same technological scale at which the technological portfolios of the firms are defined.

Figure 10: Labor productivity as a function of Size and Coherent Diversification. Coherent diversification gives complementary information about firm performance.

However, there is no constraint on the geographical aggregation of the matrix from which B is defined. It is in fact possible to substitute M with another matrix in which firms are aggregated based on the country or region in which they are based and use it to compute a new relatedness matrix $\tilde{B}$, in which the starting geographical aggregation can be arbitrarily coarse.

Figure 11 shows that the geographical scale used to define the global relatedness of technological fields indeed has a deep influence on the observed global relatedness between technologies. In fact, there is a striking difference between the right and the left panel, which represent respectively B from equation 7 and $\tilde{B}$ computed from $\tilde{M}$. We point out that we used the IPC classification to order rows and columns.
Notice that the same technological codes are much more clustered in the latter case (country-based aggregation) than in the former (in which we work at the firm level). This finding is not surprising, given that defining technological relatedness at the national level means considering the technological portfolios of extremely differentiated entities, which, by definition, can explore a much larger set of combinations and hence better highlight the true global relatedness structure between technological fields. On the other hand, as large as individual companies can be, they are necessarily constrained in the breadth of their output basket and their reference market, and as a consequence they will also be limited to the development of combinations of technological capabilities needed to effectively produce that relatively narrow set of goods and services.

Figure 11: The structure of the relatedness matrix B changes if it is built from data aggregated at different geographical levels. The left panel represents $\tilde{B}$ computed from an aggregated version of M, say $\tilde{M}$, in which rows no longer index individual firms, but rather the nations in which such firms reside. The right panel represents B from equation 7, the relatedness measure used throughout the paper. The same technological codes are much more clustered in the former case than in the latter.
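The aggregation step that turns the firm-level matrix into a country-level one can be sketched as follows. The text does not spell out how the aggregated matrix is built, so the details here are assumptions: the hypothetical helper `aggregate_by_country` sums the rows of all firms in a country and, by default, binarizes the result, so that a country is linked to a technology if any of its firms is. The country labels and the example matrix are invented.

```python
import numpy as np

def aggregate_by_country(M, country_of_firm, binarize=True):
    """Collapse a firms-by-technologies matrix into a countries-by-technologies one."""
    M = np.asarray(M, dtype=float)
    countries, idx = np.unique(np.asarray(country_of_firm), return_inverse=True)
    M_tilde = np.zeros((len(countries), M.shape[1]))
    np.add.at(M_tilde, idx, M)  # add each firm's row to its country's row
    if binarize:                # ASSUMPTION: a country is active if any of its firms is
        M_tilde = (M_tilde > 0).astype(float)
    return countries, M_tilde

# Hypothetical example: three firms, two of them based in the same country.
M_firms = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1]])
countries, M_tilde = aggregate_by_country(M_firms, ["IT", "IT", "US"])
```

The country-level relatedness matrix would then be obtained by applying equation 7 to `M_tilde` in place of M. In this example two technologies that never co-occur within a single firm become co-occurring at the country level, which illustrates why the aggregated matrix can look more clustered.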
The question however remains as to how different definitions of B affect the correlation between Γ and specific firm characteristics and, eventually, whether an optimal geographical scale exists at which to define the global technological relatedness matrix.

Table 2, which summarizes the regression of labor productivity against coherent diversification measured based on $\tilde{B}$, provides evidence in line with the hypothesis that geographical aggregation does have an influence, and that measuring diversification at the company level is a better suited starting point for measuring firm-level coherent diversification than a more aggregated definition. Notice that, as in table 1, coherent diversification is statistically significant in all settings and also explains productivity better than diversification. However, the explanatory power as measured by the R² is noticeably lower in the regressions of table 2, suggesting that coherent diversification does not work as well if defined at a coarser geographical scale.

VARIABLES         (0)   (1)   (2)   (3)
Size              0.082*** (0.056)   0.081*** (0.009)   0.082*** (0.009)
Diversification   -0.047 (0.071)   0.068*** (0.010)
Coherent Div.     0.148** (0.071)   0.089*** (0.013)   0.121*** (0.013)
R²

Table 2: Regressions of labor productivity against country-level coherent diversification, diversification, and size. Standard errors in parentheses.
This finding has clear implications for those studies that apply country-based measures such as Hidalgo et al. (2007) and Zaccaria et al. (2014) directly to firm-level data.
In this work we have presented a quantitative assessment of the relationship between corporate technological portfolios and firm performance. The idea is that successful companies shape their patenting activity on the basis of well-defined production lines, and that this strategic behavior can be traced by looking at corporate technological portfolios. In particular, we introduce a methodology to reconstruct an estimate of both the size and number of the coherent blocks of knowledge a firm owns, and we show that their average size is correlated with firm performance.

From a practical point of view, we have used a database of about 70 thousand firms, including their patenting activity, in order to define a bipartite companies-technologies network. A link is present if a firm is active in a given technological field, as reported by the codes in its submitted patents. Then, we have built a monopartite network of technological codes by applying a measure of relatedness initially conceived to uncover the common capabilities that countries should have to export a pair of products. This network can be used to assess the relative integration of technological activities within a firm. The result is the so-called coherent diversification, a weighted average of the relatedness of a firm’s technologies, which can be seen as a proxy of the average size (in terms of technological fields) of a firm’s coherent blocks of knowledge. We have found that coherent diversification explains firms’ performance, as measured by labor productivity, in a statistically significant way, even if standard diversification and size are used as controls. This finding has remarkable practical consequences: for instance, it points out that coherent diversification, and not diversification by itself, should be taken into account in mergers or acquisitions among companies.
Finally, we have presented a comparative analysis of the structure of the technological space when countries, and not firms, are considered as patenting entities. We have found that a better explanatory power of firms’ performance can be obtained if the aggregation is made at the firm level, which is therefore a more suitable scale for this kind of study.

This work opens up a number of possible further studies. For instance, in our analysis the production lines represent a hidden layer that can be proxied by pinpointing coherent blocks in corporate technological portfolios. When one directly analyzes products, these blocks should clearly emerge, giving rise to well-defined clusters possibly in agreement with the standard classification - while this may not be true for technological fields. The study of the different clustering behavior of products and technologies will be the subject of a forthcoming paper.

References
Balassa, B. (1965). Trade liberalisation and "revealed" comparative advantage. The Manchester School 33 (2), 99–123.
Berry, C. H. (1971). Corporate growth and diversification. The Journal of Law & Economics 14 (2), 371–383.
Bottazzi, G. and D. Pirino (2010). Measuring industry relatedness and corporate coherence. Available at SSRN 1831479.
Breschi, S., F. Lissoni, and F. Malerba (2003). Knowledge-relatedness in firm technological diversification. Research Policy 32 (1), 69–87.
Cohen, W. M., R. R. Nelson, and J. P. Walsh (2000). Protecting their intellectual assets: Appropriability conditions and why US manufacturing firms patent (or not). Technical report, National Bureau of Economic Research.
Dernis, H. and M. Khan (2004). Triadic patent families methodology.
Dosi, G., M. Faillo, and L. Marengo (2008). Organizational capabilities, patterns of knowledge accumulation and governance structures in business firms: an introduction. Organization Studies 29 (8-9), 1165–1185.
Engelsman, E. C. and A. F. van Raan (1994). A patent-based cartography of technology. Research Policy 23 (1), 1–26.
Fai, F. (2004). Technological diversification, its relation to product diversification and the organisation of the firm. University of Bath School of Management Working Paper Series.
Gort, M. (1962). Diversification and Integration in American Industry. National Bureau of Economic Research, Inc.
Granstrand, O., P. Patel, and K. Pavitt (1997). Multi-technology corporations: Why They Have Distributed Rather Than Distinctive Core Competencies. California Management Review 39 (4), 8–25.
Griliches, Z. (1990). Patent statistics as economic indicators: a survey. Technical report, National Bureau of Economic Research.
Hall, B. H., A. B. Jaffe, and M. Trajtenberg (2001). The NBER patent citation data file: Lessons, insights and methodological tools. Technical report, National Bureau of Economic Research.
Hausmann, R. and C. A. Hidalgo (2011). The network structure of economic output. Journal of Economic Growth 16 (4), 309–342.
Herfindahl, O. C. (1950). Concentration in the steel industry. Ph.D. thesis, Columbia University.
Hidalgo, C. A., B. Klinger, A.-L. Barabási, and R. Hausmann (2007). The product space conditions the development of nations. Science 317 (5837), 482–487.
Joo, S. H. and Y. Kim (2010). Measuring relatedness between technological fields. Scientometrics 83 (2), 435–454.
Knecht, M. (2013). Diversification, Industry Dynamism, and Economic Performance: The Impact of Dynamic-related Diversification on the Multi-business Firm. Springer Science & Business Media.
Leten, B., R. Belderbos, and B. Van Looy (2007). Technological diversification, coherence, and performance of firms. Journal of Product Innovation Management 24 (6), 567–579.
Martinez, C. (2010). Insight into different types of patent families.
Miller, D. J. (2004). Firms’ technological resources and the performance effects of diversification: a longitudinal study. Strategic Management Journal 25 (11), 1097–1119.
Montgomery, C. A. (1994). Corporate diversification. The Journal of Economic Perspectives 8 (3), 163–178.
Nesta, L. and P.-P. Saviotti (2006). Firm knowledge and market value in biotechnology. Industrial and Corporate Change 15 (4), 625–652.
Palepu, K. (1985). Diversification strategy, profit performance and the entropy measure. Strategic Management Journal 6 (3), 239–255.
Palich, L. E., L. B. Cardinal, and C. C. Miller (2000). Curvilinearity in the diversification–performance linkage: an examination of over three decades of research. Strategic Management Journal 21 (2), 155–174.
Pavitt, K. (1998). Technologies, products and organization in the innovating firm: what Adam Smith tells us and Joseph Schumpeter doesn’t. Industrial and Corporate Change 7 (3), 433–452.
Penrose, E. (1959). The theory of the growth of the firm. New York: John Wiley.
Penrose, E. T. (1960). The growth of the firm - a case study: the Hercules Powder Company. Business History Review 34 (1), 1–23.
Piscitello, L. (2000). Relatedness and coherence in technological and product diversification of the world’s largest firms. Structural Change and Economic Dynamics 11 (3), 295–315.
Ribeiro, S. P., S. Menghinello, and K. De Backer (2010). The OECD ORBIS database.
Rigby, D. L. (2015). Technological relatedness and knowledge space: entry and exit of US cities from patent classes. Regional Studies 49 (11), 1922–1937.
Rumelt, R. P. (1974). Strategy, structure, and economic performance.
Rumelt, R. P. (1982). Diversification strategy and profitability. Strategic Management Journal 3 (4), 359–369.
Rycroft, R. W. and D. E. Kash (1999). The complexity challenge: Technological innovation for the 21st century. Cengage Learning EMEA.
Strumsky, D., J. Lobo, and S. Van der Leeuw (2012). Using patent technology codes to study technological change. Economics of Innovation and New Technology 21 (3), 267–286.
Tacchella, A., M. Cristelli, G. Caldarelli, A. Gabrielli, and L. Pietronero (2012). A new metrics for countries’ fitness and products’ complexity. Scientific Reports 2.
Teece, D. J., R. Rumelt, G. Dosi, and S. Winter (1994). Understanding corporate coherence. Journal of Economic Behavior & Organization 23 (1), 1–30.
Zaccaria, A., M. Cristelli, A. Tacchella, and L. Pietronero (2014). How the taxonomy of products drives the economic development of countries. PLoS ONE 9 (12), 1–17.
Zhou, T., J. Ren, M. Medo, and Y.-C. Zhang (2007). Bipartite network projection and personal recommendation.