Dynamic change of gene-to-gene regulatory networks in response to SARS-CoV-2 infection
Yoshihisa Tanaka, Kako Higashihara, Mai Adachi Nakazawa, Fumiyoshi Yamashita, Yoshinori Tamada, Yasushi Okuno
DDynamic change of gene-to-gene regulatory networks in response toSARS-CoV-2 infection
Yoshihisa Tanaka , Kako Higashihara , Mai Adachi Nakazawa , Fumiyoshi Yamashita , Yoshinori Tamada (cid:0) , andYasushi Okuno (cid:0) Graduate School of Pharmaceutical Sciences, Kyoto University RIKEN Cluster for Science, Technology and Innovation Hub, Medical Sciences Innovation Hub Program Graduate School of Medicine, Kyoto University
Abstract
The current pandemic of SARS-CoV-2 has caused extensivedamage to society. The characterization of SARS-CoV-2 prpfileshas been addressed by researchers globally with the aim ofresolving this disruptive crisis. This investigation processis indispensable for an understanding of how SARS-CoV-2behaves in human host cells. However, little is known aboutthe systematic molecular mechanisms involved in the effectof SARS-CoV-2 infection on human host cells. Here, wehave presented gene-to-gene regulatory networks in response toSARS-CoV-2 using a Bayesian network model. We examinedthe dynamic changes of the SARS-CoV-2-purturbated networksestablished by our proposed framework for gene networkanalysis, revealing that interferon signaling gradually switchesto the subsequent inflammatory-cytokine signaling cascades.Furthermore, we have succeeded in capturing a COVID-19patient-specific network in which transduction of these signal-ings is coincidently induced. This enabled us to explore localregulatory systems influenced by SARS-CoV-2 in host cells moreprecisely at an individual level. Our panel of network analyseshas provided new insight into SARS-CoV-2 research from theperspective of cellular systems.
SARS-CoV-2 | gene network analysis | COVID-19 individual networkCorrespondence: [email protected],[email protected]
Introduction
The newly emerging coronavirus, severe acute respiratorysyndrome coronavirus 2 (SARS-CoV-2), has spread rapidlyover the world (1, 2), with more than 22,000,000 cases ofcoronavirus disease 2019 (COVID-19) and 790,000 deathsas of August 21, 2020 (3). This pandemic outbreak hasdrastically changed our society, and has compelled us to stayalert to the continuous risk of SARS-CoV-2 infection (4). Toovercome this dire situation, the development of novel drugsor vaccines is still an urgent global challenge. During thetherapeutic development process, the elucidation of cellularmechanisms is essential for the discovery of potential targets;the fundamental question to be solved is how SARS-CoV-2influences host cells and causes COVID-19 at the molecularlevel. However, these cellular mechanisms for COVID-19 arepoorly understood.High-throughput technologies have contributed to the ac-quisition of a large amount of “omics” data, which hasprovided comprehensive information on the cellular systems.These technologies have also featured during the currentresearch into SARS-CoV-2. Several reports have provided various clues to understanding the global cellular signaturesin response to SARS-CoV-2 infection at both the proteomeand transcriptome level (5–7). Recently, network-basedapproaches have stimulated great interest in the use ofthese emerging omics data for drug discovery and systemsbiological analysis in the current SARS-CoV-2 field (8–13).Their major approaches combine publicly available sources,including knowledge of the already established pathwaysand drugs with these omics data to reconstruct molecularnetworks. However, these networks do not sufficientlyrepresent a real-world cellular system owing to two mainreasons: 1) public data consist of heterogeneous knowledgethat has been accumulated throughout the longstandingbiological researches; and 2) the previous works use mixednetworks that combine data from various samples, but cannotreflect an individual cell-/patient- specific network.To address these problems, we recently developed a methodto extract a core sample-specific network from a massivegene network generated from a Bayesian network (14).Gene regulatory network estimation has been developed asa prospective method to model the cellular system usingomics data (15–19). Although Bayesian network-based ap-proach can infer the cause-and-effect relationships betweengenes with transcriptome data, the key issue has been toextract biologically significant information from the huge andcomplicated network, which is often sarcastically referredto as a hairball (20). Our unique framework consistsof three steps: 1) estimation of a global gene network;2) extraction of context-specific core networks based ondifferences in molecular systems from the global network;and 3) identification of a sample or patient specific network(Fig. 1). The prominent advantage is that it enables us toidentify putative context-specific or sample-specific potentialsets of edges in the form of a network, i.e., gene-to-generelationships with directions, as well as nodes.In this study, by using our developed framework for genenetwork analysis, we have presented the core host cellularsystems involved in SARS-CoV-2 over several in vitro experiments; different viral loads, cell lines, and respi-ratory viruses. No studies have been performed on thecomputational data-driven gene regulatory network approachregarding SARS-CoV-2. We characterized interferon sig-naling and subsequent inflammatory signaling cascades assignificantly changed networks in human host cells, whichrepresent the innate antiviruses-immune system in responseto SARS-CoV-2 infection. In addition, given that the recentstudies have reported that patients with COVID-19 exhibit
Tanaka et al. | arXiv | August 24, 2020 | 1–15 a r X i v : . [ q - b i o . M N ] A ug OVID-19 patienthealthy 1healthy 2
STEP 2Individual network extraction - COVID-19 patient- healthy 1- healthy 2
COVID-19 biopsy RNA-seq ... - SARS-CoV-2- IAV- HPIV3- RSV respiratory viruses RNA-seqSTEP 1Bayesian network estimationbasal network STEP 3Individual network identification
Fig. 1.
Illustration of overview. The hairball (blue) is the basal network consisting of 127,126 edges and 15,258 nodes established by use of the respiratory viruses RNA-seqincluding SARS-CoV-2. The highlighted-network (magenta) in the basal network represents the COVID-19-perturbated network extracted by using the biopsy RNA-seq. various clinical outcomes depending on each patient, andthat a certain proportion of patients will experience a severedisease (21–23), it is much more important to reveal thecellular mechanisms causing these clinical symptoms at anindividual level. To this end, we have further identified thegene networks specifically for patients with COVID-19. Webelieve that our landscape of gene networks is beneficial toachieve an understanding of how cellular systems respond toSARS-CoV-2 and to further drug development.
Results
Estimation of basal gene network in the involvementof respiratory viruses infection using a Bayesiannetwork.
We first characterized a global gene network(hereafter referred to as the basal network ) using a Bayesiannetwork (see Methods) with a transcriptome dataset involvedin the engagement of respiratory virus infection, includingSARS-CoV-2, in several human cell lines (7). To determine abasal network structure, we performed a network estimationusing the neighbor node sampling and repeat algorithm (24),and screened the best algorithm parameters for the targetdataset, as described in our previous study (14). Briefly, thenetwork estimation was run three times independently, andthe subsequent concordance test was performed to ensure therobustness and stability of the estimated basal network. Weconfirmed that the iteration number T = 500 , satisfiesless than 5 % error (Error = 4.0 % for T = 500 , ; error =5.3 % for T = 300 , ). The final basal network comprised127,126 edges and 15,258 nodes, with a threshold of 0.05 andan average degree of 16.7. We used this final basal networkfor the subsequent analyses. Dynamics of host cellular network profiles in differentviral loads of SARS-CoV-2.
To examine the transition of host cellular system dynamics during the increase ofSARS-CoV-2 viral loads, we aimed to characterize thenetworks perturbated by SARS-CoV-2 with two viral loads; alow multiplicity of infection (MOI) of 0.2 and a high MOI of2 in A549 cells. We expected that cells exposed to differentviral loads each present a unique cellular system, and that ourapproach could capture the fluctuation of system dynamicsin whole cellular systems. To obtain differential core genenetworks for each viral load, we followed the multiplesteps using an edge quantification technique, called edgecontribution value (ECv), established in our previous study(14). We first calculated ∆ ECvs following the Eq. (2) where S = SARS-CoV-2 infected and T = mock samples for eachMOI condition (see Methods). The distributions of ∆ ECvshowed that the innate cellular system was more extensivelyperturbated in the cells exposed to the high MOI than thelow MOI (
Fig. 2A ). We next set a threshold of 1 for ∆ ECvand obtained differentially regulated edges (DREs) from thebasal network. The Venn analysis for the ∆ ECv-extractedDREs showed that the number of DREs in the high MOIwas larger than in the low MOI (
Fig. 2B ). Interestingly, thenumber of shared DREs between high- and low- MOIs was42, which was only 6 % of the total number of DREs in bothconditions, indicating that the underlying regulatory systembetween them was not similar. To confirm the biologicalinvolvement of the DREs, we performed canonical pathwayanalysis for the genes contained in the ∆ ECv-obtainedDREs, showing that these genes were associated with somecellular antiviral systems (
Fig. 2C ). These results supportedthat the components of the DREs were biologically relevantto viral infection.To gain a greater insight into the profiles of the DREsfrom the perspective of network topology, we next generatednetworks using a set of all the DREs in the Venn diagram(
Fig. 2B ). These DREs connected mutually and, in turn, et al. | SARS-CoV-2 network analysis low viral load high viral load
B 115 49242
SARS-CoV-2(low viral load) SARS-CoV-2(high viral load)
SARS-CoV-2 SARS-CoV-2 DC Hepatic Fibrosis / Hepatic Stellate Cell ActivationInterferon SignalingGranulocyte Adhesion and DiapedesisAcute Phase Response SignalingAgranulocyte Adhesion and DiapedesisComplement SystemToll-like Receptor SignalingRole of PKR in Interferon Induction and AntiviralResponseNeuroinflammation Signaling PathwayDefferential Regulation of Cytokine Productionin Macrophages and T Helper Cells by IK-17A and IL-17F
Fig. 2.
Dynamics of the SARS-CoV-2-perturbated network for different viral loads in host cells. ( A ) The histograms of ∆ ECv for different SARS-CoV-2 viral loads; a low MOIof 0.2 (blue) and a high MOI of 2 (magenta). The X-axis corresponds to the threshold for each ∆ ECv. The Y-axis shows the number of edges on a log scale. ( B ) The Venndiagram represents the numbers of differentially regulated edges (DREs) for two SARS-CoV-2 viral loads (blue: low MOI, magenta: high MOI) with a threshold of 1.0 for ∆ ECv in Fig. 2A. ( C ) The top 10 terms of canonical pathway analysis for the genes comprising a union set of ∆ ECv-extracted DREs in the Venn diagram analysis (Fig. 2B).( D ) The whole picture for the various sizes of subnetwork fragments is shown. Image of how the ∆ ECv-extracted DREs mutually connected and generated the subnetworks. generated various sizes of subnetwork fragments (
Fig. 2D ).We reasoned that if these fragments represented somebiological significance, these features should be reflected asmodularity, as biologically close functions in cellular systemslink together and shape modules (25). Hence, small-sizedfragments were likely to be less informative, and we focused on the biggest connected component among the variousfragments. The biggest connected component was extractedand the basal edges were additionally mapped on thisnetwork, which established the SARS-CoV-2-perturbatednetwork with 130 nodes and 305 edges (
Fig. 3 ). We foundthat this network consisted of clear three modules linking
Tanaka et al.et al.
Tanaka et al.et al. | SARS-CoV-2 network analysis arXiv | 3 odule 1module 2module 3
Fig. 3.
The SARS-CoV-2-perturbated host cellular network in response to different viral loads in A549 cells (the SARS-CoV-2-perturbated network). The network comprises130 nodes and 305 edges (including 155 basal edges). The colored solid edges represent the SARS-CoV-2-perturbated DREs; high MOI of 2 (magenta), low MOI of 0.2(blue), high MOI ∩ low MOI (purple). Dot edges represent the basal edges (grey). The nodes (green) represents the known drug target genes (Supplementary Table S1).The node size stands for the extent of outdegree. each other. One module (module 1, yellow marked region)was mainly formed of a set of the DREs in the low MOIcondition, and its constituent elements were many interferon(IFN)-stimulated genes (ISGs): IFIs, MXs, OASs, TRIMs,IFTMs, IRFs, and STATs. These highly orchestrated websof various ISGs are induced by transductions of both IFNsignaling and subsequent JAK/STAT signaling (26). Thisevidence strongly suggested that module 1 represents theconsequences of activation of both these signaling pathwaysby the acute antiviral response. Contrary to module 1, theother two modules (module 2, green marked region; andmodule 3, purple marked region) are mainly shaped by aset of the DREs in the high MOI condition. Module 2 andmodule 3 were found to comprise fewer IFN-related genes.While module 2 appeared to be a GAS5-centralized module,module 3 was composed of chemokines (CXCL1, CXCL2,CXCL3, CX3CL1, and CCL20), interleukins (IL6, IL1A,IL1B, and IL32), and colony-stimulating factors (CSF2and CSF3), which are implicated in inflammatory-relatedcytokine signaling followed by the acute activation of IFNand JAK/STAT signaling represented in module 1. Inparticular, the cluster of module 1 and module 3 probablyrepresents the transition of the gene regulatory system in response to SARS-CoV-2 infection. Namely, the cellularsystem perturbated by SARS-CoV-2 gradually switches toinflammatory signaling (module 3) via IFN and JAK/STATsignaling (module 1) as the viral load increased. Thiswas consistent with the clinical observations of COVID-19,and thus may partially explain the process of cytokinestorm syndromes, which is a severe clinical feature ofCOVID-19 (21, 23). We also performed the same analysesamong the four respiratory viruses, and found that module3 was exclusive for SARS-CoV-2 ( Supplementary Fig.S1 ). Collectively, we identified the SARS-CoV-2-perturbatednetwork and its three modules, which reflected distinctivehost cellular functions in response to SARS-CoV-2 infection.
Characterization of the SARS-CoV-2-perturbated net-work at individual sample level.
We next determined howthe signaling represented in the SARS-CoV-2-perturbatednetwork (
Fig. 3 ) changed across samples. To this end,we developed a novel quantitative method, called relativecontribution (RC), to measure the edge contribution at anindividual level. The mathematical definition of RC isdescribed in Methods. Within a set of pairwise parent-childrelations for a certain child, the RC captures how parent et al. | SARS-CoV-2 network analysis ock (low infection) SARS-CoV-2 (low infection)mock (high infection) SARS-CoV-2 (high infection)
Fig. 4.
Sample-specific individual networks around the GAS5-centralized module. The GAS5-centralized module (module 3) in the SARS-CoV-2-perturbated network(presented in Fig. 3) is displayed for four representative samples from each group (mock for SARS-CoV-2-infection (low MOI: 0.2), SARS-CoV-2-infected (low MOI: 0.2), mockfor SARS-CoV-2-infection (high MOI: 2), and SARS-CoV-2-infected (high MOI: 2)). RCs are represented as edge sizes to show individual differences. The node size standsfor the extent of outdegree. genes influence a child gene in response to the pairwiseparent’s mRNA expression, and it can therefore reveal localregulatory changes in response to SARS-CoV-2 infection atan individual sample level. To characterize the individualnetworks, we calculated RCs for 12 samples within fourgroups (mock × × × × Fig. 3 , and selected representative four samples fromeach group. By representing RCs as the sizes of edgewidths, we depicted these four sample-specific individualnetworks (
Supplementary Animation 1 ), and found thatthe vicinity of the GAS5-centralized module (module 2)drastically changed at an RC level (
Fig. 4 ). Interestingly,this module included GAS5, SNHG8, ZFAS1, SNORD52,SNORD58C, SNORA24, and LOC100506548, which en-code non-coding RNA (ncRNA) genes. Given that GAS5appears to function as a hub gene, these results suggested thatthe genes downstream of GAS5 are regulated by differentcellular systems in the mock and SARS-CoV-2 infectionsat a local system level. Especially, our results showed thatGAS5, ZFAS1, and SNHG8 were found to be dominant for SLC9B1 in SARS-CoV-2-infected samples comparedwith the mock, suggesting the regulatory system used wassignificantly different between them (
Fig. 4 ). GAS5 is asingle-strand long ncRNA and one study demonstrated thatthe mRNA expression of GAS5 was elevated in responseto hepatitis C virus infection and that GAS5 impaired virusreplication by the interaction between truncated-GAS5 andHCV NS3 protein in human cells (27). Combined withthis evidence, our results suggest the possibility that thisncRNA-related module 2 may play a novel clear role inSARS-CoV-2 infection.Conversely, of the four individual networks, the two networksfor mocks exhibited no significant change in RC (
Fig. 4and Supplementary Animation 1 ). This was consistentwith the prerequisite experimental designs as the mocksamples are supposed to exhibit the same behaviour, whichfurther supported the validity of our method. Moreover, theRC-highlighted edges displaying even small or no changesexplained that their local regulatory system, presented as aset of pairwise parent-child relationships for one child, didnot change between the individual samples. Collectively, ourdata have demonstrated that we can capture the local systemdifferences in network signaling at an individual level.
Tanaka et al. | SARS-CoV-2 network analysis arXiv | 5 dentification of specific individual networks for pa-tients with COVID-19.
Finally, we aimed to establishCOVID-19 individual networks with a human biopsy dataset(healthy: two samples; COVID-19 positive: two samples)on the basis of the estimated basal network model. Weexpect that the in vivo biopsy dataset potentially providesa more clinically relevant perspective compared with the invitro experiments. Usually, network estimation is impossiblewith such a small number of samples owing to the difficultyin acquisition of the robust network structure, yet ourapproach using the basal network model was capable ofgenerating a context-specific network, even with a fewsamples of a different dataset (
Fig. 1 ). By using the B -splineregression model of the Bayesian network acquired by theestimation of the basal network, we first computed the ECvfor the preprocessed biopsy dataset, despite the absenceof some genes compared with the dataset used for thebasal network estimation. To obtain DREs, we calculated ∆ ECv between healthy (regarded as control) and COVID-19samples following the Eq. (2) where S = healthy ( | S | =2 ) and T = COVID-19 ( | T | = 2 ) (see Methods). The ∆ ECvs were distributed over a broad range, and 4242 DREswere observed at the threshold of 1 for ∆ ECv (
Fig. 5A ).To extract more reliable DREs induced by COVID-19, weset the threshold of 2.3 corresponding approximately to log FC where FC=5, which resulted in 638 DREs. TheseDREs were mapped as networks and the biggest connectedcomponent (167 DREs) was depicted with inclusion of thebasal edges, generating the COVID-19-perturbated network,which comprised 127 nodes and 412 edges (
Fig. 5B ). Thisnetwork is supposedly a representation of the distinctivecellular system in patients with COVID-19. The pathwayanalysis of genes contained in this network showed that theywere involved in the immune and inflammatory response(
Fig. 5C ), supporting the consistency of our establishednetwork with the biological observations in COVID-19.To determine the signatures of the acquired DREs inthe COVID-19-perturbated network, we measured the ECvsimilarity for a set of the 167 DREs across the otherexperimental samples. This result showed that the ECvprofiles in COVID-19 were most similar to the sample ofhigh SARS-CoV-2 viral load in A549 cells in the vitroexperiments (
Supplementary Fig. S2A ), further supportingthat the obtained DREs were associated with SARS-CoV-2infection. We further explored the extent to which theCOVID-19-related 167 DREs overlapped with the Venndiagram established in
Fig. 2B . We observed that a moderatenumber of the DREs were shared by the cell models ofSARS-CoV-2 perturbation (
Fig. 5D ), and then these over-lapped edges were mapped onto the COVID-19-perturbatednetwork (
Supplementary Fig. S2B ). Unlike the networkobservations in
Fig. 3 , we found that both the ISG-relatedwebs (module 1) and subsequent cytokine signaling (module3) involved in inflammatory cascades were coincidentlypresent in the COVID-19-perturbated network, indicatingthat these two modules were continued to be mutuallyactivated in COVID-19. To uncover the differences in the local regulatory system,we next examined the profiles of the COVID-19-perturbatednetwork at an individual level using the RC method (
Fig. 1 ).As the two COVID-19 samples were originally derivedfrom a single patient who tested positive for COVID-19,we calculated RCs for three individuals (healthy 1, healthy2, and COVID-19 patient). The depiction of the RCas the edge sizes eventually led to the establishment ofthe COVID-19 patient-specific network, which is likely toshow how the cellular system changed in the patients withCOVID-19 compared with the healthy controls (
Fig. 6A ).The panel of three individual networks dramatically ex-hibited a great magnitude of differences, showing that thecellular regulatory systems were quite distinctive amongindividuals (
Supplementary Animation 2 ). In comparisonwith the SARS-CoV-2-perturbated network established bythe well-organized in vitro experiments using cell lines(
Supplementary Animation 1 ), this broad range of RCfluctuation for each in vitro sample probably reflects furtherdifferences among individuals. The representative regionswhere local regulatory systems are different among indi-viduals were illustrated in
Fig. 6B . In the zoom 1 region,PELI1 is a parent gene for both TNF and RGS16; these twosignals were dominant in the healthy individuals, but not inthe patient with COVID-19. In contrast, the zoom 2 and3 regions showed that local signals were clearly different,not only between the healthy patients and the patient withCOVID-19, but also even between two healthy individuals.
Discussion
Here, we have presented the host cellular gene networksperturbated by SARS-CoV-2 both in vitro and in vivo byusing our proposed framework for gene network analysis.As the networks we established to be associated withSARS-CoV-2 were generated through RNA-seq data, thesenetworks explain how genes are systematically regulated atthe transcriptome level. Although our approach depends onthe initial network estimation with an experimental datasetand may therefore risk the inclusion of false relationshipsor the exclusion of true relationships, we have succeededin capturing the biologically explainable immune responsesystems in human cells induced by SARS-CoV-2 at the levelof signaling networks.Sensing viruses cause an immune defense system in hostcells, and this induces acute IFN signaling activation fol-lowed by expression of IFNs. These IFNs amplify JAK/STATsignaling to promote the expression of various ISGs andaccelerate subsequent cytokine signaling (26). As illustratedin
Fig. 3 , the mutually interacting module of ISGs (module1) followed by IFN and JAK/STAT signaling was shown tobe an early response to SARS-CoV-2 infection. During theprocess of cells exposed to high SARS-CoV-2 viral loads,the signaling appears to move to the next stage, representedby inflammatory signaling, including the involvement ofvarious cytokines (
Fig. 3 ). The recent reported drug,dexamethasone, could be effective for severe patients withCOVID-19 by suppressing these orchestrated inflammatory et al. | SARS-CoV-2 network analysis
ARS-CoV-2(low viral load) COVID-19
SARS-CoV-2(high viral load)
Differential Regulation of Cytokine Production inMacrophages and T Helper Cells by IL-17A and IL-17FGranulocyte Adhesion and DiapedesisDifferential Regulation of Cytokine Production in Intestinal Epithelial Cells by IL-17A and IL-17FIL-17A Signaling in Gastric CellsActivation of IRF by Cytosolic PatternRecognition ReceptorsAgranulocyte Adhesion and DiapedesisSystemic Lupus ErythematosusIn B Cell Signaling PathwayRole of PKR in Interferon Induction andAntiviral ResponseCommunication between Innate andAdaptive Immune CellsIL-10 Signaling
COVID-19
A BC D
Fig. 5.
The COVID-19-perturbated network analysis. ( A ) The histograms of ∆ ECv for the biopsy dataset. The X-axis corresponds to the threshold for the ∆ ECv. The Y-axisstands for the number of edges with log scale. ( B ) The COVID-19-perturbated network is shown. The network is composed of 127 nodes and 412 edges (including 245basal edges). The colored solid edges represent DREs perturbated by COVID-19 (yellow). The dot edges represent the basal edges (grey). The nodes (green) representthe known drug target genes (Supplementary Table S1). The node size represents the extent of outdegree. ( C ) The top 10 terms of canonical pathway analysis for the genesin the COVID-19-perturbated network. ( D ) The Venn diagram shows the numbers of the ∆ ECv-extracted DREs ( ∆ ECv threshold 2.3) induced by COVID-19 perturbation forbiopsy dataset (yellow) overlapped with the two DREs through the Venn diagram analysis in Fig. 2B. signaling cascades (28). In this network, IL6 was locatedas a hub gene to regulate downstream cascades, includingchemokines and colony-stimulating factors, which are re-ported to be increased in patients with COVID-19 (21). Theweb of chemokines, such as CXCL1, CXCL2, and CXCL3,may represent how SARS-CoV-2-infected cells present asignal to induce leukocyte chemotaxis and infiltration. Thelocalization of ICAM1 in the vicinity of IL6 and chemokinesis supportive of this, as ICAM1 is known to be a scaffoldfor the accumulation of leukocytes at inflammatory sitesand its expression is regulated by cytokines, includingIL6 (29, 30). This tendency was also observed in thenetwork comparison analyses across four respiratory viruses,including SARS-CoV-2 (
Supplementary Fig. S1C ). Thesedata showed that IL6 was not exclusive to SARS-CoV-2, buta universal factor in response to respiratory viral infection,except for influenza A virus. Given that several studies havereported that tocilizumab, an inhibitor of the IL6 receptor,is a potential drug able to suppress the cytokine storm observed in many critical patients with COVID-19 (31, 32),the accumulated evidence strongly suggested that IL6 wouldbe a central regulator of the inflammatory cascade, even froma network perspective. In addition, our network showed thatCSF2 is regulated via various factors, including IL6, whichstrengthens previous reports suggesting that CSF2 may bea promising therapeutic target in combination with IL6 (33,34). Several recent studies have shown that ACE2 plays a keyrole in the process of SARS-CoV-2 infection. SARS-CoV-2enters into host cells via ACE2 (35), and ACE2 was found tobe an ISG in human airway epithelial cells (36). Consideringthat the SARS-CoV-2-perturbated network includes severalISGs (
Fig. 3 ), it can be reasoned that some clues regardingACE2 may be present in this network. In this context, wefound that ACE2 was closely located to this network andwas downstream of TNFRSF9, ATF3, and ARRDC3 viaACHE (data not shown); these are potential candidates forfurther investigation of the relationship between ACE2 andISGs. Thus, our networks provide promising information
Tanaka et al. | SARS-CoV-2 network analysis arXiv | 7 healthy 1 healthy 2COVID-19 z oo m z oo m z oo m zoom 1zoom 3 zoom 2 A Fig. 6.
Establishment of the COVID-19 patient-specific individual network. ( A ) The COVID-19 patient-specific network with RCs represented by edge sizes. The networkcomprises 127 nodes and 412 edges (including 245 basal edges). The colored solid edges represent DREs; SARS-CoV-2 (high MOI: 2) ∩ COVID-19-perturbated (magenta),SARS-CoV-2 (lowMOI: 0.2) ∩ COVID-19-perturbated (blue), SARS-CoV-2 (high MOI: 2) ∩ SARS-CoV-2 (low MOI: 0.2) ∩ COVID-19-perturbated (purple), COVID-19-perturbated exclusive edges (yellow). The dot edges represent the basal edges (grey). The nodes (green) represents the known drug target genes (Supplementary TableS1). The node size stands for the extent of outdegree. ( B ) Zoomed regions indicated in Fig. 6A for three individuals (healthy 1, healthy 2, and the patient with COVID-19).RCs are represented as edge sizes to show individual differences.8 | arXiv Tanaka et al. | SARS-CoV-2 network analysis o elucidate SARS-CoV-2 profiles from a broad biologicalperspective.Our second noteworthy outcome in this study was thatwe succeeded in the characterization of sample-specificindividual networks by introducing the new edge-quantitativetechnique of RC. In particular, although it is impossible toestimate a network with a small number of samples, suchas the four biopsy samples used in our case, the basalnetwork model that was already obtained through the analysisof the in vitro dataset with both RC and ECv methodshas led to establishment of the COVID-19 patient-specificindividual network. This process represents how we extrap-olate between in vitro and in vivo experiments. Differentsamples each exhibit a unique regulatory profile, especiallyin actual individuals (such as those obtained from biopsy)rather than well-controlled in vitro samples ( SupplementaryAnimation 1 and 2 ). These results probably reflect amore realistic clinical situation and increase the importanceof making the most of effective utilization of a biopsydataset. In the current outbreak of COVID-19, we need tolook into both biological and clinical aspects for exploringCOVID-19 therapy. The individual networks regardingCOVID-19 shows the extent to which individuals possesstheir own network, which ultimately links to the necessityof personalized treatment. Therefore, our efforts are apotential contribution to the emerging field of personalizedmedicine. The biopsy dataset that we used was not sufficientto allow interpretation of the comprehensive informationthrough individual networks in patients with COVID-19, as itcontained fewer COVID-19 samples. More clinical samplesfrom patients with COVID-19 can lead to the determinationof key regulatory systems at a clinical level. We hope that ourpanel of network analyses will be of help to the SARS-CoV-2research field and to establishment further treatments forCOVID-19.
Methods
Global gene network estimation and core networkextraction.
In general, methods for gene network analysisare intended for the extraction of gene-to-gene regulatoryrelationships universally underlying given transcriptomedatasets. Unlike commonly existing gene networks, werecently developed a method to extract sample-specific genenetworks. Our method first estimates a global gene network,called the basal network , that included all the genes in adataset using a Bayesian network with B -spline nonpara-metric regression (24, 37). Bayesian network estimationis capable of capturing global cause-and-effect relationshipsamong gene expression, rather than extracting locally co-regulated genes, such as co-expression correlation networks.This is realized by finding the conditional independenciesamong variables. In gene network analysis using a Bayesiannetwork, gene expression is regarded as an observed samplefrom the random variables that correspond to genes ortranscriptomes in a cell.Let X , . . . , X p be p variables of genes. In a Bayesiannetwork, we consider the joint density of p variables and assume that it is decomposed as the product of localconditional densities, such that f ( X , . . . , X p ; θ G ) = p Y j =1 f ( X j | X j , . . . , X j qj ; θ j ) , where j , . . . , j q j are indices of q j dependent variables of the j -th variable. This decomposition can be represented as adirected acyclic graph (network) and variables X j , . . . , X j qj are connected as parents or inputs of the j -th variable in thenetwork.The B -spline nonparametric regression version of theBayesian network models gene-to-gene expression relation-ships as mathematical equations using B -spline curves, suchthat x j = m ( j )1 ( x j ) + · · · + m ( j ) q j ( x j qj ) + ε j , (1) where x j represents an expression value of the j -th gene, ε j is the error term normally distributed with mean and variance σ j , and m ( j ) k ( x ) = P Mm =1 γ jkm b jkm ( x ) isa regression function using M B -spline basis functions b jkm ( · ) , and their coefficients γ jkm .The structure search of the Bayesian network corresponds tofinding the decomposition of the joint density. This is im-plemented by the maximization of the posterior probability,such that p ( G | X ) ∝ π ( G ) Z n Y i =1 f ( x i , · · · , x ip ; θ G ) π ( θ G | λ ) dθ G , where X is an n -by- p input matrix whose element x ij corresponds to an expression value of the i -th sample for the j -th gene, G represents the network structure, π ( G ) is theprior probability of G , θ G is the parameter vector of the localconditional densities, π ( θ G | λ ) is the prior distribution of θ G ,and λ is the hyperparameter vector. The difficulty in genenetwork estimation by the Bayesian network is the step ofstructure learning for large networks, because this is knownto be an NP-hard problem; namely, an exponential increasein search space to the number of variables. We have used theneighbor node sampling and repeat algorithm that realizesthe estimation of the large Bayesian network structure (24).It repeats the subnetwork estimation many times in parallelfor the sampled variable sets by random walking, and thus itcan estimate the large network within a realistic time.After the basal network estimation, we then quantifiedevery single edge with respect to a certain sample in termsof the system-level usage of the edge with the estimatedmathematical model. Tanaka et al. (14) defined an edgecontribution value (ECv) of edge j k → j as ECv ( u ) ( j k → j ) = m ( j ) k ( x ( u ) j k ) where x ( u ) j k represented the expression value of the j k -thgene at a certain sample denoted by u , and m ( j ) k ( · ) was aregression function defined in Eq. (1). Note that sample u did not necessarily have to be a single sample for use in the Tanaka et al. | SARS-CoV-2 network analysis arXiv | 9 etwork estimation. They proved that ECv can be used forthe quantification of edge j k → j with respect to a givensample. To extract sample-specific networks, they consideredthe differences of ECvs between two different conditions ofsamples, similar to extracting differentially expressed genesby comparing control and perturbated expressions. Theydefined ∆ ECv as ∆ ECv ( S,T ) ( j k → j ) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) | S | X s ∈ S ECv ( s ) ( j k → j ) − | T | X t ∈ T ECv ( t ) ( j k → j ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) , (2) where S and T are sets of samples observed in the particularconditions, respectively, in which | S | ≥ and | T | ≥ . Notethat in the case of | S | = | T | = 1 , this allows considerationof the differences between just two samples, e.g. controland perturbated samples. In general, we assume multiplereplicated samples or a set of individual samples for both S and T . By extracting edges and their connected nodes with ∆ ECvs greater or equal to a certain threshold, we can definethe sample- or condition-specific core network from the basalnetwork. In this study, S and T were sets of infected andcontrol (mock) replicated samples, respectively. As the targetdataset includes control samples for a particular series ofexperiments, we can extract certain core networks from themby calculating ∆ ECv for the series of experiments using theircorresponding control samples. For example, we extracteda SARS-CoV-2-perturbated core network by calculating ∆ ECvs for SARS-CoV-2-infected and their correspondingmock-triplicate samples. As performed in the previous study,we generally employed ∆ ECv ≥ . for the threshold of thecore network extraction. This approximately corresponds to2-fold changes in differentially expressed genes for extractedgenes. Thus, we considered the extracted networks, includingedges and nodes, which showed the significant activation ofregulatory systems by the infection. Proposed relative contribution of edges for character- izing individual networks.
This ECv development alloweda new solution for gene network analyses. In the previousstudy (14), we succeeded in characterizing network profilesby calculating ECvs for edges in a ∆ ECv-extracted corenetwork with respect to many samples from patients withcancer. The conventional clustering onto these calculatedECvs led to the identification of prognosis-related subgroups.Thus, we demonstrated that the differences and similarities inedge profiles of the network could be captured as patternsof ECvs. Despite the high availability of ECvs, it is stillimpossible to directly compare ECvs between individualsamples, because ECvs have different sizes depending on theestimated pairwise edge and the sample. The normalizationof ECvs across samples is inappropriate for our purposeowing to the mutual dependency of the individual networkon each sample. Thus, it has not been possible to highlightthe differences in regulatory systems at an individual level. For these reasons, ECvs are not appropriate to the analysisof individual networks. To overcome the drawbacks, wehave proposed a novel method, relative contribution (RC),to quantify edges with respect to individual samples usingthe estimated gene network model. We hypothesized thatthe differences of individual samples in terms of the cellularsystem can be observed as the differences in ratios of thecontributions of edges connecting to a certain node in thenetwork. Edges with different samples need to be describedas differently weighted edges according to ratios of effectsbetween parents that regulate or are connected to a certaingene. In addition, the quantification of a network with asingle sample needs to be independent from other samplesand their distributions. To realize this, we define the relativecontribution of an edge with respect to a sample as RC ( u ) ( j k → j ) = | ECv ( u ) ( j k → j ) | max ≤ k ≤ q j | ECv ( u ) ( j k → j ) | , where u represents a certain sample ( < RC ≤ ). That is, anRC of the edge is a relative strength of the contribution of theedge to the maximum strength among the parents connectingto the same child node. The reason why an RC is not dividedby the sum of the ECvs is that the range of RCs does notshrink depending on the number of parents of the child node.A drawback of RCs is that, if the ratio of ECvs of the parentsis not changed, the changes of parent values do not affect theRCs. However, RCs of their downstreams will be affected bysuch changes. Therefore, this drawback is not problematicin terms of the specification of differences in individualnetworks. Note that, similar to ECvs, sample u does notnecessarily need to be a single sample used for the networkestimation. As illustrated in Results, we have shown that RCscan be used to analyze individual networks, even if we havea single sample, or only a few samples, of gene expressiondata, as long as a basal network can be estimated from otherdatasets. RC therefore offers a significant enhancement toour framework for gene network analysis. Our data havedemonstrated that the framework, through an integration ofthe three key pieces – Bayesian network estimation, ECv,and RC – provides a powerful data-driven solution to seekbiological phenomena through cellular systems ranging froma global level to an individual level. Our proposed frameworkis mathematically illustrated in Supplementary Fig. S3 . Dataset.
The transcriptome dataset GSE147507 was down-loaded from NCBI Gene Expression Omnibus (7). Thesamples were infected with respiratory viruses, includingSARS-CoV-2, and biological replicates were performed. Wefirst selected samples exclusive for human RNA-seq with78 samples. Among the samples, four samples of the invivo experiment (biopsy) data were pre-eliminated. The log -transformed dataset was filtered to remove genes with amean percentile lower than 30 % , resulting in 74 samples and15,258 genes. This preprocessed dataset of the 74 ×
10 | arXiv Tanaka et al. | SARS-CoV-2 network analysis nd two COVID-19-positive samples). The RPM (reads permillion) normalized biopsy dataset was log -transformed andgenes with at least one zero value were removed to obtainmore reliable data. The two technical replicate samples forCOVID-19 were averaged for RC calculation. Following thispreprocessing, the input dataset for RC calculation finallycomprised a 3 × ∆ ECv calculations in this study were: SARS-CoV-2in A549 cells (MOI of 0.2/2 for 24 hr, n=3) and thecorresponding mock (n=3); SARS-CoV-2 in normal humanbronchial epithelial (NHBE) cells (MOI of 2 for 24 hr, n=3)and the corresponding mock (n=3); SARS-CoV-2 in Calu-3cells (MOI of 2 for 24 hr, n=3) and the corresponding mock(n=3); human respiratory syncytial virus (RSV) in A549 cells(MOI of 2 for 24 hr, n=3) and the corresponding mock (n=3);human parainfluenza virus 3 (HPIV3) in A549 cells (MOIof 2 for 24 hr, n=3) and the corresponding mock (n=3);influenza A virus (IAV) in A549 cells (MOI of 5 for 9 hr,n=2) and the corresponding mock (n=2); COVID-19 (n=2)and healthy (n=2).
Pathway analysis.
The canonical pathway analysis wasperformed through the use of Ingenuity Pathway Analysissoftware (38).
Network analysis and visualization.
The network visu-alization and the network analysis were performed usingCytoscape (version 3.7.2 and 3.8.0) (39). The genes forknown drug targets were acquired from IPA knowledgedatabase (38) and the representative drugs were listed in
Supplementary Table S1 . Computer environments.
All the computation for thenetwork estimation and the ECv calculations in this studywere performed by the SHIROKANE supercomputer sys-tem (Shirokane5) at Human Genome Center, the Instituteof Medical Science, the University of Tokyo, where thecomputation nodes were equipped with dual Intel Xeon Gold6154 3.0GHz CPUs and 192GB memory per node.
Data availability.
All the network files generated inthis study are provided on the supplementary data(http://ytlab.jp/suppl/tanaka_arxiv2020/index.html). Theprogram for network estimation is freely available forSHIROKANE users. The ECv/RC calculation program isavailable for non-commercial academic users upon request.
ACKNOWLEDGEMENTS
We thank M. Ikeguchi for technical support. The super-computing resourcewas provided by Human Genome Center, the Institute of Medical Science, theUniversity of Tokyo. This work was supported by RIKEN Junior Research AssociateProgram. This study was performed using funds provided by PRISM (Public/PrivateR & D Investment Strategic Expansion PrograM) in Japan. This paper format wasgenerated through self-modification of the original template designed by RicardoHenriques.
AUTHOR CONTRIBUTIONS
Y.T. (Yoshihisa Tanaka), Y.T.* (Yoshinori Tamada), and Y.O. conceived theexperiments, Y.T., K.H., and M.A.N. conducted the experiments, K.H. visualizedthe networks, Y.T., Y.T.*, and Y.O. analyzed the results, Y.T. and Y.T.* wrote themanuscript, M.A.N., F.Y., and Y.O. reviewed and edited the manuscript.
COMPETING FINANCIAL INTERESTS
Y. Tamada and Y. Okuno have a patent application on the individual profiling andthe method for extraction of a sample-specific network used in this study through the technology licensing organization in Kyoto University. Other authors declare noconflict of interest.
Bibliography
1. Na Zhu, Dingyu Zhang, Wenling Wang, Xingwang Li, Bo Yang, Jingdong Song, Xiang Zhao,Baoying Huang, Weifeng Shi, Roujian Lu, Peihua Niu, Faxian Zhan, Xuejun Ma, DayanWang, Wenbo Xu, Guizhen Wu, George F Gao, Wenjie Tan, and China Novel CoronavirusInvestigating and Research Team. A novel coronavirus from patients with pneumonia inchina, 2019.
N. Engl. J. Med. , 382(8):727–733, February 2020. ISSN 0028-4793, 1533-4406. doi: .2. Fan Wu, Su Zhao, Bin Yu, Yan-Mei Chen, Wen Wang, Zhi-Gang Song, Yi Hu, Zhao-WuTao, Jun-Hua Tian, Yuan-Yuan Pei, Ming-Li Yuan, Yu-Ling Zhang, Fa-Hui Dai, Yi Liu, Qi-Min Wang, Jiao-Jiao Zheng, Lin Xu, Edward C Holmes, and Yong-Zhen Zhang. A newcoronavirus associated with human respiratory disease in china.
Nature , 579(7798):265–269, March 2020. ISSN 0028-0836, 1476-4687. doi: .3. Ensheng Dong, Hongru Du, and Lauren Gardner. An interactive web-based dashboard totrack COVID-19 in real time.
Lancet Infect. Dis. , 20(5):533–534, May 2020. ISSN 1473-3099, 1474-4457. doi: .4. Joel Hellewell, Sam Abbott, Amy Gimma, Nikos I Bosse, Christopher I Jarvis, Timothy WRussell, James D Munday, Adam J Kucharski, W John Edmunds, Centre for theMathematical Modelling of Infectious Diseases COVID-19 Working Group, Sebastian Funk,and Rosalind M Eggo. Feasibility of controlling COVID-19 outbreaks by isolation of casesand contacts.
Lancet Glob Health , 8(4):e488–e496, April 2020. ISSN 2214-109X. doi: .5. David E Gordon, Gwendolyn M Jang, Mehdi Bouhaddou, Jiewei Xu, Kirsten Obernier,Kris M White, Matthew J O’Meara, Veronica V Rezelj, Jeffrey Z Guo, Danielle L Swaney,Tia A Tummino, Ruth Hüttenhain, Robyn M Kaake, Alicia L Richards, Beril Tutuncuoglu,Helene Foussard, Jyoti Batra, Kelsey Haas, Maya Modak, Minkyu Kim, Paige Haas,Benjamin J Polacco, Hannes Braberg, Jacqueline M Fabius, Manon Eckhardt, MargaretSoucheray, Melanie J Bennett, Merve Cakir, Michael J McGregor, Qiongyu Li, Bjoern Meyer,Ferdinand Roesch, Thomas Vallet, Alice Mac Kain, Lisa Miorin, Elena Moreno, Zun Zar ChiNaing, Yuan Zhou, Shiming Peng, Ying Shi, Ziyang Zhang, Wenqi Shen, Ilsa T Kirby,James E Melnyk, John S Chorba, Kevin Lou, Shizhong A Dai, Inigo Barrio-Hernandez,Danish Memon, Claudia Hernandez-Armenta, Jiankun Lyu, Christopher J P Mathy, TinaPerica, Kala Bharath Pilla, Sai J Ganesan, Daniel J Saltzberg, Ramachandran Rakesh,Xi Liu, Sara B Rosenthal, Lorenzo Calviello, Srivats Venkataramanan, Jose Liboy-Lugo,Yizhu Lin, Xi-Ping Huang, Yongfeng Liu, Stephanie A Wankowicz, Markus Bohn, MalihehSafari, Fatima S Ugur, Cassandra Koh, Nastaran Sadat Savar, Quang Dinh Tran, DjoshkunShengjuler, Sabrina J Fletcher, Michael C O’Neal, Yiming Cai, Jason C J Chang, David JBroadhurst, Saker Klippsten, Phillip P Sharp, Nicole A Wenzell, Duygu Kuzuoglu-Ozturk,Hao-Yuan Wang, Raphael Trenker, Janet M Young, Devin A Cavero, Joseph Hiatt,Theodore L Roth, Ujjwal Rathore, Advait Subramanian, Julia Noack, Mathieu Hubert,Robert M Stroud, Alan D Frankel, Oren S Rosenberg, Kliment A Verba, David A Agard,Melanie Ott, Michael Emerman, Natalia Jura, Mark von Zastrow, Eric Verdin, Alan Ashworth,Olivier Schwartz, Christophe d’Enfert, Shaeri Mukherjee, Matt Jacobson, Harmit S Malik,Danica G Fujimori, Trey Ideker, Charles S Craik, Stephen N Floor, James S Fraser,John D Gross, Andrej Sali, Bryan L Roth, Davide Ruggero, Jack Taunton, Tanja Kortemme,Pedro Beltrao, Marco Vignuzzi, Adolfo García-Sastre, Kevan M Shokat, Brian K Shoichet,and Nevan J Krogan. A SARS-CoV-2 protein interaction map reveals targets for drugrepurposing.
Nature , 583(7816):459–468, July 2020. ISSN 0028-0836, 1476-4687. doi: .6. Denisa Bojkova, Kevin Klann, Benjamin Koch, Marek Widera, David Krause, Sandra Ciesek,Jindrich Cinatl, and Christian Münch. Proteomics of SARS-CoV-2-infected host cells revealstherapy targets.
Nature , 583(7816):469–472, July 2020. ISSN 0028-0836, 1476-4687. doi: .7. Daniel Blanco-Melo, Benjamin E Nilsson-Payant, Wen-Chun Liu, Skyler Uhl, DaisyHoagland, Rasmus Møller, Tristan X Jordan, Kohei Oishi, Maryline Panis, David Sachs,Taia T Wang, Robert E Schwartz, Jean K Lim, Randy A Albrecht, and Benjamin R tenOever.Imbalanced host response to SARS-CoV-2 drives development of COVID-19.
Cell , 181(5):1036–1045.e9, May 2020. ISSN 0092-8674, 1097-4172. doi: .8. Jingwen Yan, Shannon L Risacher, Li Shen, and Andrew J Saykin. Network approachesto systems biology analysis of complex disease: integrative methods for multi-omics data.
Brief. Bioinform. , 19(6):1370–1381, November 2018. ISSN 1467-5463, 1477-4054. doi: .9. Maurizio Recanatini and Chiara Cabrelle. Drug research meets network science: Where arewe?
J. Med. Chem. , May 2020. ISSN 0022-2623, 1520-4804. doi: .10. Pietro H Guzzi, Daniele Mercatelli, Carmine Ceraolo, and Federico M Giorgi. Masterregulator analysis of the SARS-CoV-2/Human interactome.
J. Clin. Med. Res. , 9(4), April2020. ISSN 1918-3003, 2077-0383. doi: .11. Yadi Zhou, Yuan Hou, Jiayu Shen, Yin Huang, William Martin, and Feixiong Cheng.Network-based drug repurposing for novel coronavirus 2019-nCoV/SARS-CoV-2.
CellDiscov , 6:14, March 2020. ISSN 2056-5968. doi: .12. Deisy Morselli Gysi, Ítalo Do Valle, Marinka Zitnik, Asher Ameli, Xiao Gan, Onur Varol, HeliaSanchez, Rebecca Marlene Baron, Dina Ghiassian, Joseph Loscalzo, and Albert-LászlóBarabási. Network medicine framework for identifying drug repurposing opportunities forCOVID-19.
ArXiv , April 2020. ISSN 2331-8422.13. Paolo Fagone, Rosella Ciurleo, Salvo Danilo Lombardo, Carmelo Iacobello, Concetta IleniaPalermo, Yehuda Shoenfeld, Klaus Bendtzen, Placido Bramanti, and Ferdinando Nicoletti.Transcriptional landscape of SARS-CoV-2 infection dismantles pathogenic pathwaysactivated by the virus, proposes unique sex-specific differences and predicts tailoredtherapeutic strategies.
Autoimmun. Rev. , page 102571, May 2020. ISSN 1568-9972,1873-0183. doi: . Tanaka et al. | SARS-CoV-2 network analysis arXiv | 11
4. Yoshihisa Tanaka, Yoshinori Tamada, Marie Ikeguchi, Fumiyoshi Yamashita, and YasushiOkuno. System-Based differential gene network analysis for characterizing a Sample-Specific subnetwork.
Biomolecules , 10(2), February 2020. ISSN 2218-273X. doi: .15. Adam A Margolin, Ilya Nemenman, Katia Basso, Chris Wiggins, Gustavo Stolovitzky,Riccardo Dalla Favera, and Andrea Califano. ARACNE: an algorithm for the reconstructionof gene regulatory networks in a mammalian cellular context.
BMC Bioinformatics , 7 Suppl1:S7, March 2006. ISSN 1471-2105. doi: .16. Hiromitsu Araki, Yoshinori Tamada, Seiya Imoto, Ben Dunmore, Deborah Sanders, SallyHumphrey, Masao Nagasaki, Atsushi Doi, Yukiko Nakanishi, Kaori Yasuda, Yuki Tomiyasu,Kousuke Tashiro, Cristin Print, D Stephen Charnock-Jones, Satoru Kuhara, and SatoruMiyano. Analysis of PPAR α -dependent and PPAR α -independent transcript regulationfollowing fenofibrate treatment of human endothelial cells. Angiogenesis , 12(3):221–229,September 2009. ISSN 0969-6970, 1573-7209. doi: .17. Li Wang, Daniel G Hurley, Wendy Watkins, Hiromitsu Araki, Yoshinori Tamada, AnitaMuthukaruppan, Louis Ranjard, Eliane Derkac, Seiya Imoto, Satoru Miyano, Edmund JCrampin, and Cristin G Print. Cell cycle gene networks are associated with melanomaprognosis.
PLoS One , 7(4):e34247, April 2012. ISSN 1932-6203. doi: .18. Muna Affara, Debbie Sanders, Hiromitsu Araki, Yoshinori Tamada, Benjamin J Dunmore,Sally Humphreys, Seiya Imoto, Christopher Savoie, Satoru Miyano, Satoru Kuhara, DavidJeffries, Cristin Print, and D Stephen Charnock-Jones. Vasohibin-1 is identified as a master-regulator of endothelial cell apoptosis using gene network analysis.
BMC Genomics , 14:23,January 2013. ISSN 1471-2164. doi: .19. Arun J Singh, Stephen A Ramsey, Theresa M Filtz, and Chrissa Kioussi. Differential generegulatory networks in development and disease.
Cell. Mol. Life Sci. , 75(6):1013–1025,March 2018. ISSN 1420-682X, 1420-9071. doi: .20. Koon-Kiu Yan, Daifeng Wang, Anurag Sethi, Paul Muir, Robert Kitchen, Chao Cheng, andMark Gerstein. Cross-Disciplinary network comparison: Matchmaking between hairballs.
Cell Syst , 2(3):147–157, March 2016. ISSN 2405-4712. doi: .21. Chaolin Huang, Yeming Wang, Xingwang Li, Lili Ren, Jianping Zhao, Yi Hu, Li Zhang,Guohui Fan, Jiuyang Xu, Xiaoying Gu, Zhenshun Cheng, Ting Yu, Jiaan Xia, YuanWei, Wenjuan Wu, Xuelei Xie, Wen Yin, Hui Li, Min Liu, Yan Xiao, Hong Gao, Li Guo,Jungang Xie, Guangfa Wang, Rongmeng Jiang, Zhancheng Gao, Qi Jin, Jianwei Wang,and Bin Cao. Clinical features of patients infected with 2019 novel coronavirus in wuhan,china.
Lancet , 395(10223):497–506, February 2020. ISSN 0140-6736, 1474-547X. doi: .22. Wei-Jie Guan, Zheng-Yi Ni, Yu Hu, Wen-Hua Liang, Chun-Quan Ou, Jian-Xing He, Lei Liu,Hong Shan, Chun-Liang Lei, David S C Hui, Bin Du, Lan-Juan Li, Guang Zeng, Kwok-YungYuen, Ru-Chong Chen, Chun-Li Tang, Tao Wang, Ping-Yan Chen, Jie Xiang, Shi-Yue Li,Jin-Lin Wang, Zi-Jing Liang, Yi-Xiang Peng, Li Wei, Yong Liu, Ya-Hua Hu, Peng Peng,Jian-Ming Wang, Ji-Yang Liu, Zhong Chen, Gang Li, Zhi-Jian Zheng, Shao-Qin Qiu, JieLuo, Chang-Jiang Ye, Shao-Yong Zhu, Nan-Shan Zhong, and China Medical TreatmentExpert Group for Covid-19. Clinical characteristics of coronavirus disease 2019 in china.
N.Engl. J. Med. , 382(18):1708–1720, April 2020. ISSN 0028-4793, 1533-4406. doi: .23. Puja Mehta, Daniel F McAuley, Michael Brown, Emilie Sanchez, Rachel S Tattersall,Jessica J Manson, and HLH Across Speciality Collaboration, UK. COVID-19: considercytokine storm syndromes and immunosuppression.
Lancet , 395(10229):1033–1034,March 2020. ISSN 0140-6736, 1474-547X. doi: .24. Yoshinori Tamada, Seiya Imoto, Hiromitsu Araki, Masao Nagasaki, Cristin Print, D StephenCharnock-Jones, and Satoru Miyano. Estimating genome-wide gene networks usingnonparametric bayesian network models on massively parallel computers.
IEEE/ACMTrans. Comput. Biol. Bioinform. , 8(3):683–697, May 2011. ISSN 1545-5963, 1557-9964.doi: .25. Albert-László Barabási and Zoltán N Oltvai. Network biology: understanding the cell’sfunctional organization.
Nat. Rev. Genet. , 5(2):101–113, February 2004. ISSN 1471-0056.doi: .26. Emily V Mesev, Robert A LeDesma, and Alexander Ploss. Decoding type I and III interferonsignalling during viral infection.
Nat Microbiol , 4(6):914–924, June 2019. ISSN 2058-5276.doi: .27. Xijing Qian, Chen Xu, Ping Zhao, and Zhongtian Qi. Long non-coding RNA GAS5 inhibitedhepatitis C virus replication by binding viral NS3 protein.
Virology , 492:155–165, May 2016.ISSN 0042-6822, 1096-0341. doi: .28. RECOVERY Collaborative Group, Peter Horby, Wei Shen Lim, Jonathan R Emberson,Marion Mafham, Jennifer L Bell, Louise Linsell, Natalie Staplin, Christopher Brightling,Andrew Ustianowski, Einas Elmahi, Benjamin Prudon, Christopher Green, Timothy Felton,David Chadwick, Kanchan Rege, Christopher Fegan, Lucy C Chappell, Saul N Faust,Thomas Jaki, Katie Jeffery, Alan Montgomery, Kathryn Rowan, Edmund Juszczak,J Kenneth Baillie, Richard Haynes, and Martin J Landray. Dexamethasone in hospitalizedpatients with covid-19 - preliminary report.
N. Engl. J. Med. , July 2020. ISSN 0028-4793,1533-4406. doi: .29. A Meager. Cytokine regulation of cellular adhesion molecule expression in inflammation.
Cytokine Growth Factor Rev. , 10(1):27–39, March 1999. ISSN 1359-6101. doi: .30. Lauro Velazquez-Salinas, Antonio Verdugo-Rodriguez, Luis L Rodriguez, and Manuel VBorca. The role of interleukin 6 during viral infections.
Front. Microbiol. , 10:1057, May 2019.ISSN 1664-302X. doi: .31. Binqing Fu, Xiaoling Xu, and Haiming Wei. Why tocilizumab could be an effective treatmentfor severe COVID-19?
J. Transl. Med. , 18(1):164, April 2020. ISSN 1479-5876. doi: .32. Xiaoling Xu, Mingfeng Han, Tiantian Li, Wei Sun, Dongsheng Wang, Binqing Fu, YonggangZhou, Xiaohu Zheng, Yun Yang, Xiuyong Li, Xiaohua Zhang, Aijun Pan, and Haiming Wei.Effective treatment of severe COVID-19 patients with tocilizumab.
Proc. Natl. Acad. Sci. U.S. A. , 117(20):10970–10975, May 2020. ISSN 0027-8424, 1091-6490. doi: . 33. Yonggang Zhou, Binqing Fu, Xiaohu Zheng, Dongsheng Wang, Changcheng Zhao, YingjieQi, Rui Sun, Zhigang Tian, Xiaoling Xu, and Haiming Wei. Pathogenic t-cells andinflammatory monocytes incite inflammatory storms in severe COVID-19 patients.
Natl SciRev , 7(6):998–1002, June 2020. ISSN 2095-5138. doi: .34. Frederick M Lang, Kevin M-C Lee, John R Teijaro, Burkhard Becher, and John AHamilton. GM-CSF-based treatments in COVID-19: reconciling opposing therapeuticapproaches.
Nat. Rev. Immunol. , June 2020. ISSN 1474-1733, 1474-1741. doi: .35. Markus Hoffmann, Hannah Kleine-Weber, Simon Schroeder, Nadine Krüger, Tanja Herrler,Sandra Erichsen, Tobias S Schiergens, Georg Herrler, Nai-Huei Wu, Andreas Nitsche,Marcel A Müller, Christian Drosten, and Stefan Pöhlmann. SARS-CoV-2 cell entry dependson ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor.
Cell , 181(2):271–280.e8, April 2020. ISSN 0092-8674, 1097-4172. doi: .36. Carly G K Ziegler, Samuel J Allon, Sarah K Nyquist, Ian M Mbano, Vincent N Miao,Constantine N Tzouanas, Yuming Cao, Ashraf S Yousif, Julia Bals, Blake M Hauser, JaredFeldman, Christoph Muus, Marc H Wadsworth, 2nd, Samuel W Kazer, Travis K Hughes,Benjamin Doran, G James Gatter, Marko Vukovic, Faith Taliaferro, Benjamin E Mead, ZhiruGuo, Jennifer P Wang, Delphine Gras, Magali Plaisant, Meshal Ansari, Ilias Angelidis,Heiko Adler, Jennifer M S Sucre, Chase J Taylor, Brian Lin, Avinash Waghray, VanessaMitsialis, Daniel F Dwyer, Kathleen M Buchheit, Joshua A Boyce, Nora A Barrett, Tanya MLaidlaw, Shaina L Carroll, Lucrezia Colonna, Victor Tkachev, Christopher W Peterson,Alison Yu, Hengqi Betty Zheng, Hannah P Gideon, Caylin G Winchell, Philana Ling Lin,Colin D Bingle, Scott B Snapper, Jonathan A Kropski, Fabian J Theis, Herbert B Schiller,Laure-Emmanuelle Zaragosi, Pascal Barbry, Alasdair Leslie, Hans-Peter Kiem, Joanne LFlynn, Sarah M Fortune, Bonnie Berger, Robert W Finberg, Leslie S Kean, Manuel Garber,Aaron G Schmidt, Daniel Lingwood, Alex K Shalek, Jose Ordovas-Montanes, HCA LungBiological Network. Electronic address: [email protected], and HCA LungBiological Network. SARS-CoV-2 receptor ACE2 is an Interferon-Stimulated gene in humanairway epithelial cells and is detected in specific cell subsets across tissues.
Cell , 181(5):1016–1035.e19, May 2020. ISSN 0092-8674, 1097-4172. doi: .37. Seiya Imoto, Takao Goto, and Satoru Miyano. Estimation of genetic networks and functionalstructures between genes by using bayesian networks and nonparametric regression.
Pac.Symp. Biocomput. , pages 175–186, 2002. ISSN 2335-6936, 2335-6928.38. Andreas Krämer, Jeff Green, Jack Pollard, Jr, and Stuart Tugendreich. Causal analysisapproaches in ingenuity pathway analysis.
Bioinformatics , 30(4):523–530, February 2014.ISSN 1367-4803, 1367-4811. doi: .39. Paul Shannon, Andrew Markiel, Owen Ozier, Nitin S Baliga, Jonathan T Wang, DanielRamage, Nada Amin, Benno Schwikowski, and Trey Ideker. Cytoscape: a softwareenvironment for integrated models of biomolecular interaction networks.
Genome Res. ,13(11):2498–2504, November 2003. ISSN 1088-9051. doi: .
12 | arXiv Tanaka et al. | SARS-CoV-2 network analysis
AVSARS-CoV-2RSVHPIV3
A BCD
Hepatic Fibrosis / Hepatic Stellate Cell ActivationAcute Phase Response SignalingHepatic Fibrosis Signaling PathwayGranulocyte Adhesion and DiapedesisRole of Macrophages, Fibroblasts and Endothelial Cellsin Rheumatoid ArthritisOsteoarthritis PathwayDifferential Regulation of Cytokine Productionin Macrophages and T Helper Cells by IL-17A and IL-17FToll-like Receptor SignalingTNFR2 SignalingNeuroinflammation Signaling Pathway
Synaptogenesis Signaling PathwayIron homeostasis signaling pathwaySTAT3 PathwayGlycine Betaine DegradationCholesterol Biosynthesis ICholesterol Biosynthesis II(via 24,25-dihydrolanosterol)Cholesterol Biosynthesis III(via Desmosterol)Asparagine Biosynthesis ILanosterol BiosynthesisMechanisms of Viral Exit from Host Cells
Supplementary Fig. S1 . Network comparison analyses across four respiratory viruses. ( A ) The histograms of ∆ ECv for each respiratory virus as indicated: SARS-CoV-2(MOI: 2), HPIV3, IAV, and RSV. ∆ ECvs were calculated following the Eq. (2) where S = virus-infected and T = corresponding mock samples for each virus (see Methods).The X-axis corresponds to the threshold for each ∆ ECv. The Y-axis stands for the number of edges on a log scale. ( B ) The Venn diagram represents the numbers of ∆ ECv-extracted edges for all respiratory viruses with a ∆ ECv threshold of 1.0. ( C ) The respiratory viruses-shared network comprised of 62 nodes and 116 edges (including53 basal edges). The colored solid edges represent DREs; SARS-CoV-2 ∩ IAV ∩ HPIV3 ∩ RSV (purple), SARS-CoV-2 ∩ RSV ∩ HPIV3 (red), IAV ∩ RSV ∩ HPIV3 (blue).The top 10 terms of canonical pathway analysis for the genes of ∆ ECv-extracted DREs shared by at least three viruses in the Venn diagram (Supplementary Fig. S1B). ( D )SARS-CoV-2 specific network comprising 182 nodes and 295 edges (including 171 basal edges). The solid edges (magenta) represent DREs for SARS-CoV-2 (MOI: 2). Thedot edges represent the basal edges (grey). The size of the node represents the extent of outdegree. The nodes (green) are target genes for existing drugs (SupplementaryTable S1). The top 10 terms of canonical pathway analysis for the genes of ∆ ECv-extracted DREs exclusive for SARS-CoV-2 in the Venn diagram (Supplementary Fig. S1B).Tanaka et al.et al.
Supplementary Fig. S1 . Network comparison analyses across four respiratory viruses. ( A ) The histograms of ∆ ECv for each respiratory virus as indicated: SARS-CoV-2(MOI: 2), HPIV3, IAV, and RSV. ∆ ECvs were calculated following the Eq. (2) where S = virus-infected and T = corresponding mock samples for each virus (see Methods).The X-axis corresponds to the threshold for each ∆ ECv. The Y-axis stands for the number of edges on a log scale. ( B ) The Venn diagram represents the numbers of ∆ ECv-extracted edges for all respiratory viruses with a ∆ ECv threshold of 1.0. ( C ) The respiratory viruses-shared network comprised of 62 nodes and 116 edges (including53 basal edges). The colored solid edges represent DREs; SARS-CoV-2 ∩ IAV ∩ HPIV3 ∩ RSV (purple), SARS-CoV-2 ∩ RSV ∩ HPIV3 (red), IAV ∩ RSV ∩ HPIV3 (blue).The top 10 terms of canonical pathway analysis for the genes of ∆ ECv-extracted DREs shared by at least three viruses in the Venn diagram (Supplementary Fig. S1B). ( D )SARS-CoV-2 specific network comprising 182 nodes and 295 edges (including 171 basal edges). The solid edges (magenta) represent DREs for SARS-CoV-2 (MOI: 2). Thedot edges represent the basal edges (grey). The size of the node represents the extent of outdegree. The nodes (green) are target genes for existing drugs (SupplementaryTable S1). The top 10 terms of canonical pathway analysis for the genes of ∆ ECv-extracted DREs exclusive for SARS-CoV-2 in the Venn diagram (Supplementary Fig. S1B).Tanaka et al.et al. | SARS-CoV-2 network analysis arXiv | 13 B s i m il a r i t y Supplementary Fig. S2 . Multiple analyses for generating the COVID-19-perturbated network. ( A ) The similarity heatmap is shown and samples for comparisons are labeledas indicated. Similarity is calculated with cosine distance method for ECvs of the 167 DREs. ( B ) The COVID-19 patient-specific network in combination with the Venndiagram analysis (Fig. 5D). The network is composed of 127 nodes and 412 edges (including 245 basal edges). The colored solid edges represent DREs; SARS-CoV-2(high MOI: 2) ∩ COVID-19-perturbated (magenta), SARS-CoV-2 (low MOI: 0.2) ∩ COVID-19-perturbated (blue), SARS-CoV-2 (high MOI: 2) ∩ SARS-CoV-2 (low MOI: 0.2) ∩ COVID-19-perturbated (purple), COVID-19-perturbated exclusive edges (yellow). The dot edges represent the basal edges (grey). The nodes (green) represent the knowndrug target genes (Supplementary Table S1). The node size represents the extent of outdegree.14 | arXiv Tanaka et al. | SARS-CoV-2 network analysis ndividual networks sample A 𝑋 ! 𝑋 " 𝑋 𝑋 $ sample Bsample C 𝑋 ! 𝑋 " 𝑋 𝑋 $
1. Bayesian networkestimation
𝑃 𝑋 ! ,𝑋 " ,𝑋 ,𝑋 $ ,𝑋 % ,𝑋 & ,𝑋 ’ = 𝑃 𝑋 ! 𝑃 𝑋 " 𝑃 𝑋 𝑋 ! 𝑃 𝑋 $ 𝑋 ! ,𝑋 " 𝑃 𝑋 % 𝑃 𝑋 & 𝑋 𝑃 𝑋 ’ 𝑋 ,𝑋 $ ,𝑋 % = % ()!* 𝑃 𝑋 𝑃 + 𝑋 ( 𝑋 ! 𝑋 " 𝑋 𝑋 $ 𝑋 " 𝑋 𝑋 $ 𝑋 ! 𝑋 $(&) = 𝑚 ! 𝑋 !(&) + 𝑚 " 𝑋 "(&) + 𝑚 𝑋
3. RC calculation2. Core network extraction by ECv
ΔECv-extracted core network
𝑬𝑪𝒗 𝑿 → 𝑿 𝑬𝑪𝒗 𝑿 → 𝑿 𝑬𝑪𝒗 𝑿 → 𝑿 Nonparametric regression ΔECv calculation transcriptome
𝑹𝑪 ≅ 𝑬𝑪𝒗 𝑿 → 𝑿 ∶ 𝑬𝑪𝒗 𝑿 → 𝑿 ∶ 𝑬𝑪𝒗 𝑿 → 𝑿 Supplementary Fig. S3 . Mathematical illustration of our proposed framework for the gene network analysis. The centered hairball (blue) represents a basal network. Thenetwork (magenta) is a core network extracted by the ∆ ECv calculation.Tanaka et al.et al.