The natural polyphenol fortunellin and its structural analogs are inhibitors of the SARS-CoV-2 main proteinase dimerization, as revealed by molecular simulation studies
Athanasios A. Panagiotopoulos, Danai-Maria Kotzampasi, George Sourvinos, Marilena Kampa, Stergios Pirintsos, Elias Castanas, Vangelis Daskalakis
11 Natural polyphenols inhibit the dimerization of the SARS-CoV-2 main protease: the case of fortunellin and its structural analogs
Athanasios A. Panagiotopoulos * , Danai-Maria Kotzampasi * , George Sourvinos †, , Marilena Kampa* , , Stergios Pirintsos ‡,&, , Elias Castanas* , , Vangelis Daskalakis ¶,ǁ * Laboratory of Experimental Endocrinology, University of Crete, School of Medicine, Heraklion, Greece † Laboratory of Virology, University of Crete, School of Medicine, Heraklion, Greece ‡ Department of Biology, University of Crete, 71409 Heraklion, Greece & Botanical Garden, University of Crete, Rethymnon, Greece ¶ Department of Chemical Engineering, Cyprus University of Technology, Limassol, Cyprus Address all correspondence to:
Dr Elias Castanas, University of Crete, School of Medicine, Voutes University Campus, Heraklion, 71013, Greece ([email protected]) Dr Vangelis Daskalakis, Department of Chemical Engineering, Cyprus University of Technology, Limassol, 3041, Cyprus ([email protected])
Keywords SARS-CoV-2; COVID-19; Molecular Simulations; Metadynamics; Natural Products; Fortunellin
Abstract in silico , through binding and molecular dynamics simulations, the natural product space for the identification of candidates of 3CL-Pro dimerization inhibitors. We report that fortunellin (acacetin 7-O-neohesperidoside), a natural flavonoid O-glycoside, is a potent inhibitor of 3CL-Pro dimerization. A search of the ZINC natural products database identified another 16 related molecules, including apilin and rhoifolin, with interesting pharmacological properties. We propose that fortunellin and its structural analogs might be the basis of novel pharmaceuticals and dietary supplements against SARS-CoV-2 induced COVID-19 disease. Our findings are supported by the experimental literature.
Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was first identified in December 2019, and is the causative agent of coronavirus disease 2019 (COVID-19). It became a global pandemic, threatening the lives of millions of people belonging to sensitive health groups. Intense scientific effort worldwide resulted in the identification of SARS-CoV-2 genomic structure, viral protein sequence and structure and disease characteristics.[1,2] 3CL-Pro (or Mpro), the main protease of SARS-CoV-2, plays a key role in polyprotein processing. This enzyme plays a vital role in cleaving the large polyprotein 1ab (replicase 1ab, ∼
790 kDa) translated by the virus RNA, at 11 different sites, liberating proteins indispensable for viral replication and proliferation, with a unique specificity, not found in any human protease.[3] Therefore, it represents a preferential target for the development of a series of not toxic inhibitors. The individual monomers of SARS-CoV-2 3CL-Pro are inactive; thus, inhibitors of its dimerization are needed. The chemical space of natural products, and especially that of polyphenols, has provided valuable candidate lead molecules in a number of diseases, including COVID-19.[4] Here, we have interrogated this resource, in view of identifying potential candidates inhibiting the dimerization of SARS-CoV-2 3CL-Pro. Herein, Fortunellin (acacetin 7-O-neohesperidoside), a natural flavonoid O-glycoside, isolated from the fruits of
Citrus japonica var. margarita (kumquat),[5] was found as a lead compound, which, together with a series of 16 structurally related analogs, including apilin and rhoifolin, could be used as the basis for the design of novel antiviral compounds.
Material and Methods
Table S1 in the Supplemental Material). Here, we base our docking study on the crystal with reference 6YB7, in the absence of a ligand. For the study, ligands and their analogs were retrieved from the ZINC database (http://zinc.docking.org/).[6] Fully flexible ligand binding on 3CL- Pro by the GalaxyWeb server (http://galaxy.seoklab.org/), was performed as described previously.[7] Molecular Dynamics (MD) studies were based on two unligated crystals (6YB7, 6LU7) [8] to increase phase space sampling. Classical MD simulations of the 3CL-Pro monomer (60μs) described by the Amber03 force field were performed in GROMACS 2020. Retrieved trajectories were further analyzed by Markov state modeling (MSM).[9] Time-structure independent components analysis (tICA) was used to reduce the dimensionality of our data, in PyEMMA. MSM gave the conformational phase space of MPro, in two Collective Variables (CV-1/ CV-2) determined by the torsional angles of residues 3, 4, 5, 6, 84, 135, 141, 164, 167, 171, 175, 178, 179, 180, 190, 195, 217, 284, 285, 286, 290, 291, 300, and 301. Please, refer to the Supplemental Material for a detailed description of the methods used in this study and the detailed conformational space functions out of the MSM analysis.
Results and Discussion
The Classical MD of the homodimer of 3CL-Pro (10μs), followed by 20μs enhanced sampling (parallel tempering metadynamics at the well-tempered ensemble, PTmetaD-WTE method)[10], on the above CV phase space, revealed three conformations (
Figure 1A ), at the minima of the free energy surface (FES). The first conformation (C1) corresponds to the crystal structure of the dimer (monomer distance 1.72-1.78 Å), while the other two (C2-C3) correspond to looser dimer structures with higher (1.86-1.93 Å) distances among monomers, resulting in weakened monomer-monomer interactions. The monomer distances have been calculated based on the minimum distances between residues on the monomer-monomer interface (4, 10, 11, 14, 28, 139, 140, 147, 290, 298).[11] We utilized the FTMap server [12] to identify binding hot spots, determine drugability and provide information about fragment-based drug discovery on 3CL-Pro. The results of FTMap permit us to design a minimal structure of a potential 3CL-Pro dimerization inhibitor. Interrogating the ZINC database of natural products with this minimal structure, we have identified fortunellin (ZINC4349204,
Figure 1B ) as a potential natural inhibitor of 3CL-Pro dimerization. Binding of fortunellin on the 3CL-Pro monomer (
Figure 1C ) revealed a high affinity binding (ΔG -13.9 kcal/mol) using the fully flexible GalaxyWeb server, and strong interactions with 3CL-Pro amino acids (Leu , Asp , Asp , Val , Tyr , Gln , Lys , Tyr , Lys , Phe , Val , Arg , Asp , Phe , Cys , Asp , Leu and Glu ). Despite the fact that fortunellin does not bind at the dimerization interface, as defined by the residues 4, 10, 11, 14, 28, 139, 140, 147, 290, and 298, the binding might allosterically inhibit, or weaken the formation of the active dimer, as indicated by the large-scale classical and enhanced sampling MD simulations ( see below ). To increase the statistical significance of the docking studies, we have also used the different poses of the monomeric 3CL-Pro MD solutions for 60 μs, at 1 ns interval (poses were retrieved every 3 μs) as scaffolds for fortunellin binding. As shown, the affinity of fortunellin is fluctuating around -12.3 kcal/mol (SD ±1.005 kcal/mol), and, after initial fluctuations, preceding significant changes of the dimerization domain, as expressed by changes of the local RMSD value, as compared to the crystal structure of 3CL-Pro, it stabilizes after 30 μs (see Supplemental Material Table S2 ). In
Figures 1D-E , we present the MSM analysis of the equilibrium dynamics of MPro in the presence and absence of fortunellin, respectively on the MSM refined CV phase space (CV-1/ CV-2). The C1-C3 minima are assigned in these graphs based on structural feature comparison with
Figure 1A . Based on MSM, we calculated that the transitions between the different states (C1-C3) occur at the average time scales of 37.6ns (slow processes) and 3.5ns (fast processes) in the absence of fortunellin (
Figure 1F ). However, the average transition time scale in the presence of fortunellin drops to 1.42ns with only fast transitions (
Figure 1G ). This significantly smaller time-interval results in greater energy changes faster in the presence of fortunellin, blocking the trapping at certain states, and inhibiting the formation of the dimer by sampling non-favorable monomer conformations, i.e. with greater distance among monomers, that favor its dissociation. We note that the C1 minimum that is associated with the crystal structure (active for dimerization) in
Figure 1D , it is absent in
Figure 1E , indicating that fortunellin disfavors this state. Instead, C1 is replaced by an alternative protein state, indicated by the purple area in
Figure 1E . Figure 1 | A. The Free Energy Surface (FES) out of the PTmetaD-WTE enhanced sampling runs. Three minima of the 3CL-Pro dimer dynamics are identified at the blue regions (C1, C2, and C3) on the CV-1/ CV-2 phase space. B. The molecular structure of fortunellin. C. The interaction of fortunellin with the 3CL-Pro monomer. Fortunellin is shown in green, the interacting amino acids are shown in yellow and the dimerization interacting aminoacids are shown in blue. D. The identified states of 3CL-Pro in the absence of fortunellin within CV-1, CV-2 phase space. E. The identified states of 3CL-Pro in the presence of fortunellin within the refined CV-1, CV-2 phase space. The C1 to C3 minima are indicated on the D-E graphs based on the comparison with Figure 1A.
F-G.
The transition times between the C1 to C3 minima are calculated in the absence ( F ) and in the presence ( G ) of fortunellin. The 3CL-Pro C44-P52 loop has been proposed to host mutations, however only at the T45, S46, E47, L50 positions.[13] The latter are not listed as important residues in the 3CL-Pro/ fortunellin dynamics analyzed by MSM (see Supplemental Material). This gives us confidence that fortunellin can target 3CL-Pro, even when 3CL-Pro is mutated. It appears therefore that the natural polyphenol fortunellin is a good drug (or dietary supplement) candidate for combatting COVID-19 disease. The bulk of the scientific effort targeting 3CL-Pro focuses on the discovery or repurposing of inhibitors of its enzymatic activity. [14]. Here, we have instead directed our efforts toward the identification of (natural) compounds, which could inhibit the dimerization of the enzyme. Previous studies have implicated fortunellin as an activator of anti-oxidant enzymes (HO-1, SOD and CAT), through a direct action on Nrf2 and AMPK pathways, considered as important to protect against oxidative stress.[15] In addition, fortunellin was implicated as a cardioprotective factor in diabetic animals.[16] Here, we extend the actions of this agent, by reporting a direct effect of fortunellin, impairing the dimerization of 3CL-Pro SARS-CoV-2 protease, and therefore inhibiting SARS-CoV-2 replication and proliferation. Searching the ZINC database for natural products with fortunellin as a bait, we have retrieved 16 natural compounds with similar molecular properties, presented in
Table S3 (Supplemental Material) . Their absorption, distribution, metabolism, and excretion (ADME) characteristics were evaluated in the SwissADME site, and the results are shown in Supplemental Material
Table S4 . It derives that they are all water-soluble compounds, with limited enteric and skin absorption, but they are non-toxic, as they are not predicted to interact with the CYP drug metabolizing enzymes. Among them, apiin (ZINC3983878) and rhoifolin (ZINC3978800) have been previously studied for their biological effects. For further information please refer to the Supplemental Material section “Natural product Properties: the fortunellin – analogs”. Our in-silico strategy identified a series of natural flavonoids – polyphenols (fortunellin, apiin, rhoifolin), which, in the form of drugs or dietary supplements, might be an effective strategy against the devastating SARS-CoV-2 infections, in humans.
Acknowledgements
We acknowledge PRACE for awarding us access to Joliot-Curie at GENCI@CEA (Irene), France, through the “PRACE support to mitigate impact of COVID-19 pandemic” call and the project “Epitope vaccines based on the dynamics of mutated SARS-CoV-2 proteins at all atom resolution”. We also acknowledge Greece and the European Union (European Social Fund- ESF) for funding through the Operational Programme «Human Resources Development, Education and Lifelong Learning» in the context of the project “Strengthening Human Resources Research Potential via Doctorate Research” (MIS-5000432), implemented by the State Scholarships Foundation (ΙΚΥ)» to AAP (PhD scholarship) and a Hellenic Foundation for Research and Innovation (H.F.R.I.) Grant to MK (
TOC Graph:
References
1. Wu, F.; Zhao, S.; Yu, B.; Chen, Y.M.; Wang, W.; Song, Z.G.; Hu, Y.; Tao, Z.W.; Tian, J.H.; Pei, Y.Y.; et al. A new coronavirus associated with human respiratory disease in China.
Nature , , 265–269, doi:10.1038/s41586-020-2008-3. 2. Zhou, F.; Yu, T.; Du, R.; Fan, G.; Liu, Y.; Liu, Z.; Xiang, J.; Wang, Y.; Song, B.; Gu, X.; et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet , , 1054–1062, doi:10.1016/S0140-6736(20)30566-3. 3. Hilgenfeld, R. From SARS to MERS: crystallographic studies on coronaviral proteases enable antiviral drug design. FEBS J , , 4085–4096, doi:10.1111/febs.12936. 4. Suárez, D.; Díaz, N. SARS-CoV-2 Main Protease: A Molecular Dynamics Study. J. Chem. Inf. Model. , doi:10.1021/acs.jcim.0c00575. 5. Lou, S.N.; Lai, Y.C.; Hsu, Y.S.; Ho, C.T. Phenolic content, antioxidant activity and effective compounds of kumquat extracted by different solvents.
Food Chem , , 1–6, doi:10.1016/j.foodchem.2015.10.096. 6. Sterling, T.; Irwin, J.J. ZINC 15--Ligand Discovery for Everyone. J Chem Inf Model , , 2324–2337, doi:10.1021/acs.jcim.5b00559. 7. Panagiotopoulos, A.A.; Papachristofi, C.; Kalyvianaki, K.; Malamos, P.; Theodoropoulos, P.A.; Notas, G.; Calogeropoulou, T.; Castanas, E.; Kampa, M. A simple open source bio-informatic method for initial exploration of GPCR ligands’ agonistic/antagonistic properties. Pharmacol Res Perspect , (in press). 8. Duan, Y.; Wu, C.; Chowdhury, S.; Lee, M.C.; Xiong, G.; Zhang, W.; Yang, R.; Cieplak, P.; Luo, R.; Lee, T.; et al. A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations.
J Comput Chem , , 1999–2012, 0 doi:10.1002/jcc.10349. 9. Prinz, J.-H.; Wu, H.; Sarich, M.; Keller, B.; Senne, M.; Held, M.; Chodera, J.D.; Schütte, C.; Noé, F. Markov models of molecular kinetics: Generation and validation. J. Chem. Phys. , , 174105, doi:10.1063/1.3565032. 10. Bussi, G.; Gervasio, F.L.; Laio, A.; Parrinello, M. Free-Energy Landscape for β Hairpin Folding from Combined Parallel Tempering and Metadynamics. J. Am. Chem. Soc. , , 13435–13441, doi:10.1021/ja062463w. 11. Goyal, B.; Goyal, D. Targeting the Dimerization of the Main Protease of Coronaviruses: A Potential Broad-Spectrum Therapeutic Strategy. ACS Comb Sci , , 297–305, doi:10.1021/acscombsci.0c00058. 12. Kozakov, D.; Grove, L.E.; Hall, D.R.; Bohnuud, T.; Mottarella, S.E.; Luo, L.; Xia, B.; Beglov, D.; Vajda, S. The FTMap family of web servers for determining and characterizing ligand-binding hot spots of proteins. Nat Protoc , , 733–755, doi:10.1038/nprot.2015.043. 13. Bzowka, M.; Mitusinska, K.; Raczynska, A.; Samol, A.; Tuszynski, J.A.; Gora, A.; Bzówka, M.; Mitusińska, K.; Raczyńska, A.; Samol, A.; et al. Structural and Evolutionary Analysis Indicate That the SARS-CoV-2 Mpro Is a Challenging Target for Small-Molecule Inhibitor Design. Int J Mol Sci , , doi:10.3390/ijms21093099. 14. Callaway, E.; Ledford, H.; Mallaparty, S. Six months of coronavirus: The mysteries scientists are still racing to solve. Nature , , 178–179. 15. Cuadrado, A.; Manda, G.; Hassan, A.; Alcaraz, M.J.; Barbas, C.; Daiber, A.; Ghezzi, P.; Leon, R.; Lopez, M.G.; Oliva, B.; et al. Transcription Factor NRF2 as a Therapeutic Target for Chronic Diseases: A Systems Medicine Approach. Pharmacol Rev , , 348–383, doi:10.1124/pr.117.014753. 16. Zhao, C.; Zhang, Y.; Liu, H.; Li, P.; Zhang, H.; Cheng, G. Fortunellin protects against high fructose-1 induced diabetic heart injury in mice by suppressing inflammation and oxidative stress via AMPK/Nrf-2 pathway regulation. Biochem Biophys Res Commun , , 552–559, doi:10.1016/j.bbrc.2017.06.076. atural polyphenols inhibit the dimerization of the SARS-CoV-2 main protease: the case of fortunellin and its structural analogs Athanasios A. Panagiotopoulos * , Danai-Maria Kotzampasi * , George Sourvinos †, , Marilena Kampa* , , Stergios Pirintsos ‡,&, , Elias Castanas* , , Vangelis Daskalakis ¶,ǁ * Laboratory of Experimental Endocrinology, University of Crete, School of Medicine, Heraklion, Greece † Laboratory of Virology, University of Crete, School of Medicine, Heraklion, Greece ‡ Department of Biology, University of Crete, 71409 Heraklion, Greece & Botanical Garden, University of Crete, Rethymnon, Greece ¶ Department of Chemical Engineering, Cyprus University of Technology, Limassol, Cyprus Supplemental Material ontents
Methods ................................................................................................................................................ 3
Molecular Docking ............................................................................................................................. 3
SARS-CoV-2 main protease MPro and Ligand Preparation ............................................................ 3
Ligand-MPro docking ..................................................................................................................... 3
Simulation of MPro dimerization ................................................................................................... 3
Molecular Dynamics computational protocol ................................................................................... 4
Model setup ................................................................................................................................... 4
Equilibration-Production Molecular Dynamics setup .................................................................... 4
Markov State Model Analysis ........................................................................................................ 5
Enhanced Molecular Dynamics Sampling ...................................................................................... 9
Natural product properties: the fortunellin – analogs ........................................................................ 10
Supplemental Table S1 ........................................................................................................................ 11
Supplemental Table S2 ........................................................................................................................ 12
Supplemental Table S3 ........................................................................................................................ 14
Supplemental Table S4 ........................................................................................................................ 18
Supplemental Figure S1 ....................................................................................................................... 19
Supplemental Figure S2 ....................................................................................................................... 20
References ........................................................................................................................................... 21 ethods
Molecular Docking
SARS-CoV-2 main protease MPro and Ligand Preparation
The sequence of SARS-CoV-2 main protease
Ligand-MPro docking
SARS-CoV-2 main protease (in pdb format) and ligand(s) (in mol2 or pdb formats) were uploaded to the GalaxyWebserver (http://galaxy.seoklab.org/) and a fully flexible docking (involving the receptor and the ligand) was performed.[5–10]
The server uses an algorithm, based on the GalaxyDock2 docking,[11] which, after an automatic prediction of the ligand binding pocket, permits a full ligand/receptor flexibility during binding simulation. This step is followed by optimization and subsequent refinement, through a specific algorithm named GalaxyRefine,[6,12] which permits a protein-ligand structure refinement, by applying iterative side chain repacking and overall structure relaxation.[6,13]
Simulation of MPro dimerization
1) Dimerization of MPro: The active form of the MPro SARS-CoV-2 main protease, is a homodimer,[15] as a Cyclic-C2 (Global Symmetry) Homo-2-mer A2 (Global Stoichiometry).[16–19] Here, we used two different docking programs to simulate the active dimer and we compare the obtained solutions with the crystal 6YB7: (1) We used the GalaxyWEB server, routine HOMOMER (http://galaxy.seoklab.org/cgi-bin/submit.cgi?type=HOMOMER) for the MPro protein oligomer structure prediction from a monomer sequence or structure (Ologomeric state = 2). The server carries out template-based modeling, ab initio docking or both, depending on the availability of proper oligomer templates, until 5 models are generated.[5] (2) We used the Hex 8.0.8 program (http://hex.loria.fr/).[20,21] Hex 8.0.8, a specialized, locally executed, program, for protein-protein, or protein-nucleic acid interactions, based on spherical otated protein complexes, taking into account both surface shape and electrostatic charge. Hex returns, through a graphical user interface, a set of >100 solutions, with the corresponding ΔG values. The input files for Hex 8.0.8 program are pdb files of MPro with or without ligand and the output file is a pdb file that contains the simulated MPro dimer. The 3D structures of the monomer and dimer structure of MPro from experimental crystal structures, GalaxyWEB server and Hex 8.0.8 program were compared, using the UCSF Chimera 1.11.2 program. The same program was also used to determine the dimerization interphase of the monomers.
Molecular Dynamics computational protocol
Model setup
The crystal structures of SARS-CoV-2 main protease
MPro, or 3CL-PRO (pdb codes 6LU7[22] and 6YB7) were used as initial coordinates to build the models. The pdb structures refer to one monomer without and with inhibitor, respectively. For consistency, the inhibitor was removed from the crystal structure of MPro (6LU7).[22] Our choice of coordinates was based on the completeness of the resolved MPro sequence and the quality of chains (at least 90%). To build the MPro dimer, we structurally aligned either two 6lu7, or two 6yb7 monomers on a reference dimer.[23] The protonation states of titratable residues were simulated at neutral pH, thus all Glu, and Asp residues were left deprotonated, except Glu-290 which was protonated, in accordance also with the PDB2PQR (propka 3.0 method, pH 7.3) predictions.[24] His-41, His-163, His-172 and His-246 were protonated only at the Nε site. The rest of His residues were protonated only at the Νδ sites, to maintain the hydrogen bonding network within the crystal structures. All crystallographic water molecules are retained within each crystal structure. Four samples were, thus prepared, in a consistent way (two monomers, two dimers). The all-atom models, as defined previously, were embedded in triclinic boxes of around 7.2nm x 11.2nm x 8.0nm (monomer), or 12.3nm x 12.3 nm x 12.3 nm (dimer) in the x, y and z dimensions, respectively. Up to around 57000 TIP3P water molecules[25] were used to hydrate each protein. Ion (K + , Cl - ) concentration was set at the value of 150 mM to mimic the physiological salt content for the monomer or dimer models, in addition to zero, or excess concentrations of KCl, NaCl, CaCl at 0, 150, 300, 400 and 500 mM only for the monomer models. The various anionic strengths were only employed to indirectly ‘enhance’ the sampled conformational space of the MPro. A surplus of K + was also added to neutralize the protein charges in each sample, resulting in simulation unit boxes of 62400 (monomer), or 181800 (dimer) atoms. The Amber03[26] protein force field was used for the residues and ions. The Amber03 parameters for the natural products were derived based on ACPYPE algorithm.[27] Equilibration-Production Molecular Dynamics setup
The equilibration-relaxation for the all-atom systems is employed based on a published protocol for water-soluble proteins.[28] This contains a steepest descend energy minimization with a tolerance of 0.5 kJ mol -1 for 1000 steps, and a sequence of isothermal (nVT), isothermal-isobaric (nPT) runs with the gradual relaxation of the constraints on protein eavy atoms (from 10 in steps1-2 to 10 kJ mol -1 nm -2 in step-4) and Cα atoms (from 10 in step-5, to 10 in step-6, 10 in step-7, 1 in step-8 and 0 kJ mol -1 nm -2 in step-9) for around 30 ns, with a time step of 1.0 fs (steps 1-4) and 2.0 fs (steps 5-9). In detail: ( step-1 ) Constant density and temperature (nVT) Brownian dynamics (BD) at 100 K for 50 ps that employs the Berendsen thermostat,[29] with a temperature coupling constant at 1.0 fs. ( steps 2-3 ) Two short constant density (nVT) and constant pressure (nPT) runs for 100 ps each, with a weak coupling Berendsen thermostat and barostat[29] at 100 K employing time coupling constants of 0.1 ps for the temperature and isotropic 50.0 ps coupling for the pressure with a compressibility of 4.6x10 -5 . ( step-4 ) Heating from 100 to 250 K in a constant density ensemble (nVT) for 3 ns employing the v-rescale thermostat,[30] with a time coupling constant of 0.1 ps. ( step-5 ) Heating from 250 to 310K in a constant pressure ensemble (nPT) for 2 ns, employing the v-rescale thermostat[30] and Berendsen barostat,[29] with time coupling constants of 0.1 ps for the temperature and 2.0 ps for the pressure, removing also all but the Cα-atom protein position restraints. ( step-6 ) Equilibration at 310K (0.1 ps temperature coupling constant) for 5 ns (nPT, 1 atm, 2.0 ps coupling constant for pressure. ( steps 7-8 ) Equilibration at 310K (0.5 ps temperature coupling constant) for 5 ns (nPT, 1 atm, 2.0 ps coupling constant for pressure). ( step-9 ) Equilibration at 310K (0.5 ps temperature coupling constant) for 10 ns (nPT, 1 atm, 2.0 ps coupling constant for pressure). The barostats – thermostats employed for steps 6-9 were the same as in the production trajectories that follow. For the production all-atom classical Molecular Dynamics (MD), the Newton’s equations of motion are integrated with a time step of 2.0 fs at 310K. All production simulations are run with the leap-frog integrator in GROMACS 2020[31] for 3.0 μs each. They were performed at the constant pressure nPT ensemble, with isotropic coupling (compressibility at 4.5x10 -5 ) employing the v-rescale thermostat[30] (310K, temperature coupling constant 0.5) and the Parrinello - Rahman barostat[32] (1 atm, pressure coupling constant 2.0). Details for parameters can be found in earlier work.[28] The first 500 ns were considered further equilibration from each independent trajectory per sample, and were disregarded in the analysis. Van der Waals interactions were smoothly switched to zero between 1.0-1.2 nm with the VERLET cut-off scheme. Electrostatic interactions were truncated at 1.2 nm (short-range) and long-range contributions were computed within the PME approximation.[33,34] Hydrogen bond lengths were constrained employing the LINCS algorithm.[35]
Markov State Model Analysis
We obtained a series of MD equilibrium trajectories of the MPro monomers (60μs) of SARS-CoV-2 main protease under different salts/ ionic strengths, without inhibitors. These should have explored a major part of the MPro conformational phase space. We combined the all-atom MD simulations with Markov state model (MSM) theory[36–38] in order to enable the extraction of long-time-scale individual monomer dynamics from rather short-time-scale MD trajectories of different states. The application and accuracy of the powerful MSM theory has been presented in many cases also by experiments that include protein−protein, or protein-rug binding kinetics, as well as protein folding rates and protein dynamics.[39–42] Our objective was to approximate the slow dynamics in a statistically efficient manner. Thus, a lower dimensional representation of our simulation data was necessary. In order to reduce the dimensionality of our feature space, we employed the time-structure independent components analysis (tICA) which yields a representation of our molecular simulation data with a reduced dimensionality and can greatly facilitate the decomposition of our system into the discrete Markovian states necessary for MSM estimation. The conformations of the system were projected on these slowest modes as defined by the tICA method, then the trajectory frames were clustered into 100 cluster-centers (microstates) by k-means clustering, as implemented in PyEMMA.[43] Conformational changes of a system can be simulated as a Markov chain, if the transitions between the different conformations are sampled at long enough time intervals so that each transition is Markovian. This means that a transition from one conformation to another is independent of the previous transitions. Therefore, an MSM is a memoryless model. The uncertainty bounds were computed using a Bayesian scheme.[44,45] We found that the slowest implied timescales converge quickly and are constant within a 95% confidence interval for lag times above 50ns. The validation procedure is a standard approach in the MSM field., a lag time of 50 ns was selected for Bayesian model construction, and the resulting models were validated by the Chapman-Kolmogorov (CK) test. Subsequently, the resulting MSMs were further coarse grained into a smaller number of three metastable states or microstates, using PCCA++ as implemented in PyEMMA.[43] The optimum number of microstates (three) was proposed based on the VAMP2-score.[46] Both the convergence of the implied timescales, as well as the CK test confirm the validity and convergence of the MSM. The CK test indicates that predictions from the built MSM agree well with MSMs estimated with longer lag times. Thus, the model can describe well the long-time-scale behavior of our system within error. The tICA method identified the torsional angles of the following MPro residues: 3, 4, 5, 6, 84, 135, 141, 164, 167, 171, 175, 178, 179, 180, 190, 195, 217, 284, 285, 286, 290, 291, 300, and 301 as the most important features, by setting a series of thresholds for the coefficients in the tICA vectors. At first, we set a threshold of 0.09. We continued by setting a threshold of 0.04 for the coefficients in the tICA vectors of the filtered data and afterwards a threshold of 0.085. Finally, we set a threshold of 0.075 and thus we concluded in the previously referred MPro residues. For the selection of these thresholds we checked for different thresholds the VAMP2-score and the states projected onto the first two independent components. We report the exact functions of the first two tICA components, as a linear combination of cosine/ sine functions of the associated torsionals:
CV1 = (0.22245479085963174)*COS(PHE3PHI)+(-0.16733307515910162)*SIN(PHE3PHI)+(0.1432275835003262)*COS(PHE3PSI)+(0.2008868998696375)*SIN(PHE3PSI)+(0.19936685751147043)*COS(ARG4PHI)+(-0.09327647115497929)*SIN(ARG4PHI)+(-0.0930358300031665)*COS(ARG4PSI)+(-0.08538905470780045)*SIN(ARG4PSI)+(0.10166329580502668)*COS(LYS5PHI)+(-0.09568173783678022)*SIN(LYS5PHI)+(-0.09287774945984135)*COS(LYS5PSI)+(-.15790698786616095)*SIN(LYS5PSI)+(-0.0956189898931311)*COS(MET6PHI)+(0.025245999648597445)*SIN(MET6PHI)+(0.040877182731096175)*COS(MET6PSI)+(-0.0006973055782739627)*SIN(MET6PSI)+(0.12585421365826116)*COS(ASN84PHI)+(0.1980318831213358)*SIN(ASN84PHI)+(0.08785335907805712)*COS(ASN84PSI)+(-0.1678608618927701)*SIN(ASN84PSI)+(0.3615169358035203)*COS(THR135PHI)+(0.2355840179344413)*SIN(THR135PHI)+(0.06937744998780684)*COS(THR135PSI)+(-0.08624991337144639)*SIN(THR135PSI)+(0.06993246492427758)*COS(LEU141PHI)+(-0.023980684220480688)*SIN(LEU141PHI)+(-0.0028980923592885344)*COS(LEU141PSI)+(-0.11124570694911744)*SIN(LEU141PSI)+(0.12367941924553509)*COS(HIS164PHI)+(0.11002896260080477)*SIN(HIS164PHI)+(0.1736049601096124)*COS(HIS164PSI)+(-0.10071229902364524)*SIN(HIS164PSI)+(0.04589771207808335)*COS(LEU167PHI)+(0.03190381173082644)*SIN(LEU167PHI)+(0.010408062573537607)*COS(LEU167PSI)+(-0.001452825004362783)*SIN(LEU167PSI)+(-0.035449253177574046)*COS(VAL171PHI)+(0.0397831323983392)*SIN(VAL171PHI)+(-0.01487903934140085)*COS(VAL171PSI)+(-0.034102380825128384)*SIN(VAL171PSI)+(0.030847079909921624)*COS(THR175PHI)+(-0.020040648643118112)*SIN(THR175PHI)+(0.12177024626152966)*COS(THR175PSI)+(-0.13229839249150296)*SIN(THR175PSI)+(0.11507939509465313)*COS(GLU178PHI)+(0.101371439235159)*SIN(GLU178PHI)+(-0.09137875539957069)*COS(GLU178PSI)+(0.11214339094322172)*SIN(GLU178PSI)+(-0.09294920607798879)*COS(GLY179PHI)+(0.28997810094722837)*SIN(GLY179PHI)+(-0.27573006050671983)*COS(GLY179PSI)+(-0.09673320535071989)*SIN(GLY179PSI)+(0.010926172367439396)*COS(ASN180PHI)+(-0.24668920154301732)*SIN(ASN180PHI)+(-0.14912645633298552)*COS(ASN180PSI)+(0.3816102785675176)*SIN(ASN180PSI)+(-0.10964069317986454)*COS(THR190PHI)+(-0.009927774218497826)*SIN(THR190PHI)+(-0.16097973924205633)*COS(THR190PSI)+(-0.16639818413751062)*SIN(THR190PSI)+(0.08017482811083185)*COS(GLY195PHI)+(-0.14979887114706295)*SIN(GLY195PHI)+(-0.050958292897186695)*COS(GLY195PSI)+(-0.044354361874655926)*SIN(GLY195PSI)+(0.09987054135337944)*COS(ARG217PHI)+(-0.12648381947691048)*SIN(ARG217PHI)+(0.08158948319042672)*COS(ARG217PSI)+(-0.00021426690460247595)*SIN(ARG217PSI)+(0.21642336166830292)*COS(SER284PHI)+(-0.30155833144467264)*SIN(SER284PHI)+(0.8266406109913902)*COS(SER284PSI)+(0.8874880087814281)*SIN(SER284PSI)+(0.6649555215066218)*COS(ALA285PHI)+(0.44940445619426095)*SIN(ALA285PHI)+(-0.7837894441282054)*COS(ALA285PSI)+(-0.7618813787744232)*SIN(ALA285PSI)+(-0.41060021549010234)*COS(LEU286PHI)+(-0.10014769031761044)*SIN(LEU286PHI)+(0.19732367941536325)*COS(LEU286PSI)+(0.28641653714402343)*SIN(LEU286PSI)+(0.2011752565537491)*COS(GLU290PHI)+(0.08698680780975648)*SIN(GLU290PHI)+(-0.00570069571470415)*COS(GLU290PSI)+(0.06637101699247436)*SIN(GLU290PSI)+(0.22989634795972827)*COS(PHE291PHI)+(-0.17343642924439104)*SIN(PHE291PHI)+(-0.11571250082711405)*COS(PHE291PSI)+(0.02324511232530362)*SIN(PHE291PSI)+(-0.047478679669659275)*COS(CYS300PHI)+(0.05578968284901458)*SIN(CYS300PHI)+(-0.1683363737052317)*COS(CYS300PSI)+(-0.2973909304454135)*SIN(CYS300PSI)+(-.1038665536799251)*COS(SER301PHI)+(-0.019617745122528882)*SIN(SER301PHI)+(-0.06934626053471138)*COS(SER301PSI)+(-0.10587162831562456)*SIN(SER301PSI) CV2 = (-0.2884241825864551)*COS(PHE3PHI)+(0.1818110423100234)*SIN(PHE3PHI)+(-0.4677450443427827)*COS(PHE3PSI)+(-0.5483897439796918)*SIN(PHE3PSI)+(-0.3553107265200588)*COS(ARG4PHI)+(-0.22394962064864046)*SIN(ARG4PHI)+(-0.270179755725878)*COS(ARG4PSI)+(-0.3829402238055943)*SIN(ARG4PSI)+(-0.2042142888426065)*COS(LYS5PHI)+(-0.1669302788926349)*SIN(LYS5PHI)+(0.141973243784881)*COS(LYS5PSI)+(0.1421901057216299)*SIN(LYS5PSI)+(0.0846185601658726)*COS(MET6PHI)+(0.20450776328031908)*SIN(MET6PHI)+(-0.03312223507339774)*COS(MET6PSI)+(-0.08043115352475075)*SIN(MET6PSI)+(-0.027921418396744165)*COS(ASN84PHI)+(-0.08339140826693826)*SIN(ASN84PHI)+(-0.38318457142235807)*COS(ASN84PSI)+(-0.02014396174778721)*SIN(ASN84PSI)+(-0.261602759573732)*COS(THR135PHI)+(-0.22169542813755444)*SIN(THR135PHI)+(-0.087566095559111)*COS(THR135PSI)+(0.11762076691089288)*SIN(THR135PSI)+(-0.3469970284275915)*COS(LEU141PHI)+(0.34989926232122653)*SIN(LEU141PHI)+(-0.3353014777111922)*COS(LEU141PSI)+(-0.1099813590134305)*SIN(LEU141PSI)+(-0.030401089904733867)*COS(HIS164PHI)+(-0.12811671967132546)*SIN(HIS164PHI)+(-0.13436881803853637)*COS(HIS164PSI)+(0.17139820505530204)*SIN(HIS164PSI)+(-0.12200117274899505)*COS(LEU167PHI)+(-0.06149505441676782)*SIN(LEU167PHI)+(-0.06260851492073659)*COS(LEU167PSI)+(0.06084287820758976)*SIN(LEU167PSI)+(-0.006861594758294066)*COS(VAL171PHI)+(0.012827540881062702)*SIN(VAL171PHI)+(0.012464087205759892)*COS(VAL171PSI)+(-0.02648078463371077)*SIN(VAL171PSI)+(0.13636122184567023)*COS(THR175PHI)+(-0.16662872008089358)*SIN(THR175PHI)+(0.005026593046059509)*COS(THR175PSI)+(-0.020823512651369957)*SIN(THR175PSI)+(-0.15569457005305837)*COS(GLU178PHI)+(-0.12910762053786512)*SIN(GLU178PHI)+(0.195962425618426)*COS(GLU178PSI)+(-0.1674075865238855)*SIN(GLU178PSI)+(-0.05538839797914528)*COS(GLY179PHI)+(-0.2984068151828335)*SIN(GLY179PHI)+(0.4196490331550077)*COS(GLY179PSI)+(0.2097945401494899)*SIN(GLY179PSI)+(-0.021574332218825033)*COS(ASN180PHI)+(0.3273623222055997)*SIN(ASN180PHI)+(0.08717870414956289)*COS(ASN180PSI)+(-0.47359288885540496)*SIN(ASN180PSI)+(-0.04538558390983336)*COS(THR190PHI)+(-0.008683270082020167)*SIN(THR190PHI)+(0.271129910670838)*COS(THR190PSI)+(0.27549665651127636)*SIN(THR190PSI)+(0.0780842690641956)*COS(GLY195PHI)+(0.044233409296374225)*SIN(GLY195PHI)+(0.13353598315320972)*COS(GLY195PSI)+(0.04465259963400809)*SIN(GLY195PSI)+(0.027701779053662518)*COS(ARG217PHI)+(0.03397617632773808)*SIN(ARG217PHI)+(-0.055406779213062196)*COS(ARG217PSI)+(-0.15249136969809435)*SIN(ARG217PSI)+(0.15917442405868654)*COS(SER284PHI)+(0.07317395309230672)*SIN(SER284PHI)+(0.050094307429749026)*COS(SER284PSI)+(0.19617816860701598)*SIN(SER284PSI)+(0.1826732916695155)*COS(ALA285PHI)+(-0.035338601991516574)*SIN(ALA285PHI)+(-0.19924762948366187)*COS(ALA285PSI)+(-0.1564466277523662)*SIN(ALA285PSI)+(-0.0443686239963879)*COS(LEU286PHI)+(-0.10931048704118541)*SIN(LEU286PHI)+(0.45176938273388506)*COS(LEU286PSI)+(0.475120747963982)*SIN(LEU286PSI)+(0.1954750353189944)*COS(GLU290PHI)+(0.28698450313441914)*SIN(GLU290PHI)+(0.1178561297010931)*COS(GLU290PSI)+(-0.11293348016651641)*SIN(GLU290PSI)+(-0.23109368829696822)*COS(PHE291PHI)+(0.2905179445957612)*SIN(PHE291PHI)+(0.2414087396867373)*COS(PHE291PSI)+(-0.19323071296694158)*SIN(PHE291PSI)+(-0.06723387446469688)*COS(CYS300PHI)+(0.08472356860436044)*SIN(CYS300PHI)+(0.11286902793988102)*COS(CYS300PSI)+(0.33756361790231243)*SIN(CYS300PSI)+(0.07472554678618155)*COS(SER301PHI)+(-0.04927948273422832)*SIN(SER301PHI)+(0.13667888455911714)*COS(SER301PSI)+(0.1947392228687667)*SIN(SER301PSI)
The residues that seem to affect the MPro-fortunellin dynamics are derived based on the same MSM/ tICA protocol described before in the absence of the inhibitor. These include: 44, 48, 53, 82, 83, 84, 111, 112, 118, 137, 138, 139, 141, 159, 182, 238, 239, 240, 286, 287, 288, 289, 291. At first, we set a threshold of 0.075. We continued by setting a threshold of 0.12 for the coefficients in the tICA vectors of the filtered data, a threshold of 0.18, a threshold of 0.065 and afterwards a threshold of 0.135. Finally, we set a threshold of 0.125 and thus we concluded in the previously referred residues. However, the MPro states in the presence of fortunellin, reported in Figure 1E of the main manuscript, are derived based on the projection of the Mpro-fortunellin trajectory data on the aforementioned phase space of the CV1, CV2 functions.
Enhanced Molecular Dynamics Sampling
To enhance the conformational sampling on the MPro dimer we employed the parallel tempering metadynamics in the well-tempered ensemble (PTmetaD-WTE) method.[47–50] Nine replicas per sample were run at 310, 320, 330, 341, 352, and 363K, 375K, 387K and 400K in which only the potential energy (PE) was initially biased (bias factor 120) to achieving large fluctuations in PE and replica overlaps. Replicas were allowed to exchange every 1000 steps for 0.2μs each, which gave an exchange probability around 20% in the WTE. The obtained bias was saved and used for the subsequent PTmetaD production runs for another 0.5μs per sample/ replica. Nine replicas were again considered at the same temperatures. Including the equilibration time at reach replica, a cumulative simulation time of 10μs was achieved. An exchange was attempted every 1000 steps, that gave an exchange probability between replicas at around 20%, consistent with the large sample sizes. The Collective Variables (CVs) chosen for the PTmetaD runs were the first two tICA vectors presented above (CV1/ CV2). A combination of the GROMACS 2020/ PLUMED 2.5[51] engines was employed. A bias factor of 25 at the well-tempered ensemble, along with Gaussians of 1.2 kJ/mol initial height, and sigma values (width) of 0.25 in the CV space, deposited every 2 ps, was employed. The grid space for both CVs is defined between -4 to 4 at a resolution of 0.05. Four different PTmetaD-WTE runs were performed for the pdb 6yb7/ 6lub7 -based dimers at 150mM KCl without inhibitor.
Natural product properties: the fortunellin – analogs
Apiin (apigenin-7-apioglucoside), is a natural flavonoid, a diglycoside of the flavone apigenin, isolated from leaves of
Apium graveolens var. dulce (celery) and
Petroselinum crispum (parsley). Apiin extracted from celery exhibited anti-inflammatory properties, as apiin showed strong inhibitory activity on inducible nitric oxide synthase (iNOS) expression and nitrite (NO) production when added before LPS stimulation of J774.A1 cells.[52] In mice models, apiin had a remarkable scavenging activity on maleic dialdehyde (MDA) and lipofuscin (LPF), promoted total antioxidant capacity (TAOC) and significantly enhanced the activities of superoxide dismutase (SOD), glutathione peroxidase (GSH-Px) and catalase (CAT),[53] by exerting a radical scavenging activity greater than that of absorbic acid[54] and Vitamin E.[53] Apiin also showed a marked antihypertensive effect in experimental pulmonary hypertension in dogs [55] and anti-influenza virus activity in vitro through inhibition of neuraminidase (NA).[56] Besides, the role of apiin cardiovascular activity as antiarrhythmic and anti-ischemic agent has also been reported.[57] In view of our results, apiin might therefore be a strong drug candidate, as it inhibits SARS-CoV-2 virus and tackling also COVID-19 major disease symptoms. Rhoifolin (apigenin 7-O-neohesperidoside), is a neohesperidoside, a dihydroxyflavone and a glycosyloxyflavone, was first isolated from plant
Rhus succedanea (Sumac or wax tree, originating from Asia, but also found in Australia and New Zealand).[58] Resently, rhoifolin was found to efficiently block the enzymatic activity of SARS-CoV 3CL-Pro,[59] with a methodology similar to that used in the present study. In addition, rhoifollin was reported to inhibit CVB3 infection, a primary cause of viral myocarditis in humans. In addition, it was found to decrease inflammation, by significantly decreasing prostaglandin E2 and the release of pro-inflammatory cytokines (TNF-α, IL-1β, and IL-6).[60,61] Rhoifolin isolated from
Citrus grandis leaves was beneficial in metabolic diseases, including type II diabetes, by enhancing adiponectin secretion, tyrosine phosphorylation of insulin receptor-β and GLUT4 translocation.[62] Rhoifolin also caused a decrease of mean aortic pressure, of the arterial and pulmonary capillary pressure and of heart rate in the dog.[63] Moreover, previous study has emonstrated that rhoifolin can have an inhibitory effect on angiotensin-converting enzyme (ACE) activity, which plays a key role in the regulation of arterial blood pressure.[63]
Supplemental Table S1
PDB codes of 3CL-PRO crystals, deposited in the PDB databank, accessed at July 3, 2020. upplemental Table S2
Table presents the global and local RMSD values, at the dimerization domain) of 3CL-Pro monomer, at different time-frames, as determined by molecular dynamics. The interaction of fortunellin in each pose is also shown, as changes of the Gibbs free energy changes (ΔG). Model Time (μs) RMSD Total RMSD Local ΔG (kcal/mol) 0 0 0.000 0.000 -13.936 1 3 1.637 1.235 -11.199 2 6 2.248 3.290 -12.144 3 9 1.954 1.515 -10.704 4 12 1.421 1.292 -14.213 5 15 1.756 1.085 -11.611 6 18 1.706 0.998 -11.987 7 21 2.292 2.608 -14.658 8 24 2.694 2.757 -13.699 9 27 1.845 1.230 -13.563 10 30 1.821 1.268 -11.468 11 33 2.031 1.344 -12.760 12 36 1.924 1.604 -12.518 13 39 1.888 1.269 -11.579 14 42 2.544 2.686 -12.165 15 45 1.897 1.463 -11.507 16 48 2.122 1.883 -12.423 7 51 1.965 1.420 -11.478 18 54 1.924 1.534 -11.997 19 57 1.915 1.263 -12.070 20 60 1.874 1.385 -12.660 upplemental Table S3
ZINC number, formula, SMILES and binding affinity for 3CL-Pro (as expressed by Gibb’s free energy, ΔG, change in kcal/mol) of ther sixteen (16) compounds from the ZINC database for natural products.
Compound
2D Structure
Smiles ΔG (Kcal/mol)
ZINC4349204 (Fortunellin) COc1ccc(-c2cc(=O)c3c(O)cc(O[C@@H]4O[C@@H](CO)[C@@H](O)[C@@H](O)[C@H]4O[C@@H]4O[C@H](C)[C@H](O)[C@H](O)[C@H]4O)cc3o2)cc1 -13.936 ZINC4349029 O=c1cc(-c2ccc(O)cc2)oc2cc(O[C@@H]3O[C@@H](CO)[C@@H](O)[C@H](O)[C@H]3O[C@@H]3OC[C@@](O)(CO)[C@H]3O)cc(O)c12 -14.241 ZINC4349031 O=c1cc(-c2ccc(O)cc2)oc2cc(O[C@@H]3O[C@H](CO)[C@@H](O)[C@@H](O)[C@H]3O[C@@H]3OC[C@@](O)(CO)[C@H]3O)cc(O)c12 -12.352 INC4349034 O=c1cc(-c2ccc(O)cc2)oc2cc(O[C@@H]3O[C@H](CO)[C@@H](O)[C@H](O)[C@H]3O[C@@H]3OC[C@@](O)(CO)[C@H]3O)cc(O)c12 -12.451 ZINC3983878 (Apiin) O=c1cc(-c2ccc(O)cc2)oc2cc(O[C@@H]3O[C@H](CO)[C@@H](O)[C@H](O)[C@H]3O[C@@H]3OC[C@](O)(CO)[C@H]3O)cc(O)c12 -12.604 ZINC4349207 COc1ccc(-c2cc(=O)c3c(O)cc(O[C@@H]4O[C@@H](CO)[C@@H](O)[C@@H](O)[C@H]4O[C@@H]4O[C@@H](C)[C@H](O)[C@H](O)[C@H]4O)cc3o2)cc1 -11.539 ZINC4349211 COc1ccc(-c2cc(=O)c3c(O)cc(O[C@@H]4O[C@@H](CO)[C@@H](O)[C@@H](O)[C@H]4O[C@@H]4O[C@H](C)[C@H](O)[C@@H](O)[C@H]4O)cc3o2)cc1 -11.545 INC4349214 COc1ccc(-c2cc(=O)c3c(O)cc(O[C@@H]4O[C@@H](CO)[C@@H](O)[C@@H](O)[C@H]4O[C@@H]4O[C@@H](C)[C@H](O)[C@@H](O)[C@H]4O)cc3o2)cc1 -12.150 ZINC4349623 C[C@H]1O[C@@H](O[C@H]2[C@H](Oc3cc(O)c4c(=O)cc(-c5ccc(O)cc5)oc4c3)O[C@@H](CO)[C@@H](O)[C@H]2O)[C@H](O)[C@@H](O)[C@H]1O -12.970 ZINC4349627 C[C@@H]1O[C@@H](O[C@H]2[C@H](Oc3cc(O)c4c(=O)cc(-c5ccc(O)cc5)oc4c3)O[C@@H](CO)[C@@H](O)[C@H]2O)[C@H](O)[C@@H](O)[C@H]1O -13.200 ZINC4349630 C[C@H]1O[C@@H](O[C@H]2[C@H](Oc3cc(O)c4c(=O)cc(-c5ccc(O)cc5)oc4c3)O[C@@H](CO)[C@@H] -12.229 O)[C@H]2O)[C@H](O)[C@H](O)[C@H]1O ZINC4349633 C[C@@H]1O[C@@H](O[C@H]2[C@H](Oc3cc(O)c4c(=O)cc(-c5ccc(O)cc5)oc4c3)O[C@@H](CO)[C@@H](O)[C@H]2O)[C@H](O)[C@H](O)[C@H]1O -12.937 ZINC4534057 C[C@H]1O[C@@H](O[C@H]2[C@@H](Oc3cc(O)c4c(=O)cc(-c5ccc(O)cc5)oc4c3)O[C@@H](CO)[C@@H](O)[C@H]2O)[C@@H](O)[C@@H](O)[C@H]1O -12.238 ZINC4534058 C[C@@H]1O[C@@H](O[C@H]2[C@@H](Oc3cc(O)c4c(=O)cc(-c5ccc(O)cc5)oc4c3)O[C@@H](CO)[C@@H](O)[C@H]2O)[C@@H](O)[C@@H](O)[C@H]1O -13.697 ZINC4534059 C[C@H]1O[C@@H](O[C@H]2[C@@H](Oc3cc(O)c4c(=O)cc(-c5ccc(O)cc5)oc4c3)O[C@@H](CO)[C@@H] -14.387 O)[C@H]2O)[C@@H](O)[C@@H](O)[C@@H]1O ZINC4534060 C[C@@H]1O[C@@H](O[C@H]2[C@@H](Oc3cc(O)c4c(=O)cc(-c5ccc(O)cc5)oc4c3)O[C@@H](CO)[C@@H](O)[C@H]2O)[C@@H](O)[C@@H](O)[C@@H]1O -11.160 ZINC3978800 (Rhoifolin) C[C@@H]1O[C@@H](O[C@H]2[C@H](Oc3cc(O)c4c(=O)cc(-c5ccc(O)cc5)oc4c3)O[C@H](CO)[C@@H](O)[C@@H]2O)[C@H](O)[C@H](O)[C@H]1O -11.961
Supplemental Table S4
ADME characteristics of 16 compounds from the ZINC database for natural products, evaluated with the SwissADME resource. [64]
See SwissADME.xlsx file upplemental Figure S1
Superposition of 10 different 3CL-PRO models, from the PDB database. RMSDs were calculated with the Chimera program.[14] upplemental Figure S2
Comparison of the crystal structure (pdb 6YB7) with the two simulated solutions obtained with the GalaxyWeb (routine homomer) [5] and the HEX program.[20,21] The distinct colors in each Figure shows the corresponding monomers. eferences
1. Arnold, K.; Bordoli, L.; Kopp, J.; Schwede, T. The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling.
Bioinformatics , , 195–201, doi:10.1093/bioinformatics/bti770. 2. Berman, H.M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.N.; Weissig, H.; Shindyalov, I.N.; Bourne, P.E. The Protein Data Bank. Nucleic Acids Res , , 235–242. 3. Sterling, T.; Irwin, J.J. ZINC 15--Ligand Discovery for Everyone. J Chem Inf Model , , 2324–2337, doi:10.1021/acs.jcim.5b00559. 4. O’Boyle, N.M.; Banck, M.; James, C.A.; Morley, C.; Vandermeersch, T.; Hutchison, G.R. Open Babel: An open chemical toolbox. J Cheminform , , 33, doi:10.1186/1758-2946-3-33. 5. Baek, M.; Park, T.; Heo, L.; Park, C.; Seok, C. GalaxyHomomer: a web server for protein homo-oligomer structure prediction from a monomer sequence or structure. Nucleic Acids Res , , W320–W324, doi:10.1093/nar/gkx246. 6. Heo, L.; Park, H.; Seok, C. GalaxyRefine: Protein structure refinement driven by side-chain repacking. Nucleic Acids Res , , W384-8, doi:10.1093/nar/gkt458. 7. Heo, L.; Shin, W.H.; Lee, M.S.; Seok, C. GalaxySite: ligand-binding-site prediction by using molecular docking. Nucleic Acids Res , , W210-4, doi:10.1093/nar/gku321. 8. Ko, J.; Park, H.; Heo, L.; Seok, C. GalaxyWEB server for protein structure prediction and refinement. Nucleic Acids Res. , , W294–W297, doi:10.1093/nar/gks493. 9. Lee, G.R.; Seok, C. Galaxy7TM: flexible GPCR-ligand docking by structure refinement. Nucleic Acids Res , , W502-6, doi:10.1093/nar/gkw360. 10. Shin, W.H.; Lee, G.R.; Heo, L.; Lee, H.; Seok, C. Prediction of Protein Structure and Interaction by GALAXY Protein Modeling Programs. Bio Des. l Vol.2 l No.1 l March 31, 2014 , , 1–11. 11. Shin, W.H.; Kim, J.K.; Kim, D.S.; Seok, C. GalaxyDock2: protein-ligand docking using beta-complex and global optimization. J Comput Chem , , 2647–2656, doi:10.1002/jcc.23438. 12. Lee, G.R.; Won, J.; Heo, L.; Seok, C. GalaxyRefine2: simultaneous refinement of inaccurate local regions and overall protein structure. Nucleic Acids Res , , W451–W455, doi:10.1093/nar/gkz288. 13. Lee, G.R.; Heo, L.; Seok, C. Effective protein model structure refinement by loop modeling and overall relaxation. Proteins ,
84 Suppl 1 , 293–301, doi:10.1002/prot.24858. 14. Pettersen, E.F.; Goddard, T.D.; Huang, C.C.; Couch, G.S.; Greenblatt, D.M.; Meng, E.C.; Ferrin, T.E. UCSF Chimera--a visualization system for exploratory research and analysis.
J Comput Chem , , 1605–1612, doi:10.1002/jcc.20084. 15. Zhang, L.; Lin, D.; Sun, X.; Curth, U.; Drosten, C.; Sauerhering, L.; Becker, S.; Rox, K.; Hilgenfeld, R. Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors. Science (80-. ). , eabb3405, doi:10.1126/science.abb3405. 16. Barrila, J.; Bacha, U.; Freire, E. Long-range cooperative interactions modulate dimerization in SARS 3CLpro.
Biochemistry , , 14908–14916, doi:10.1021/bi0616302. 17. Barrila, J.; Gabelli, S.B.; Bacha, U.; Amzel, L.M.; Freire, E. Mutation of Asn28 disrupts the dimerization and enzymatic activity of SARS 3CL(pro). Biochemistry , , 4308–4317, doi:10.1021/bi1002585. 8. Hu, T.; Zhang, Y.; Li, L.; Wang, K.; Chen, S.; Chen, J.; Ding, J.; Jiang, H.; Shen, X. Two adjacent mutations on the dimer interface of SARS coronavirus 3C-like protease cause different conformational changes in crystal structure. Virology , , 324–334, doi:https://doi.org/10.1016/j.virol.2009.03.034. 19. Chen, S.; Zhang, J.; Hu, T.; Chen, K.; Jiang, H.; Shen, X. Residues on the Dimer Interface of SARS Coronavirus 3C-like Protease: Dimer Stability Characterization and Enzyme Catalytic Activity Analysis. J. Biochem. , , 525–536, doi:10.1093/jb/mvm246. 20. Macindoe, G.; Mavridis, L.; Venkatraman, V.; Devignes, M.D.; Ritchie, D.W. HexServer: an FFT-based protein docking server powered by graphics processors. Nucleic Acids Res , , W445-9, doi:10.1093/nar/gkq311. 21. Ritchie, D.W. Evaluation of protein docking predictions using Hex 3.1 in CAPRI rounds 1 and 2. Proteins , , 98–106, doi:10.1002/prot.10379. 22. Jin, Z.; Du, X.; Xu, Y.; Deng, Y.; Liu, M.; Zhao, Y.; Zhang, B.; Li, X.; Zhang, L.; Peng, C.; et al. Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors. Nature , , 289–293, doi:10.1038/s41586-020-2223-y. 23. Chang, G.-G. Quaternary Structure of the SARS Coronavirus Main Protease. Mol. Biol. SARS-Coronavirus
Nucleic Acids Res. , , W665–W667. 25. Mark, P.; Nilsson, L. Structure and dynamics of the TIP3P, SPC, and SPC/E water models at 298 K. J. Phys. Chem. A , , 9954–9960. 26. Duan, Y.; Wu, C.; Chowdhury, S.; Lee, M.C.; Xiong, G.; Zhang, W.; Yang, R.; Cieplak, P.; Luo, R.; Lee, T.; et al. A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations. J Comput Chem , , 1999–2012, doi:10.1002/jcc.10349. 27. Sousa da Silva, A.W.; Vranken, W.F. ACPYPE - AnteChamber PYthon Parser interfacE. BMC Res. Notes , , 367, doi:10.1186/1756-0500-5-367. 28. Petratos, K.; Gessmann, R.; Daskalakis, V.; Papadovasilaki, M.; Papanikolau, Y.; Tsigos, I.; Bouriotis, V. Structure and Dynamics of a Thermostable Alcohol Dehydrogenase from the Antarctic Psychrophile Moraxella sp. TAE123. ACS Omega , doi:10.1021/acsomega.0c01210. 29. Berendsen, H.J.C.; Postma, J.P.M.; van Gunsteren, W.F.; DiNola, A.; Haak, J.R. Molecular dynamics with coupling to an external bath.
J. Chem. Phys. , , 3684–3690. 30. Bussi, G.; Donadio, D.; Parrinello, M. Canonical sampling through velocity rescaling. J. Chem. Phys. , , 14101. 31. Berendsen, H.J.C.; van der Spoel, D.; van Drunen, R. GROMACS: A message-passing parallel molecular dynamics implementation. Comput. Phys. Commun. , , 43–56. 32. Parrinello, M.; Rahman, A. Polymorphic transitions in single crystals: A new molecular dynamics method. J. Appl. Phys. , , 7182–7190. 33. Darden, T.; York, D.; Pedersen, L. Particle mesh Ewald: An N ⋅ log (N) method for Ewald sums in large systems. J. Chem. Phys. , , 10089–10092. 4. Yeh, I.-C.; Berkowitz, M.L. Ewald summation for systems with slab geometry. J. Chem. Phys. , , 3155–3162. 35. Hess, B.; Bekker, H.; Berendsen, H.J.C.; Fraaije, J.G.E.M. LINCS: a linear constraint solver for molecular simulations. J. Comput. Chem. , , 1463–1472. 36. Pande, V.S.; Beauchamp, K.; Bowman, G.R. Everything you wanted to know about Markov State Models but were afraid to ask. Methods , , 99–105, doi:https://doi.org/10.1016/j.ymeth.2010.06.002. 37. Prinz, J.-H.; Wu, H.; Sarich, M.; Keller, B.; Senne, M.; Held, M.; Chodera, J.D.; Schütte, C.; Noé, F. Markov models of molecular kinetics: Generation and validation. J. Chem. Phys. , , 174105, doi:10.1063/1.3565032. 38. Chodera, J.D.; Noé, F. Markov state models of biomolecular conformational dynamics. Curr. Opin. Struct. Biol. , , 135–144, doi:https://doi.org/10.1016/j.sbi.2014.04.002. 39. Plattner, N.; Noé, F. Protein conformational plasticity and complex ligand-binding kinetics explored by atomistic simulations and Markov models. Nat. Commun. , , 7653, doi:10.1038/ncomms8653. 40. Plattner, N.; Doerr, S.; De Fabritiis, G.; Noé, F. Complete protein–protein association kinetics in atomic detail revealed by molecular dynamics simulations and Markov modelling. Nat Chem , advance on J. Am. Chem. Soc. , , 1526–1528, doi:10.1021/ja9090353. 42. Durrant, J.D.; Kochanek, S.E.; Casalino, L.; Ieong, P.U.; Dommer, A.C.; Amaro, R.E. Mesoscale All-Atom Influenza Virus Simulations Suggest New Substrate Binding Mechanism. ACS Cent. Sci. , , 189–196, doi:10.1021/acscentsci.9b01071. 43. Scherer, M.K.; Trendelkamp-Schroer, B.; Paul, F.; Pérez-Hernández, G.; Hoffmann, M.; Plattner, N.; Wehmeyer, C.; Prinz, J.-H.; Noé, F. PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J. Chem. Theory Comput. , , 5525–5542, doi:10.1021/acs.jctc.5b00743. 44. Noé, F. Probability distributions of molecular observables computed from Markov models. J. Chem. Phys. , , 244103, doi:10.1063/1.2916718. 45. Trendelkamp-Schroer, B.; Wu, H.; Paul, F.; Noé, F. Estimation and uncertainty of reversible Markov models. J. Chem. Phys. , , 174101, doi:10.1063/1.4934536. 46. Wu, H.; Noé, F. Variational Approach for Learning Markov Processes from Time Series Data. J. Nonlinear Sci. , , 23–66, doi:10.1007/s00332-019-09567-y. 47. Sutto, L.; Marsili, S.; Gervasio, F.L. New advances in metadynamics. Wiley Interdiscip. Rev. Comput. Mol. Sci. , , 771–779. 48. Barducci, A.; Bussi, G.; Parrinello, M. Well-tempered metadynamics: A smoothly converging and tunable free-energy method. Phys. Rev. Lett. , , 20603. 49. Bonomi, M.; Parrinello, M. Enhanced sampling in the well-tempered ensemble. Phys. Rev. Lett. , , 190601. 0. Bussi, G.; Gervasio, F.L.; Laio, A.; Parrinello, M. Free-Energy Landscape for β Hairpin Folding from Combined Parallel Tempering and Metadynamics. J. Am. Chem. Soc. , , 13435–13441, doi:10.1021/ja062463w. 51. Tribello, G.A.; Bonomi, M.; Branduardi, D.; Camilloni, C.; Bussi, G. PLUMED 2: New feathers for an old bird. Comput. Phys. Commun. , , 604–613. 52. Mencherini, T.; Cau, A.; Bianco, G.; Della Loggia, R.; Aquino, R.P.; Autore, G. An extract of Apium graveolens var. dulce leaves: structure of the major constituent, apiin, and its anti-inflammatory properties. J Pharm Pharmacol , , 891–897, doi:10.1211/jpp.59.6.0016. 53. Li, P.; Jia, J.; Zhang, D.; Xie, J.; Xu, X.; Wei, D. In vitro and in vivo antioxidant activities of a flavonoid isolated from celery (Apium graveolens L. var. dulce). Food Funct , , 50–56, doi:10.1039/c3fo60273g. 54. Mikhaeil, B.R.; Badria, F.A.; Maatooq, G.T.; Amer, M.M. Antioxidant and immunomodulatory constituents of henna leaves. Z Naturforsch C J Biosci , , 468–476, doi:10.1515/znc-2004-7-803. 55. Occhiuto, F.; Limardi, F. Comparative effects of the flavonoids luteolin, apiin and rhoifolin on experimental pulmonary hypertension in the dog. Phytother Res 8(3), 153-156 , , 153–156, doi:10.1002/ptr.2650080307. 56. Liu, A.L.; Liu, B.; Qin, H.L.; Lee, S.M.; Wang, Y.T.; Du, G.H. Anti-influenza virus activities of flavonoids from the medicinal plant Elsholtzia rugulosa. Planta Med , , 847–851, doi:10.1055/s-2008-1074558. 57. Occhiuto, F.; Busa, G.; Ragusa, S.; De Pasquale, A. Comparative antiarrhythmic and anti-ischaemic activity of some flavones in the guinea-pig and rat. . Phytother Res , , 9–14. 58. Hattori, S.; Matsuda, H. Rhoifolin, a new flavone glycoside, isolated from the leaves of Rhus succedanea. . Arch Biochem Biophys , , 85–89. 59. Jo, S.; Kim, S.; Shin, D.H.; Kim, M.S. Inhibition of SARS-CoV 3CL protease by flavonoids. J Enzym. Inhib Med Chem , , 145–151, doi:10.1080/14756366.2019.1690480. 60. Eldahshan, O.A.; SS, A. Anti-inflammatory effect of apigenin 7-neohesperidoside (rhoifolin) in carrageenin-induced rat oedema model. . J Appl Pharm Sci , , 74–79. 61. Fang, J.; Cao, Z.; Song, X.; Zhang, X.; Mai, B.; Wen, T.; Lin, J.; Chen, J.; Chi, Y.; Su, T.; et al. Rhoifolin Alleviates Inflammation of Acute Inflammation Animal Models and LPS-Induced RAW264.7 Cells via IKKbeta/NF-kappaB Signaling Pathway. Inflammation , doi:10.1007/s10753-020-01286-x. 62. Rao, Y.K.; Lee, M.J.; Chen, K.; Lee, Y.C.; Wu, W.S.; Tzeng, Y.M. Insulin-Mimetic Action of Rhoifolin and Cosmosiin Isolated from Citrus grandis (L.) Osbeck Leaves: Enhanced Adiponectin Secretion and Insulin Receptor Phosphorylation in 3T3-L1 Cells.
Evid Based Complement Altern. Med , , 624375, doi:10.1093/ecam/nep204. 63. Occhiuto, F.; Circosta, C.; De Pasquale, A.; Briguglio, F. Comparative haemodynamic effects of the flavonoids rhoifolin and vitexin in the dog. . Phytother Res , , 118–120. 64. Daina, A.; Michielin, O.; Zoete, V. SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci Rep , , 42717, doi:10.1038/srep42717. olecule Smiles Canonical SMILES Formula MW atoms heavy atoms Fraction
Csp3 bonds ‐ bond acceptors ‐ bond donors MR TPSAZINC4349204 (Fortunellin) C[C@@H]1[C@@H]([C@@H]([C@HOC[C@@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)OC)[C@@H]([C@@H]([C@@H]1O)O)O C28H32O14 592.55 42 16 0.46 7 14 7 141.80 217.97ZINC4349029 c1cc(ccc1c2cc(=O)c3c(cc(cc3o2)O[C@OC[C@@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@H]([C@@H]1O)O)O[C@C26H28O14 564.49 40 16 0.42 7 14 8 132.56 228.97ZINC4349031 c1cc(ccc1c2cc(=O)c3c(cc(cc3o2)O[C@OC[C@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@@H]([C@@H]1O)O)O[C@C26H28O14 564.49 40 16 0.42 7 14 8 132.56 228.97ZINC4349034 c1cc(ccc1c2cc(=O)c3c(cc(cc3o2)O[C@OC[C@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@H]([C@@H]1O)O)O[C@@C26H28O14 564.49 40 16 0.42 7 14 8 132.56 228.97ZINC3983878 c1cc(ccc1c2cc(c ‐ ‐ ])c2ccc(cc2)O)[C@@H]([C@H]([C@@H]1O)O)O[C@C26H27O14 563.48 40 16 0.42 7 14 7 129.79 231.80ZINC4349207 C[C@H]1[C@@H]([C@@H]([C@H]([OC[C@@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)OC)[C@@H]([C@@H]([C@@H]1O)O)O C28H32O14 592.55 42 16 0.46 7 14 7 141.80 217.97ZINC4349211 C[C@@H]1[C@@H]([C@H]([C@H]([OC[C@@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)OC)[C@@H]([C@@H]([C@@H]1O)O)O C28H32O14 592.55 42 16 0.46 7 14 7 141.80 217.97ZINC4349214 C[C@H]1[C@@H]([C@H]([C@H]([C@OC[C@@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)OC)[C@@H]([C@@H]([C@@H]1O)O)O C28H32O14 592.55 42 16 0.46 7 14 7 141.80 217.97ZINC4349623 C[C@@H]1[C@@H]([C@@H]([C@HOC[C@@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@@H]([C@@H]1O)O)O[CC27H30O14 578.52 41 16 0.44 6 14 8 137.33 228.97ZINC4349627 C[C@H]1[C@@H]([C@@H]([C@H]([OC[C@@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@@H]([C@@H]1O)O)O[CC27H30O14 578.52 41 16 0.44 6 14 8 137.33 228.97ZINC4349630 C[C@@H]1[C@@H]([C@H]([C@H]([OC[C@@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@@H]([C@@H]1O)O)O[CC27H30O14 578.52 41 16 0.44 6 14 8 137.33 228.97ZINC4349633 C[C@H]1[C@@H]([C@H]([C@H]([C@OC[C@@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@@H]([C@@H]1O)O)O[CC27H30O14 578.52 41 16 0.44 6 14 8 137.33 228.97ZINC4534057 C[C@@H]1[C@@H]([C@@H]([C@@OC[C@@H]1O[C@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@@H]([C@@H]1O)O)O[C@C27H30O14 578.52 41 16 0.44 6 14 8 137.33 228.97ZINC4534058 C[C@H]1[C@@H]([C@@H]([C@@HOC[C@@H]1O[C@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@@H]([C@@H]1O)O)O[C@C27H30O14 578.52 41 16 0.44 6 14 8 137.33 228.97ZINC4534059 C[C@@H]1[C@H]([C@@H]([C@@HOC[C@@H]1O[C@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@@H]([C@@H]1O)O)O[C@C27H30O14 578.52 41 16 0.44 6 14 8 137.33 228.97ZINC4534060 C[C@H]1[C@H]([C@@H]([C@@H]([OC[C@@H]1O[C@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@@H]([C@@H]1O)O)O[C@C27H30O14 578.52 41 16 0.44 6 14 8 137.33 228.97ZINC3978800 C[C@H]1[C@@H]([C@H]([C@H]([C@OC[C@H]1O[C@@H](Oc2cc(O)c3c(c2)oc(cc3=O)c2ccc(cc2)O)[C@@H]([C@H]([C@@H]1O)O)O[C@@C27H30O14 578.52 41 16 0.44 6 14 8 137.33 228.97 Fraction
Csp3
The ratio of sp3 hybridized carbons over the total carbon count of the molecule MR Molecular refractivity
TPSA
Topological polar surface area iLOGP
Efficient
Description of n ‐ Octanol/Water
Partition
Coefficient
XLOGP3
XLOGP3 predicts the logP value of a query compound by using the known logP value of a reference compound as a starting point. WLOGP
Lipophilicity factor developed by Wildman and
Crippen
MLOGP
Octanol/water partition coefficient developed by Moriguchi and
Matsushita
Silicos ‐ IT Log P SILICOS ‐ IT is the log Po/w estimation returned by executing the FILTER ‐ IT program Consensus
Log P Consensus log
Po/w value is the arithmetic mean of the five predictive values above ESOL
Log S Solubility, log S (calculated with the ESOL model) ‐ Estimating
Aqueous
Solubility
Directly from
Molecular
Structure
ESOL
Solubility (mg/ml)
Estimating
Aqueous
Solubility
Directly from
Molecular
Structure in mg/ml ESOL
Solubility (mol/l)
Estimating
Aqueous
Solubility
Directly from
Molecular
Structure in mol/l ESOL
Class
Classification of the compound based on estimating Aqueous
Solubility
Directly from
Molecular
Structure
Ali
Log S In silico prediction of aqueous solubility incorporating the effect of topographical polar surface area by Ali et all. Ali
Solubility (mg/ml)
Aqueous
Solubility developed by Ali in mg/ml Ali
Solubility (mol/l)
Aqueous
Solubility developed by Ali in mol/l Ali
Class
Classification of the compound based on estimating Aqueous
Solubility by Ali
Silicos ‐ IT LogSw
Intrinsic water solubility estimated by Wskowwin executing the
FILTER ‐ IT program Silicos ‐ IT Solubility (mg/ml)
Intrinsic water solubility in mg/ml Silicos ‐ IT Solubility (mol/l)
Intrinsic water solubility in mol/l Silicos ‐ IT class Classification of the compound based on the water solubility executing the FILTER ‐ IT program log Kp (cm/s) Value of the skin permeability coefficient SwissADME.xlsx
LOGP XLOGP3 WLOGP MLOGP Silicos ‐ IT Log
P Consensus
Log
P ESOL
Log
S ESOL
Solubility (mg/ml) ESOL
Solubility (mol/l) ESOL
Class Ali
Log
S Ali
Solubility (mg/ml) Ali
Solubility (mol/l) Ali
Class Silicos ‐ IT LogSw Silicos ‐ IT Solubility (mg/ml) Silicos ‐ IT Solubility (mol/l) Silicos ‐ IT class ‐ ‐ ‐ ‐ ‐
02 1.01e ‐
04 Soluble ‐ ‐
03 6.05e ‐
06 Moderately soluble ‐ ‐
03 Soluble1.85 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 1.13e ‐
03 Soluble ‐ ‐
02 1.03e ‐
04 Soluble ‐ ‐
02 Soluble2.11 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 1.13e ‐
03 Soluble ‐ ‐
02 1.03e ‐
04 Soluble ‐ ‐
02 Soluble2.16 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 1.13e ‐
03 Soluble ‐ ‐
02 1.03e ‐
04 Soluble ‐ ‐
02 Soluble2.92 ‐ ‐ ‐ ‐ ‐ ‐ ‐
03 Soluble ‐ ‐
01 4.57e ‐
04 Soluble ‐ ‐
02 Soluble3.09 1.05 ‐ ‐ ‐ ‐ ‐ ‐
02 1.01e ‐
04 Soluble ‐ ‐
03 6.05e ‐
06 Moderately soluble ‐ ‐
03 Soluble2.94 1.05 ‐ ‐ ‐ ‐ ‐ ‐
02 1.01e ‐
04 Soluble ‐ ‐
03 6.05e ‐
06 Moderately soluble ‐ ‐
03 Soluble2.74 1.05 ‐ ‐ ‐ ‐ ‐ ‐
02 1.01e ‐
04 Soluble ‐ ‐
03 6.05e ‐
06 Moderately soluble ‐ ‐
03 Soluble2.45 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 6.04e ‐
04 Soluble ‐ ‐
02 6.40e ‐
05 Moderately soluble ‐ ‐
02 Soluble2.03 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 6.04e ‐
04 Soluble ‐ ‐
02 6.40e ‐
05 Moderately soluble ‐ ‐
02 Soluble2.61 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 6.04e ‐
04 Soluble ‐ ‐
02 6.40e ‐
05 Moderately soluble ‐ ‐
02 Soluble2.60 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 6.04e ‐
04 Soluble ‐ ‐
02 6.40e ‐
05 Moderately soluble ‐ ‐
02 Soluble2.45 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 6.04e ‐
04 Soluble ‐ ‐
02 6.40e ‐
05 Moderately soluble ‐ ‐
02 Soluble2.63 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 6.04e ‐
04 Soluble ‐ ‐
02 6.40e ‐
05 Moderately soluble ‐ ‐
02 Soluble2.57 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 6.04e ‐
04 Soluble ‐ ‐
02 6.40e ‐
05 Moderately soluble ‐ ‐
02 Soluble2.20 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 6.04e ‐
04 Soluble ‐ ‐
02 6.40e ‐
05 Moderately soluble ‐ ‐
02 Soluble1.31 ‐ ‐ ‐ ‐ ‐ ‐ ‐
01 6.04e ‐
04 Soluble ‐ ‐
02 6.40e ‐
05 Moderately soluble ‐ ‐
02 Soluble I absorption BBB permeant Pgp substrate CYP1A2 inhibitor CYP2C19 inhibitor CYP2C9 inhibitor CYP2D6 inhibitor CYP3A4 inhibitor log Kp (cm/s) Lipinski Score PAINS