Faruck Morcos
University of Texas at Dallas
Publication
Featured research published by Faruck Morcos.
Proceedings of the National Academy of Sciences of the United States of America | 2016
Fang Bai; Faruck Morcos; Ryan R. Cheng; Hualiang Jiang; José N. Onuchic
Significance: Protein–protein interfaces have become an emerging class of molecular targets for the design of therapeutic drugs. However, major challenges exist for the correct identification of binding sites on the protein surface as well as drug-like modulators of protein–protein interaction. An integrated approach using molecular fragment docking and coevolutionary analysis is presented to face these challenges. This approach can accurately predict and characterize the binding sites for protein–protein interactions as well as provide clusters of bound, fragment-sized molecules on the druggable regions of the predicted binding site. These bound, molecular fragments can be chemically combined to create candidate drugs.
Protein–protein interactions play a central role in cellular function. Improving the understanding of complex formation has many practical applications, including the rational design of new therapeutic agents and the mechanisms governing signal transduction networks. The generally large, flat, and relatively featureless binding sites of protein complexes pose many challenges for drug design. Fragment docking and direct coupling analysis are used in an integrated computational method to estimate druggable protein–protein interfaces. (i) This method explores the binding of fragment-sized molecular probes on the protein surface using a molecular docking-based screen. (ii) The energetically favorable binding sites of the probes, called hot spots, are spatially clustered to map out candidate binding sites on the protein surface. (iii) A coevolution-based interface interaction score is used to discriminate between different candidate binding sites, yielding potential interfacial targets for therapeutic drug design. This approach is validated for important, well-studied disease-related proteins with known pharmaceutical targets, and also identifies targets that have yet to be studied. Moreover, therapeutic agents are proposed by chemically connecting the fragments that are strongly bound to the hot spots.
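Step (ii) above, grouping energetically favorable probe poses into candidate sites, can be illustrated with a minimal sketch. It assumes single-linkage clustering with a fixed distance cutoff; the function name, the cutoff values, and the input layout are hypothetical and are not taken from the paper, which may use a different clustering scheme.

```python
import math

def cluster_hot_spots(coords, energies, energy_cutoff=-5.0, distance_cutoff=4.0):
    """Group favorable probe poses into candidate sites (single linkage).

    coords: list of (x, y, z) probe positions; energies: docking scores.
    Both cutoffs are illustrative placeholders, not published values.
    """
    # Keep only probes whose score is at or below the energy cutoff.
    keep = [i for i, e in enumerate(energies) if e <= energy_cutoff]
    parent = {i: i for i in keep}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge any two retained probes that lie within the distance cutoff.
    for pos, a in enumerate(keep):
        for b in keep[pos + 1:]:
            if math.dist(coords[a], coords[b]) <= distance_cutoff:
                parent[find(a)] = find(b)

    clusters = {}
    for i in keep:
        clusters.setdefault(find(i), []).append(i)
    # Largest clusters first: these are the candidate binding sites.
    return sorted(clusters.values(), key=len, reverse=True)
```

Each returned list holds the indices of probes belonging to one candidate site; ranking those sites with a coevolution-based score, as in step (iii), is a separate calculation not shown here.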
Structure | 2016
Alessandro Pandini; Faruck Morcos; Shahid Khan
Summary: Switching of flagellar motor rotation sense dictates bacterial chemotaxis. Multi-subunit FliM-FliG rotor rings couple signal protein binding in FliM with reversal of a distant FliG C-terminal (FliGC) helix involved in stator contacts. Subunit dynamics were examined in conformer ensembles generated by molecular simulations from the X-ray structures. Principal component analysis extracted collective motions. Interfacial loop immobilization by complex formation coupled elastic fluctuations of the FliM middle (FliMM) and FliG middle (FliGM) domains. Coevolved mutations captured interfacial dynamics as well as contacts. FliGM rotation was amplified via two central hinges to the FliGC helix. Intrinsic flexibility, reported by the FliGMC ensembles, reconciled conformers with opposite FliGC helix orientations. FliG domain stacking deformed the inter-domain linker and reduced flexibility, but conformational changes were not triggered by engineered linker deletions that cause a rotation-locked phenotype. These facts suggest that binary rotation states arise from conformational selection by stacking interactions.
Molecular Biology and Evolution | 2016
Ryan R. Cheng; Olle Nordesjö; Ryan L. Hayes; Herbert Levine; Samuel Coulbourn Flores; José N. Onuchic; Faruck Morcos
Two-component signaling (TCS) is the primary means by which bacteria sense and respond to the environment. TCS involves two partner proteins working in tandem, which interact to perform cellular functions while limiting interactions with non-partners (i.e., cross-talk). We construct a Potts model for TCS that can quantitatively predict how mutating amino acid identities affect the interaction between TCS partners and non-partners. The parameters of this model are inferred directly from protein sequence data. This approach drastically reduces the computational complexity of exploring the sequence-space of TCS proteins. As a stringent test, we compare its predictions to a recent comprehensive mutational study, which characterized the functionality of 204 mutational variants of the PhoQ kinase in Escherichia coli. We find that our best predictions accurately reproduce the amino acid combinations found in experiment, which enable functional signaling with its partner PhoP. These predictions demonstrate the evolutionary pressure to preserve the interaction between TCS partners as well as prevent unwanted cross-talk. Further, we calculate the mutational change in the binding affinity between PhoQ and PhoP, providing an estimate of the amount of destabilization needed to disrupt TCS.
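To make the idea of scoring mutations with a Potts model concrete, here is a minimal sketch of a generic Potts energy and the change produced by a single substitution. It assumes the common convention H(s) = -Σ h_i(s_i) - Σ_{i<j} J_ij(s_i, s_j); the data layout (`h` as nested lists, `J` as a dict keyed by position pairs) and both function names are hypothetical, and the interaction-specific score used in the paper may be defined differently.

```python
def potts_energy(seq, h, J):
    """Generic Potts energy of an integer-encoded sequence.

    seq: list of residue states (ints); h[i][a]: local field at site i;
    J[(i, j)][a][b]: coupling between sites i < j. Lower is more favorable.
    """
    n = len(seq)
    energy = 0.0
    for i in range(n):
        energy -= h[i][seq[i]]
        for j in range(i + 1, n):
            energy -= J[(i, j)][seq[i]][seq[j]]
    return energy

def mutation_delta(seq, position, new_state, h, J):
    """Energy of the single-site mutant minus energy of the original."""
    mutant = list(seq)
    mutant[position] = new_state
    return potts_energy(mutant, h, J) - potts_energy(seq, h, J)
```

Under this sign convention a negative `mutation_delta` means the model rates the mutant as more favorable than the original sequence. In practice `h` and `J` would come from an inference method run on a sequence alignment, which is outside the scope of this sketch.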
Journal of Bacteriology | 2016
Joseph S. Boyd; Ryan R. Cheng; Mark L. Paddock; Cigdem Sancar; Faruck Morcos; Susan S. Golden
Two-component systems (TCS) that employ histidine kinases (HK) and response regulators (RR) are critical mediators of cellular signaling in bacteria. In the model cyanobacterium Synechococcus elongatus PCC 7942, TCSs control global rhythms of transcription that reflect an integration of time information from the circadian clock with a variety of cellular and environmental inputs. The HK CikA and the SasA/RpaA TCS transduce time information from the circadian oscillator to modulate downstream cellular processes. Despite immense progress in understanding the circadian clock itself, many of the connections between the clock and other cellular signaling systems have remained enigmatic. To narrow the search for additional TCS components that connect to the clock, we utilized direct-coupling analysis (DCA), a statistical analysis of covariant residues among related amino acid sequences, to infer coevolution of new and known clock TCS components. DCA revealed a high degree of interaction specificity between SasA and CikA with RpaA, as expected, but also with the phosphate-responsive response regulator SphR. Coevolutionary analysis also predicted strong specificity between RpaA and a previously undescribed kinase, HK0480 (herein CikB). A knockout of the gene for CikB (cikB) in a sasA cikA null background eliminated the RpaA phosphorylation and RpaA-controlled transcription that is otherwise present in that background and suppressed cell elongation, supporting the notion that CikB is an interactor with RpaA and the clock network. This study demonstrates the power of DCA to identify subnetworks and key interactions in signaling pathways and of combinatorial mutagenesis to explore the phenotypic consequences. Such a combined strategy is broadly applicable to other prokaryotic systems.
Importance: Signaling networks are complex and extensive, comprising multiple integrated pathways that respond to cellular and environmental cues. A TCS interaction model, based on DCA, independently confirmed known interactions and revealed a core set of subnetworks within the larger HK-RR set. We validated high-scoring candidate proteins via combinatorial genetics, demonstrating that DCA can be utilized to reduce the search space of complex protein networks and to infer undiscovered specific interactions for signaling proteins in vivo. Significantly, new interactions that link circadian response to cell division and fitness in a light/dark cycle were uncovered. The combined analysis also uncovered a more basic core clock, illustrating the synergy and applicability of a combined computational and genetic approach for investigating prokaryotic signaling networks.
F1000Research | 2016
Jeffrey K. Noel; Faruck Morcos; José N. Onuchic
Experimentally derived structural constraints have been crucial to the implementation of computational models of biomolecular dynamics. For example, not only does crystallography provide essential starting points for molecular simulations but also high-resolution structures permit parameterization of simplified models. Since the energy landscapes for proteins and other biomolecules have been shown to be minimally frustrated and therefore funneled, these structure-based models have played a major role in understanding the mechanisms governing folding and many functions of these systems. Structural information, however, may be limited in many interesting cases. Recently, the statistical analysis of residue co-evolution in families of protein sequences has provided a complementary method of discovering residue-residue contact interactions involved in functional configurations. These functional configurations are often transient and difficult to capture experimentally. Thus, co-evolutionary information can be merged with that available for experimentally characterized low free-energy structures, in order to more fully capture the true underlying biomolecular energy landscape.
Scientific Reports | 2017
Xian Li Jiang; Emmanuel Martinez-Ledesma; Faruck Morcos
The connection between genetic variation and drug response has long been explored to facilitate the optimization and personalization of cancer therapy. Crucial to the identification of drug-response-related genetic features is the ability to separate indirect correlations from direct correlations across abundant datasets with large numbers of variables. Here we analyzed proteomic and pharmacogenomic data in cancer tissues and cell lines using a global statistical model connecting protein pairs, genes and anti-cancer drugs. We estimated this model using direct coupling analysis (DCA), a powerful statistical inference method that has been successfully applied to protein sequence data to extract evolutionary signals that provide insights on protein structure, folding and interactions. We used Direct Information (DI) as a metric of connectivity between proteins as well as gene-drug pairs. We were able to infer important interactions observed in cancer-related pathways from proteomic data and predict potential connectivities in cancer networks. We also identified known and potential connections for anti-cancer drugs and gene mutations using DI in pharmacogenomic data. Our findings suggest that gene-drug connections predicted with direct couplings can be used as a reliable guide to cancer therapy and expand our understanding of the effects of gene alterations on drug efficacies.
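The Direct Information metric mentioned above can be sketched in a few lines. In DCA, DI for a pair of variables is a mutual-information-like sum over a two-variable "direct" distribution whose marginals match the observed single-variable frequencies. The sketch below assumes that direct pair distribution has already been inferred and is passed in as a table; the function name and input layout are hypothetical, and the inference step itself is not shown.

```python
import math

def direct_information(p_dir):
    """DI = sum_ab P(a, b) * ln(P(a, b) / (f_i(a) * f_j(b))).

    p_dir: nested list holding the direct pair distribution for one pair
    of variables (rows: states of i, columns: states of j). It is
    normalized here, and the marginals f_i, f_j are read off from it.
    """
    total = sum(sum(row) for row in p_dir)
    p = [[x / total for x in row] for row in p_dir]
    f_i = [sum(row) for row in p]          # marginal over columns
    f_j = [sum(col) for col in zip(*p)]    # marginal over rows
    di = 0.0
    for a, row in enumerate(p):
        for b, p_ab in enumerate(row):
            if p_ab > 0.0:                 # 0 * ln(0) is taken as 0
                di += p_ab * math.log(p_ab / (f_i[a] * f_j[b]))
    return di
```

A pair whose direct distribution factorizes into its marginals scores zero, while two perfectly coupled binary variables score ln 2, so larger values indicate stronger direct coupling.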
Royal Society Open Science | 2018
Ricardo Nascimento dos Santos; Shahid Khan; Faruck Morcos
Bacterial flagellar motility, an important virulence factor, is energized by a rotary motor localized within the flagellar basal body. The rotor module consists of a large framework (the C-ring), composed of the FliG, FliM and FliN proteins. FliN and FliM contact the FliG torque ring to control the direction of flagellar rotation. We report that structure-based models constrained only by residue coevolution can recover the binding interface of atomic X-ray dimer complexes with remarkable accuracy (approx. 1 Å RMSD). We propose a model for FliM–FliN heterodimerization, which agrees accurately with homologous interfaces as well as in situ cross-linking experiments, and hence supports a proposed architecture for the lower portion of the C-ring. Furthermore, this approach allowed the identification of two discrete and interchangeable homodimerization interfaces between FliM middle domains that agree with experimental measurements and might be associated with C-ring directional switching dynamics triggered upon binding of CheY signal protein. Our findings provide structural details of complex formation at the C-ring that have been difficult to obtain with previous methodologies and clarify the architectural principle that underpins the ultra-sensitive allostery exhibited by this ring assembly that controls the clockwise or counterclockwise rotation of flagella.
Bioinformatics | 2018
Ricardo Nascimento dos Santos; Állan Jhonathan Ramos Ferrari; Hugo Cesar Ramos de Jesus; Fabio C. Gozzo; Faruck Morcos; Leandro Martínez
Motivation: Elucidation of protein native states from amino acid sequences is a primary computational challenge. Modern computational and experimental methodologies, such as molecular coevolution and chemical cross-linking mass spectrometry, have allowed structural characterization of previously intractable systems. Despite several independent successful examples, data from these distinct methodologies have not been systematically studied in conjunction. One challenge of structural inference using coevolution is that it is limited to sequence fragments within a conserved and unique domain for which sufficient sequence datasets are available. Therefore, coupling coevolutionary data with complementary distance constraints from orthogonal sources can provide additional precision to structure prediction methodologies.
Results: In this work, we present a methodology to combine residue interaction data obtained from coevolutionary information and cross-linking/mass spectrometry distance constraints in order to identify functional states of proteins. Using a combination of structure-based models (SBMs) with optimized Gaussian-like potentials, secondary structure estimation and simulated annealing molecular dynamics, we provide an automated methodology to integrate constraint data from diverse sources in order to elucidate the native conformation of full protein systems with distinct complexity and structural topologies. We show that cross-linking mass spectrometry constraints improve the structure predictions obtained from SBMs and coevolution signals, and that the constraints obtained by each method have a useful degree of complementarity that promotes enhanced fold estimates.
Availability and implementation: Scripts and procedures to implement the methodology presented herein are available at https://github.com/mcubeg/DCAXL.
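As a rough illustration of how distance constraints from two sources could enter one energy term, the sketch below sums a simplified attractive Gaussian well over a merged restraint list. This is an assumption-laden toy: the functional form, the parameter values, and the names `gaussian_well` and `restraint_energy` are hypothetical, and the optimized Gaussian-like potentials used in the paper and in the DCAXL scripts may differ substantially.

```python
import math

def gaussian_well(r, r0, sigma=1.0, depth=1.0):
    """Simplified attractive Gaussian well centered at distance r0."""
    return -depth * math.exp(-((r - r0) ** 2) / (2.0 * sigma ** 2))

def restraint_energy(coords, restraints):
    """Sum Gaussian wells over restraints given as (i, j, r0, sigma).

    coords: list of (x, y, z) positions, e.g. one bead per residue.
    Restraints from different sources are simply concatenated; a wider
    sigma can encode a looser constraint such as a cross-link.
    """
    return sum(
        gaussian_well(math.dist(coords[i], coords[j]), r0, sigma)
        for i, j, r0, sigma in restraints
    )

# Hypothetical example: one coevolution-derived contact, one cross-link.
coevolution_pairs = [(0, 1, 8.0, 1.0)]
cross_links = [(0, 2, 20.0, 5.0)]
beads = [(0.0, 0.0, 0.0), (8.0, 0.0, 0.0), (0.0, 20.0, 0.0)]
energy = restraint_energy(beads, coevolution_pairs + cross_links)
```

In this toy configuration both restraints are exactly satisfied, so each well contributes its full depth to the total.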
bioRxiv | 2017
Ryan R. Cheng; Ellinor Haglund; Nicholas Tiee; Faruck Morcos; Herbert Levine; Joseph A. Adams; Patricia A. Jennings; José N. Onuchic
The selection of mutations that encode new interactions between bacterial two-component signaling (TCS) proteins remains a significant challenge. Recent work constructed a coevolutionary landscape where mutations can readily be selected to maintain signal transfer interactions between partner TCS proteins without introducing unwanted crosstalk. A bigger challenge is to select mutations for a TCS protein from the landscape to enhance, suppress, or have a neutral effect on its basal signal transfer with a non-partner. This study focuses on the computational selection of 12 single-point mutations to a response regulator from Bacillus subtilis and their effect on phosphotransfer with a histidine kinase from Escherichia coli. These mutations are experimentally expressed to directly test the theoretical predictions, of which seven mutants successfully perturb phosphotransfer in the predicted manner. Furthermore, differential scanning calorimetry is used to monitor any protein stability effects caused by the mutations, which could be detrimental to proper protein function.
Archive | 2019
Ricardo Nascimento dos Santos; Xianli Jiang; Leandro Martínez; Faruck Morcos
The analysis of coevolutionary signals from families of evolutionarily related sequences is a recent conceptual framework that provides valuable information about unique intramolecular interactions and, therefore, can assist in the elucidation of biomolecular conformations. It is based on the idea that compensatory mutations at specific residue positions in a sequence help preserve stability of protein architecture and function and leave a statistical signature related to residue-residue interactions in the 3D structure of the protein. Consequently, statistical analysis of these correlated mutations in subsets of protein sequence alignments can be used to predict which residue pairs should be in spatial proximity in the native functional protein fold. These predicted signals can be then used to guide molecular dynamics (MD) simulations to predict the three-dimensional coordinates of a functional amino acid chain. In this chapter, we introduce a general and efficient methodology to perform coevolutionary analysis on protein sequences and to use this information in combination with computational physical models to predict the native 3D conformation of functional polypeptides. We present a step-by-step methodology that includes the description and application of software tools and databases required to infer tertiary structures of a protein fold. The general pipeline includes instructions on (1) how to obtain direct amino acid couplings from protein sequences using direct coupling analysis (DCA), (2) how to incorporate such signals as interaction potentials in Cα structure-based models (SBMs) to drive protein-folding MD simulations, (3) a procedure to estimate secondary structure and how to include such estimates in the topology files required in the MD simulations, and (4) how to build full atomic models based on the top Cα candidates selected in the pipeline. 
The information presented in this chapter is self-contained and sufficient to allow a computational scientist to predict structures of proteins using publicly available algorithms and databases.
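The hand-off between steps (1) and (2) of the pipeline described above, turning ranked coupling scores into a contact list for a structure-based model, might look like the following sketch. The function name, the input layout (a dict mapping residue-index pairs to scores), and the default sequence-separation filter are illustrative assumptions rather than the chapter's actual procedure.

```python
def select_top_contacts(scores, n_top, min_separation=4):
    """Pick the highest-scoring residue pairs as predicted contacts.

    scores: dict mapping (i, j) residue-index pairs to a coupling score.
    Pairs closer than min_separation along the chain are skipped, since
    they are trivially near each other in any fold.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    filtered = [
        (i, j, score)
        for (i, j), score in ranked
        if abs(i - j) >= min_separation
    ]
    return filtered[:n_top]

# Hypothetical scores for five residue pairs.
example_scores = {
    (1, 2): 0.9, (1, 10): 0.8, (5, 6): 0.7, (2, 12): 0.6, (3, 30): 0.5,
}
contacts = select_top_contacts(example_scores, n_top=2)
```

Each selected pair could then be written out as a contact restraint in whatever topology format the chosen simulation package expects.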