Osnat Penn | Researchain

Publication

Featured research published by Osnat Penn.

Nucleic Acids Research | 2010

GUIDANCE: a web server for assessing alignment confidence scores

Osnat Penn; Eyal Privman; Haim Ashkenazy; Giddy Landan; Dan Graur; Tal Pupko

Evaluating the accuracy of multiple sequence alignment (MSA) is critical for virtually every comparative sequence analysis that uses an MSA as input. Here we present the GUIDANCE web-server, a user-friendly, open access tool for the identification of unreliable alignment regions. The web-server accepts as input a set of unaligned sequences. The server aligns the sequences and provides a simple graphic visualization of the confidence score of each column, residue and sequence of an alignment, using a color-coding scheme. The method is generic and the user is allowed to choose the alignment algorithm (ClustalW, MAFFT and PRANK are supported) as well as any type of molecular sequences (nucleotide, protein or codon sequences). The server implements two different algorithms for evaluating confidence scores: (i) the heads-or-tails (HoT) method, which measures alignment uncertainty due to co-optimal solutions; (ii) the GUIDANCE method, which measures the robustness of the alignment to guide-tree uncertainty. The server projects the confidence scores onto the MSA and points to columns and sequences that are unreliably aligned. These can be automatically removed in preparation for downstream analyses. GUIDANCE is freely available for use at http://guidance.tau.ac.il.
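The column-removal step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the per-column confidence scores have already been computed and are supplied as a plain list; the function name `filter_columns`, the toy alignment, and the cutoff of 0.9 are hypothetical choices for illustration and are not taken from the GUIDANCE server itself.

```python
def filter_columns(alignment, column_scores, cutoff):
    """Drop alignment columns whose confidence score is below `cutoff`.

    `alignment` maps sequence names to aligned strings of equal length;
    `column_scores` holds one score per alignment column (assumed given).
    """
    if any(len(seq) != len(column_scores) for seq in alignment.values()):
        raise ValueError("every aligned sequence needs one score per column")
    # Indices of the columns considered reliably aligned.
    keep = [i for i, score in enumerate(column_scores) if score >= cutoff]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy data: the two middle columns receive low (hypothetical) scores.
alignment = {"seq1": "ACG-T", "seq2": "AC-GT", "seq3": "ACGGT"}
column_scores = [1.0, 0.98, 0.41, 0.55, 0.97]
filtered = filter_columns(alignment, column_scores, cutoff=0.9)
print(filtered)
```

The sketch filters whole columns only; removing unreliable sequences or masking single residues would follow the same pattern with per-sequence or per-residue scores.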

Nucleic Acids Research | 2012

FastML: a web server for probabilistic reconstruction of ancestral sequences

Haim Ashkenazy; Osnat Penn; Adi Doron-Faigenboim; Ofir Cohen; Gina M. Cannarozzi; Oren Zomer; Tal Pupko

Ancestral sequence reconstruction is essential to a variety of evolutionary studies. Here, we present the FastML web server, a user-friendly tool for the reconstruction of ancestral sequences. FastML implements various novel features that differentiate it from existing tools: (i) FastML uses an indel-coding method, in which each gap, possibly spanning multiple sites, is coded as binary data. FastML then reconstructs ancestral indel states assuming a continuous time Markov process. FastML provides the most likely ancestral sequences, integrating both indels and characters; (ii) FastML accounts for uncertainty in ancestral states: it provides not only the posterior probabilities for each character and indel at each sequence position, but also a sample of ancestral sequences from this posterior distribution, and a list of the k-most likely ancestral sequences; (iii) FastML implements a large array of evolutionary models, which makes it generic and applicable for nucleotide, protein and codon sequences; and (iv) a graphical representation of the results is provided, including, for example, a graphical logo of the inferred ancestral sequences. The utility of FastML is demonstrated by reconstructing ancestral sequences of the Env protein from various HIV-1 subtypes. FastML is freely available for all academic users and is available online at http://fastml.tau.ac.il/.
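The idea of coding each gap as a binary character can be sketched as follows. This is a simplified illustration, not FastML's actual implementation: it assumes an indel is identified by its exact start and end columns, and that a sequence scores 1 only when it carries that exact gap. The helper names `gap_intervals` and `code_indels` are hypothetical.

```python
import re


def gap_intervals(seq):
    """Return (start, end) column intervals of each run of '-' in `seq`."""
    return [(m.start(), m.end()) for m in re.finditer(r"-+", seq)]


def code_indels(alignment):
    """Code every distinct gap interval as one presence/absence character."""
    per_seq = {name: set(gap_intervals(seq)) for name, seq in alignment.items()}
    # One binary character per distinct gap, ordered by position.
    indels = sorted(set().union(*per_seq.values()))
    matrix = {
        name: [1 if interval in gaps else 0 for interval in indels]
        for name, gaps in per_seq.items()
    }
    return indels, matrix


# Toy alignment: a three-column gap shared by "a" and "b", a one-column gap in "c".
alignment = {"a": "AC---GT", "b": "AC---GT", "c": "ACGTAG-", "d": "ACGTAGT"}
indels, matrix = code_indels(alignment)
print(indels)
print(matrix)
```

A binary matrix of this kind is the sort of input on which a two-state continuous-time Markov model could be fitted; how overlapping gaps of different lengths are treated is a design decision this sketch deliberately leaves out.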

Molecular Biology and Evolution | 2012

Improving the Performance of Positive Selection Inference by Filtering Unreliable Alignment Regions

Eyal Privman; Osnat Penn; Tal Pupko

Errors in the inferred multiple sequence alignment may lead to false prediction of positive selection. Recently, methods for detecting unreliable alignment regions were developed and were shown to accurately identify incorrectly aligned regions. While removing unreliable alignment regions is expected to increase the accuracy of positive selection inference, such filtering may also significantly decrease the power of the test, as positively selected regions are fast evolving, and those same regions are often those that are difficult to align. Here, we used realistic simulations that mimic sequence evolution of HIV-1 genes to test the hypothesis that the performance of positive selection inference using codon models can be improved by removing unreliable alignment regions. Our study shows that the benefit of removing unreliable regions exceeds the loss of power due to the removal of some of the true positively selected sites.

Proteins | 2007

Stepwise Prediction of Conformational Discontinuous B-Cell Epitopes Using the Mapitope Algorithm

Erez M. Bublil; Natalia T. Freund; Itay Mayrose; Osnat Penn; Anna Roitburd-Berman; Nimrod D. Rubinstein; Tal Pupko; Jonathan M. Gershoni

Mapping the epitope of an antibody is of great interest, since it contributes much to our understanding of the mechanisms of molecular recognition and provides the basis for rational vaccine design. Here we present Mapitope, a computer algorithm for epitope mapping. The algorithm input is a set of affinity-isolated peptides obtained by screening phage display peptide‐libraries with the antibody of interest. The output is usually 1–3 epitope candidates on the surface of the atomic structure of the antigen. We have systematically tested the performance of Mapitope by assessing the effect of the algorithm parameters on the final prediction. Thus, we have examined the effect of the statistical threshold (ST) parameter, relating to the frequency distribution and enrichment of amino acid pairs from the isolated peptides, and the D (distance) and E (exposure) parameters, which relate to the physical parameters of the antigen. Two model systems were analyzed in which the antibody of interest had previously been co‐crystallized with the antigen and thus the epitope is a given. The Mapitope algorithm successfully predicted the epitopes in both models. Accordingly, we formulated a stepwise paradigm for the prediction of discontinuous conformational epitopes using peptides obtained from screening phage display libraries. We applied this paradigm to successfully predict the epitope of the Trastuzumab antibody on the surface of the Her‐2/neu receptor in a third model system.

Bioinformatics | 2007

Pepitope: Epitope mapping from affinity-selected peptides

Itay Mayrose; Osnat Penn; Elana Erez; Nimrod D. Rubinstein; Tomer Shlomi; Natalia T. Freund; Erez M. Bublil; Eytan Ruppin; Roded Sharan; Jonathan M. Gershoni; Eric Martz; Tal Pupko

Identifying the epitope to which an antibody binds is central for many immunological applications such as drug design and vaccine development. The Pepitope server is a web-based tool that aims at predicting discontinuous epitopes based on a set of peptides that were affinity-selected against a monoclonal antibody of interest. The server implements three different algorithms for epitope mapping: PepSurf, Mapitope, and a combination of the two. The rationale behind these algorithms is that the set of peptides mimics the genuine epitope in terms of physicochemical properties and spatial organization. When the three-dimensional (3D) structure of the antigen is known, the information in these peptides can be used to computationally infer the corresponding epitope. A user-friendly web interface and a graphical tool that allows viewing the predicted epitopes were developed. Pepitope can also be applied for inferring other types of protein–protein interactions beyond the immunological context, and as a general tool for aligning linear sequences to a 3D structure. Availability: http://pepitope.tau.ac.il/

PLOS ONE | 2012

Deep Panning: Steps towards Probing the IgOme

Arie Ryvkin; Haim Ashkenazy; Larisa Smelyanski; Gilad Kaplan; Osnat Penn; Yael Weiss-Ottolenghi; Eyal Privman; Peter B. Ngam; James E. Woodward; Gregory D. May; Callum J. Bell; Tal Pupko; Jonathan M. Gershoni

Background: Polyclonal serum consists of vast collections of antibodies, products of differentiated B-cells. The spectrum of antibody specificities is dynamic and varies with age, physiology, and exposure to pathological insults. The complete repertoire of antibody specificities in blood, the IgOme, is therefore an extraordinarily rich source of information: a molecular record of previous encounters as well as a status report of current immune activity. The ability to profile antibody specificities of polyclonal serum at exceptionally high resolution has been an important and serious challenge which can now be overcome.

Methodology/Principal Findings: Here we illustrate the application of Deep Panning, a method that combines the flexibility of combinatorial phage display of random peptides with the power of high-throughput deep sequencing. Deep Panning is first applied to evaluate the quality and diversity of naïve random peptide libraries. The production of very large data sets, hundreds of thousands of peptides, has revealed unexpected properties of combinatorial random peptide libraries and indicates correctives to ensure the quality of the libraries generated. Next, Deep Panning is used to analyze a model monoclonal antibody in addition to allowing one to follow the dynamics of biopanning and peptide selection. Finally, Deep Panning is applied to profile polyclonal sera derived from HIV-infected individuals.

Conclusions/Significance: The ability to generate and characterize hundreds of thousands of affinity-selected peptides creates an effective means towards the interrogation of the IgOme and understanding of the humoral response to disease. Deep Panning should open the door to new possibilities for serological diagnostics, vaccine design and the discovery of the correlates of immunity to emerging infectious agents.

PLOS Computational Biology | 2008

Evolutionary Modeling of Rate Shifts Reveals Specificity Determinants in HIV-1 Subtypes

Osnat Penn; Adi Stern; Nimrod D. Rubinstein; Julien Y. Dutheil; Eran Bacharach; Nicolas Galtier; Tal Pupko

A hallmark of the human immunodeficiency virus 1 (HIV-1) is its rapid rate of evolution within and among its various subtypes. Two complementary hypotheses are suggested to explain the sequence variability among HIV-1 subtypes. The first suggests that the functional constraints at each site remain the same across all subtypes, and the differences among subtypes are a direct reflection of random substitutions, which have occurred during the time elapsed since their divergence. The alternative hypothesis suggests that the functional constraints themselves have evolved, and thus sequence differences among subtypes in some sites reflect shifts in function. To determine the contribution of each of these two alternatives to HIV-1 subtype evolution, we have developed a novel Bayesian method for testing and detecting site-specific rate shifts. The RAte Shift EstimatoR (RASER) method determines whether or not site-specific functional shifts characterize the evolution of a protein and, if so, points to the specific sites and lineages in which these shifts have most likely occurred. Applying RASER to a dataset composed of large samples of HIV-1 sequences from different group M subtypes, we reveal rampant evolutionary shifts throughout the HIV-1 proteome. Most of these rate shifts have occurred during the divergence of the major subtypes, establishing that subtype divergence occurred together with functional diversification. We report further evidence for the emergence of a new sub-subtype, characterized by abundant rate-shifting sites. When focusing on the rate-shifting sites detected, we find that many are associated with known function relating to viral life cycle and drug resistance. Finally, we discuss mechanisms of covariation of rate-shifting sites.

Systematic Biology | 2010

An Evolutionary Analysis of Lateral Gene Transfer in Thymidylate Synthase Enzymes

Adi Stern; Itay Mayrose; Osnat Penn; Shaul Shaul; Uri Gophna; Tal Pupko

Thymidylate synthases (Thy) are key enzymes in the synthesis of deoxythymidylate, 1 of the 4 building blocks of DNA. As such, they are essential for all DNA-based forms of life and therefore implicated in the hypothesized transition from RNA genomes to DNA genomes. Two evolutionarily unrelated Thy enzymes, ThyA and ThyX, are known to catalyze the same biochemical reaction. Both enzymes are sporadically distributed within each of the 3 domains of life in a pattern that suggests multiple nonhomologous lateral gene transfer (LGT) events. We present a phylogenetic analysis of the evolution of the 2 enzymes, aimed at unraveling their entangled evolutionary history and tracing their origin back to early life. A novel probabilistic evolutionary model was developed, which allowed us to compute the posterior probabilities and the posterior expectation of the number of LGT events. Simulation studies were performed to validate the model's ability to accurately detect LGT events, which have occurred throughout a large phylogeny. Applying the model to the Thy data revealed widespread nonhomologous LGT between and within all 3 domains of life. By reconstructing the ThyA and ThyX gene trees, the most likely donor of each LGT event was inferred. The role of viruses in LGT of Thy is finally discussed.

Protein Engineering Design & Selection | 2015

Assessing the prediction fidelity of ancestral reconstruction by a library approach

Hagit Bar-Rogovsky; Adi Stern; Osnat Penn; Iris Kobl; Tal Pupko; Dan S. Tawfik

Ancestral reconstruction is a powerful tool for studying protein evolution as well as for protein design and engineering. However, in many positions alternative predictions with relatively high marginal probabilities exist, and thus the prediction comprises an ensemble of near-ancestor sequences that relate to the historical ancestor. The ancestral phenotype should therefore be explored for the entire ensemble, rather than for the sequence comprising the most probable amino acid at all positions [the most probable ancestor (mpa)]. To this end, we constructed libraries that sample ensembles of near-ancestor sequences. Specifically, we identified positions where alternatively predicted amino acids are likely to affect the ancestor's structure and/or function. Using the serum paraoxonases (PONs) enzyme family as a test case, we constructed libraries that combinatorially sample these alternatives. We next characterized these libraries, reflecting the vertebrate and mammalian PON ancestors. We found that the mpa of vertebrate PONs represented only one out of many different enzymatic phenotypes displayed by its ensemble. The mammalian ancestral library, however, exhibited a homogeneous phenotype that was well represented by the mpa. Our library design strategy that samples near-ancestor ensembles at potentially critical positions therefore provides a systematic way of examining the robustness of inferred ancestral phenotypes.
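Combinatorial sampling of a near-ancestor ensemble can be sketched in a few lines. The example below is an assumption-laden illustration rather than the authors' procedure: it selects alternatives purely by a marginal-probability threshold (`min_prob`, a hypothetical parameter), whereas the abstract describes choosing positions where alternatives are likely to affect structure and/or function. The function name and the toy probabilities are invented for the example.

```python
from itertools import product


def near_ancestor_library(marginals, min_prob):
    """Enumerate sequences combining all residues with marginal prob >= min_prob.

    `marginals` is a list with one dict per site, mapping residue -> marginal
    posterior probability (assumed to come from an ancestral reconstruction).
    """
    choices = []
    for site in marginals:
        # Residues ordered from most to least probable.
        ranked = sorted(site.items(), key=lambda item: -item[1])
        kept = [residue for residue, prob in ranked if prob >= min_prob]
        if not kept:
            # Always keep at least the most probable residue.
            kept = [ranked[0][0]]
        choices.append(kept)
    return ["".join(combo) for combo in product(*choices)]


# Toy marginals for a three-site ancestor (hypothetical numbers).
marginals = [
    {"M": 0.99, "L": 0.01},
    {"K": 0.60, "R": 0.40},
    {"D": 0.55, "E": 0.45},
]
library = near_ancestor_library(marginals, min_prob=0.2)
print(library)
```

Because residues are ranked by probability at every site, the first sequence in the list is the most probable ancestor, and the remaining entries are the alternatives a library would sample around it.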

Genome Research | 2018

Transcriptional fates of human-specific segmental duplications in brain

Max Dougherty; Jason G. Underwood; Bradley J. Nelson; Elizabeth Tseng; Katherine M. Munson; Osnat Penn; Tomasz J. Nowakowski; Alex A. Pollen; Evan E. Eichler

Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their high sequence identity (average >99%) and overlap with human-specific segmental duplications (SDs). We characterized the transcriptional differences between related paralogs to better understand the birth-death process of duplicate genes and particularly how the process leads to gene innovation. In 48% of the cases, we find that the expressed duplicates have changed substantially from their ancestral models due to novel sites of transcription initiation, splicing, and polyadenylation, as well as fusion transcripts that connect duplication-derived exons with neighboring genes. We detect unannotated open reading frames in genes currently annotated as pseudogenes, while relegating other duplicates to nonfunctional status. Our method significantly improves gene annotation, specifically defining full-length transcripts, isoforms, and open reading frames for new genes in highly identical SDs. The approach will be more broadly applicable to genes in structurally complex regions of other genomes where the duplication process creates novel genes important for adaptive traits.
