Yizhi Cai
University of Edinburgh
Publication
Featured research published by Yizhi Cai.
Science | 2014
Narayana Annaluru; Héloïse Muller; Leslie A. Mitchell; Sivaprakash Ramalingam; Giovanni Stracquadanio; Sarah M. Richardson; Jessica S. Dymond; Zheng Kuang; Lisa Z. Scheifele; Eric M. Cooper; Yizhi Cai; Karen Zeller; Neta Agmon; Jeffrey S. Han; Michalis Hadjithomas; Jennifer Tullman; Katrina Caravelli; Kimberly Cirelli; Zheyuan Guo; Viktoriya London; Apurva Yeluru; Sindurathy Murugan; Karthikeyan Kandavelou; Nicolas Agier; Gilles Fischer; Kun Yang; J. Andrew Martin; Murat Bilgel; Pavlo Bohutski; Kristin M. Boulier
Designer Chromosome: One of the ultimate aims of synthetic biology is to build designer organisms from the ground up. Rapid advances in DNA synthesis have allowed the assembly of complete bacterial genomes. Eukaryotic organisms, with their generally much larger and more complex genomes, present an additional challenge to synthetic biologists. Annaluru et al. (p. 55, published online 27 March) designed a synthetic eukaryotic chromosome based on yeast chromosome III. The designer chromosome, shorn of destabilizing transfer RNA genes and transposons, is ∼14% smaller than its wild-type template and is fully functional with every gene tagged for easy removal. A synthetic version of yeast chromosome III with every gene tagged can substitute for the original. Rapid advances in DNA synthesis techniques have made it possible to engineer viruses and biochemical pathways and to assemble bacterial genomes. Here, we report the synthesis of a functional 272,871–base pair designer eukaryotic chromosome, synIII, which is based on the 316,617–base pair native Saccharomyces cerevisiae chromosome III. Changes to synIII include TAG/TAA stop-codon replacements, deletion of subtelomeric regions, introns, transfer RNAs, transposons, and silent mating loci as well as insertion of loxPsym sites to enable genome scrambling. SynIII is functional in S. cerevisiae. Scrambling of the chromosome in a heterozygous diploid reveals a large increase in a-mater derivatives resulting from loss of the MATα allele on synIII. The complete design and synthesis of synIII establishes S. cerevisiae as the basis for designer eukaryotic genome biology.
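The "∼14% smaller" figure can be checked against the two chromosome sizes quoted in the abstract above. The short Python sketch below does only that arithmetic; the function and constant names are illustrative and do not come from the paper.

```python
# Sizes quoted in the abstract above, in base pairs.
NATIVE_CHR3_BP = 316_617  # native S. cerevisiae chromosome III
SYN3_BP = 272_871         # designer chromosome synIII


def percent_reduction(native_bp: int, synthetic_bp: int) -> float:
    """Size reduction of the synthetic sequence, as a percentage of the native one."""
    return 100.0 * (native_bp - synthetic_bp) / native_bp


removed_bp = NATIVE_CHR3_BP - SYN3_BP
reduction = percent_reduction(NATIVE_CHR3_BP, SYN3_BP)
print(f"{removed_bp} bp removed, {reduction:.1f}% smaller")
# prints "43746 bp removed, 13.8% smaller"
```

The result, about 13.8%, is consistent with the rounded ∼14% stated in the summary.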
Nucleic Acids Research | 2009
Michael J. Czar; Yizhi Cai; Jean Peccoud
Chemical synthesis of custom DNA made to order calls for software streamlining the design of synthetic DNA sequences. GenoCAD™ (www.genocad.org) is a free web-based application to design protein expression vectors, artificial gene networks and other genetic constructs composed of multiple functional blocks called genetic parts. By capturing design strategies in grammatical models of DNA sequences, GenoCAD guides the user through the design process. By successively clicking on icons representing structural features or actual genetic parts, complex constructs composed of dozens of functional blocks can be designed in a matter of minutes. GenoCAD automatically derives the construct sequence from its comprehensive libraries of genetic parts. Upon completion of the design process, users can download the sequence for synthesis or further analysis. Users who elect to create a personal account on the system can customize their workspace by creating their own parts libraries, adding new parts to the libraries, or reusing designs to quickly generate sets of related constructs.
Science | 2017
Sarah M. Richardson; Leslie A. Mitchell; Giovanni Stracquadanio; Kun Yang; Jessica S. Dymond; James E. DiCarlo; Dongwon Lee; Cheng Lai Victor Huang; Srinivasan Chandrasegaran; Yizhi Cai; Jef D. Boeke; Joel S. Bader
We describe complete design of a synthetic eukaryotic genome, Sc2.0, a highly modified Saccharomyces cerevisiae genome reduced in size by nearly 8%, with 1.1 megabases of the synthetic genome deleted, inserted, or altered. Sc2.0 chromosome design was implemented with BioStudio, an open-source framework developed for eukaryotic genome design, which coordinates design modifications from nucleotide to genome scales and enforces version control to systematically track edits. To achieve complete Sc2.0 genome synthesis, individual synthetic chromosomes built by Sc2.0 Consortium teams around the world will be consolidated into a single strain by “endoreduplication intercross.” Chemically synthesized genomes like Sc2.0 are fully customizable and allow experimentalists to ask otherwise intractable questions about chromosome structure, function, and evolution with a bottom-up design strategy.
Bioinformatics | 2007
Yizhi Cai; Brian Hartnett; Claes Gustafsson; Jean Peccoud
MOTIVATION The sequence of artificial genetic constructs is composed of multiple functional fragments, or genetic parts, involved in different molecular steps of gene expression mechanisms. Biologists have deciphered structural rules that the design of genetic constructs needs to follow in order to ensure a successful completion of the gene expression process, but these rules have not been formalized, making it challenging for non-specialists to benefit from the recent progress in gene synthesis. RESULTS We show that context-free grammars (CFG) can formalize these design principles. This approach provides a path to organizing libraries of genetic parts according to their biological functions, which correspond to the syntactic categories of the CFG. It also provides a framework for the systematic design of new genetic constructs consistent with the design principles expressed in the CFG. Using parsing algorithms, this syntactic model enables the verification of existing constructs. We illustrate these possibilities by describing a CFG that generates the most common architectures of genetic constructs in Escherichia coli. AVAILABILITY A web site allows readers to experiment with the algorithms presented in this article: www.genocad.org. SUPPLEMENTARY INFORMATION Sequences and models are available at Bioinformatics online.
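As a rough illustration of how a context-free grammar can verify a construct, the Python sketch below recognizes sequences of part categories against a toy grammar. The grammar, the category names, and the function names are assumptions made for this example; they are not the grammar published in the article or used by GenoCAD.

```python
from typing import List, Sequence

# Hypothetical toy grammar (an assumption, not the published one):
#   construct -> cassette+
#   cassette  -> "promoter" cistron+ "terminator"
#   cistron   -> "rbs" "gene"


def parse_cistron(tokens: List[str], i: int) -> int:
    """Return the index just past one cistron starting at i, or -1 if none."""
    if tokens[i:i + 2] == ["rbs", "gene"]:
        return i + 2
    return -1


def parse_cassette(tokens: List[str], i: int) -> int:
    """Return the index just past one cassette starting at i, or -1 if none."""
    if i >= len(tokens) or tokens[i] != "promoter":
        return -1
    j = parse_cistron(tokens, i + 1)
    if j < 0:
        return -1  # at least one cistron is required
    while True:
        k = parse_cistron(tokens, j)
        if k < 0:
            break
        j = k
    if j < len(tokens) and tokens[j] == "terminator":
        return j + 1
    return -1


def is_valid_construct(categories: Sequence[str]) -> bool:
    """True if the whole category sequence is one or more valid cassettes."""
    tokens = list(categories)
    if not tokens:
        return False
    i = 0
    while i < len(tokens):
        i = parse_cassette(tokens, i)
        if i < 0:
            return False
    return True


print(is_valid_construct(["promoter", "rbs", "gene", "terminator"]))  # prints True
print(is_valid_construct(["promoter", "gene", "terminator"]))         # prints False
```

Because this toy grammar is deterministic, a simple recursive-descent recognizer is enough; a richer grammar would likely need a general parsing algorithm.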
Science | 2016
Jef D. Boeke; George M. Church; Andrew Hessel; Nancy J. Kelley; Adam P. Arkin; Yizhi Cai; Rob Carlson; Aravinda Chakravarti; Virginia W. Cornish; Liam J. Holt; Farren J. Isaacs; Todd Kuiken; Marc J. Lajoie; Tracy Lessor; Jeantine E. Lunshof; Matthew T. Maurano; Leslie A. Mitchell; Jasper Rine; Susan J. Rosser; Neville E. Sanjana; Pamela A. Silver; David Valle; Harris H. Wang; Jeffrey C. Way; Luhan Yang
We need technology and an ethical framework for genome-scale engineering. The Human Genome Project (“HGP-read”), nominally completed in 2004, aimed to sequence the human genome and to improve the technology, cost, and quality of DNA sequencing (1, 2). It was biology's first genome-scale project and at the time was considered controversial by some. Now, it is recognized as one of the great feats of exploration, one that has revolutionized science and medicine.
Nucleic Acids Research | 2010
Yizhi Cai; Mandy L. Wilson; Jean Peccoud
One of the foundations of synthetic biology is the project to develop libraries of standardized genetic parts that could be assembled quickly and cheaply into large systems. The limitations of the initial BioBrick standard have prompted the development of multiple new standards proposing different avenues to overcome these shortcomings. The lack of compatibility between standards, the compliance of parts with only some of the standards or even the type of constructs that each standard supports have significantly increased the complexity of assembling constructs from standardized parts. Here, we describe computer tools to facilitate the rigorous description of part compositions in the context of a rapidly changing landscape of physical construction methods and standards. A context-free grammar has been developed to model the structure of constructs compliant with six popular assembly standards. Its implementation in GenoCAD makes it possible for users to quickly assemble from a rich library of genetic parts, constructs compliant with any of six existing standards.
PLOS ONE | 2008
Jean Peccoud; Megan F. Blauvelt; Yizhi Cai; Kristal L. Cooper; Oswald Crasta; Emily C. DeLalla; Clive Evans; Otto Folkerts; Blair M. Lyons; Shrinivasrao P. Mane; Rebecca Shelton; Matthew A. Sweede; Sally A. Waldon
Background The design and construction of novel biological systems by combining basic building blocks represents a dominant paradigm in synthetic biology. Creating and maintaining a database of these building blocks is a way to streamline the fabrication of complex constructs. The Registry of Standard Biological Parts (Registry) is the most advanced implementation of this idea. Methods/Principal Findings By analyzing inclusion relationships between the sequences of the Registry entries, we build a network that can be related to the Registry abstraction hierarchy. The distribution of entry reuse and complexity was extracted from this network. The collection of clones associated with the database entries was also analyzed. The plasmid inserts were amplified and sequenced. The sequences of 162 inserts could be confirmed experimentally but unexpected discrepancies have also been identified. Conclusions/Significance Organizational guidelines are proposed to help design and manage this new type of scientific resource. In particular, it appears necessary to compare the cost of ensuring the integrity of database entries and associated biological samples with their value to the users. The initial strategy that permits including any combination of parts irrespective of its potential value leads to an exponential and economically unsustainable growth that may be detrimental to the quality and long-term value of the resource to its users.
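The idea of an inclusion network can be sketched in a few lines of Python: an edge runs from one entry to another when the first sequence occurs inside the second, and an entry's reuse is the number of entries that contain it. This is a minimal sketch under stated assumptions: the entry names and sequences are made up for illustration, and plain substring matching stands in for whatever comparison the authors actually used.

```python
from typing import Dict, List, Tuple


def inclusion_edges(entries: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (part, container) pairs where part's sequence is a proper substring."""
    edges = []
    for part, part_seq in entries.items():
        for container, container_seq in entries.items():
            if (part != container
                    and len(part_seq) < len(container_seq)
                    and part_seq in container_seq):
                edges.append((part, container))
    return edges


def reuse_counts(entries: Dict[str, str]) -> Dict[str, int]:
    """Count, for each entry, how many other entries contain its sequence."""
    counts = {name: 0 for name in entries}
    for part, _container in inclusion_edges(entries):
        counts[part] += 1
    return counts


# Hypothetical toy entries; the sequences are invented for this example.
toy_entries = {
    "P1": "TTGACA",
    "R1": "AGGAGG",
    "G1": "ATGAAATAA",
    "D2": "TTGACAAGGAGG",           # P1 + R1
    "D1": "TTGACAAGGAGGATGAAATAA",  # P1 + R1 + G1
}

print(reuse_counts(toy_entries))
# prints {'P1': 2, 'R1': 2, 'G1': 1, 'D2': 1, 'D1': 0}
```

The pairwise scan is quadratic in the number of entries, which is fine for a toy set; a database-scale analysis would likely need an indexed search.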
PLOS Computational Biology | 2009
Yizhi Cai; Matthew W. Lux; Laura Adam; Jean Peccoud
Recognizing that certain biological functions can be associated with specific DNA sequences has led various fields of biology to adopt the notion of the genetic part. This concept provides a finer level of granularity than the traditional notion of the gene. However, a method of formally relating how a set of parts relates to a function has not yet emerged. Synthetic biology both demands such a formalism and provides an ideal setting for testing hypotheses about relationships between DNA sequences and phenotypes beyond the gene-centric methods used in genetics. Attribute grammars are used in computer science to translate the text of a program source code into the computational operations it represents. By associating attributes with parts, modifying the value of these attributes using rules that describe the structure of DNA sequences, and using a multi-pass compilation process, it is possible to translate DNA sequences into molecular interaction network models. These capabilities are illustrated by simple example grammars expressing how gene expression rates are dependent upon single or multiple parts. The translation process is validated by systematically generating, translating, and simulating the phenotype of all the sequences in the design space generated by a small library of genetic parts. Attribute grammars represent a flexible framework connecting parts with models of biological function. They will be instrumental for building mathematical models of libraries of genetic constructs synthesized to characterize the function of genetic parts. This formalism is also expected to provide a solid foundation for the development of computer assisted design applications for synthetic biology.
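A heavily simplified Python sketch of the attribute idea follows: each part carries a numeric attribute, a structural rule combines the attributes of a cassette into an expression rate per gene, and the design space of a small library is enumerated exhaustively. The multiplicative rate rule, the part names, and the attribute values are all assumptions made for illustration. The article's formalism uses multi-pass compilation into molecular interaction network models, which this single-pass sketch does not attempt.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List


@dataclass(frozen=True)
class Part:
    name: str
    category: str       # "promoter", "rbs", "gene", or "terminator"
    value: float = 1.0  # hypothetical attribute (strength or efficiency)


def expression_rates(cassette: List[Part]) -> Dict[str, float]:
    """Evaluate a cassette of the form: promoter (rbs gene)+ terminator.

    Assumed rule (for illustration only): rate = promoter.value * rbs.value.
    """
    if (len(cassette) < 4
            or cassette[0].category != "promoter"
            or cassette[-1].category != "terminator"):
        raise ValueError("cassette must be: promoter (rbs gene)+ terminator")
    body = cassette[1:-1]
    if len(body) % 2 != 0:
        raise ValueError("cassette body must be rbs/gene pairs")
    promoter = cassette[0]
    rates = {}
    for rbs, gene in zip(body[0::2], body[1::2]):
        if rbs.category != "rbs" or gene.category != "gene":
            raise ValueError("cassette body must be rbs/gene pairs")
        rates[gene.name] = promoter.value * rbs.value
    return rates


# Hypothetical small parts library.
promoters = [Part("P_strong", "promoter", 2.0), Part("P_weak", "promoter", 0.5)]
rbss = [Part("R_high", "rbs", 1.0), Part("R_low", "rbs", 0.25)]
gene = Part("gfp", "gene")
terminator = Part("T1", "terminator")

# Enumerate every promoter/RBS combination and evaluate each design.
design_space = {
    (p.name, r.name): expression_rates([p, r, gene, terminator])["gfp"]
    for p, r in product(promoters, rbss)
}
for design, rate in design_space.items():
    print(design, rate)
```

Enumerating the full design space is only practical for small libraries, since the number of designs grows as the product of the library sizes.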
Science | 2017
Ze Xiong Xie; Bing-Zhi Li; Leslie A. Mitchell; Yi Wu; Xin Qi; Zhu Jin; Bin Jia; Xia Wang; Bo Xuan Zeng; Hui Min Liu; Xiao Le Wu; Qi Feng; Wen Zheng Zhang; Wei Liu; Ming Zhu Ding; Xia Li; Guang Rong Zhao; Jian Jun Qiao; Jing Sheng Cheng; Meng Zhao; Zheng Kuang; Xuya Wang; J. Andrew Martin; Giovanni Stracquadanio; Kun Yang; Xue Bai; Juan Zhao; Meng Long Hu; Qiu Hui Lin; Wen Qian Zhang
INTRODUCTION The Saccharomyces cerevisiae 2.0 project (Sc2.0) aims to modify the yeast genome with a series of densely spaced designer changes. Both a synthetic yeast chromosome arm (synIXR) and the entirely synthetic chromosome (synIII) function with high fitness in yeast. For designer genome synthesis projects, precise engineering of the physical sequence to match the specified design is important for the systematic evaluation of underlying design principles. Yeast can maintain nuclear chromosomes as rings, occurring by chance at repeated sequences, although the cyclized format is unfavorable in meiosis given the possibility of dicentric chromosome formation from meiotic recombination. Here, we describe the de novo synthesis of synthetic yeast chromosome V (synV) in the “Build-A-Genome China” course, perfectly matching the designer sequence and bearing loxPsym sites, distinguishable watermarks, and all the other features of the synthetic genome. We generated a ring synV derivative with user-specified cyclization coordinates and characterized its performance in mitosis and meiosis. RATIONALE Systematic evaluation of underlying Sc2.0 design principles requires that the final assembled synthetic genome perfectly match the designed sequence. Given the size of yeast chromosomes, synthetic chromosome construction is performed iteratively, and new mutations and unpredictable events may occur during synthesis; even a very small number of unintentional nucleotide changes across the genome could have substantial effects on phenotype. Therefore, precisely matching the physical sequence to the designed sequence is crucial for verification of the design principles in genome synthesis. Ring chromosomes can extend those design principles to provide a model for genomic rearrangement, ring chromosome evolution, and human ring chromosome disorders. RESULTS We chemically synthesized, assembled, and incorporated designer chromosome synV (536,024 base pairs) of S. cerevisiae according to Sc2.0 principles, based on the complete nucleotide sequence of native yeast chromosome V (576,874 base pairs). This work was performed as part of the “Build-A-Genome China” course at Tianjin University. We corrected all mutations found (including duplications, substitutions, and indels) in the initial synV strain by using integrative cotransformation of the precise desired changes and by means of a clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9)–based method. Altogether, 3331 corrected base pairs were required to match the designed sequence. We generated a strain that exactly matches all designer sequence changes and displays high fitness under a variety of culture conditions. All corrections were verified with whole-genome sequencing; RNA sequencing revealed only minor changes in gene expression, most notably decreases in expression of genes relocated near synthetic telomeres as a result of design. We constructed a functional circular synV (ring_synV) derivative in yeast by precisely joining both chromosome ends (telomeres) at specified coordinates. The ring chromosome showed restoration of subtelomeric gene expression levels. The ring_synV strain exhibited fitness comparable with that of the linear synV strain and showed no change in sporulation frequency, but had notably reduced spore viability. In meiosis, heterozygous or homozygous diploid ring_wtV and ring_synV chromosomes behaved similarly, exhibiting substantially higher frequency of the formation of zero-spore tetrads, a type that was not seen in the rod chromosome diploids. Rod synV chromosomes went through meiosis with high spore viability, despite no effort having been made to preserve meiotic competency in the design of synV.
CONCLUSION The perfect designer-matched synthetic chromosome V provides strategies to edit sequence variants and correct unpredictable events, such as off-target integration of extra copies of synthetic DNA elsewhere in the genome. We also constructed a ring synthetic chromosome derivative and evaluated its fitness and stability in yeast. Both synV and synVI can be circularized and can power yeast cell growth without affecting fitness when gene content is maintained. These fitness and stability phenotypes of the ring synthetic chromosome in yeast provide a model system with which to probe the mechanism of human ring chromosome disorders. Synthesis, cyclization, and characterization of synV. (A) Synthetic chromosome V (synV, 536,024 base pairs) was designed in silico from native chromosome V (wtV, 576,874 base pairs), with extensive genotype modification designed to be phenotypically neutral. (B) CRISPR/Cas9 strategy for multiplex repair
Science | 2017
Leslie A. Mitchell; Ann Wang; Giovanni Stracquadanio; Zheng Kuang; Xuya Wang; Kun Yang; Sarah M. Richardson; J. Andrew Martin; Yu Zhao; Roy Walker; Hongjiu Dai; Kang Dong; Zuojian Tang; Yanling Yang; Yizhi Cai; Adriana Heguy; Beatrix Ueberheide; David Fenyö; Junbiao Dai; Joel S. Bader; Jef D. Boeke
INTRODUCTION Total synthesis of designer chromosomes and genomes is a new paradigm for the study of genetics and biological systems. The Sc2.0 project is building a designer yeast genome from scratch to test and extend the limits of our biological knowledge. Here we describe the design, rapid assembly, and characterization of synthetic chromosome VI (synVI). Further, we investigate the phenotypic, transcriptomic, and proteomic consequences associated with consolidation of three synthetic chromosomes (synVI, synIII, and synIXR) into a single poly-synthetic strain. RATIONALE A host of Sc2.0 chromosomes, including synVI, have now been constructed in discrete strains. With debugging steps, where the number of bugs scales with chromosome length, all individual synthetic chromosomes have been shown to power yeast cells to near wild-type (WT) fitness. Testing the effects of Sc2.0 chromosome consolidation to uncover possible synthetic genetic interactions and/or perturbations of native cellular networks as the number of designer changes increases is the next major step for the Sc2.0 project. RESULTS SynVI was rapidly assembled using nine sequential steps of SwAP-In (switching auxotrophies progressively by integration), yielding a ~240-kb synthetic chromosome designed to Sc2.0 specifications. We observed partial silencing of the left- and rightmost genes on synVI, each newly positioned subtelomerically relative to their locations on native VI. This result suggests that consensus core X elements of Sc2.0 universal telomere caps are insufficient to fully buffer telomere position effects. The synVI strain displayed a growth defect characterized by an increased frequency of glycerol-negative colonies. The defect mapped to a synVI design feature in the essential PRE4 gene (YFR050C), encoding the β7 subunit of the 20S proteasome. Recoding 10 codons near the 3′ end of the PRE4 open reading frame (ORF) caused a ~twofold reduction in Pre4 protein level without affecting RNA abundance.
Reverting the codons to the WT sequence corrected both the Pre4 protein level and the phenotype. We hypothesize that the formation of a stem loop involving recoded codons underlies reduced Pre4 protein level. Sc2.0 chromosomes (synI to synXVI) are constructed individually in discrete strains and consolidated into poly-synthetic (poly-syn) strains by “endoreduplication intercross.” Consolidation of synVI with synthetic chromosomes III (synIII) and IXR (synIXR) yields a triple-synthetic (triple-syn) strain that is ~6% synthetic overall—with almost 70 kb deleted, including 20 tRNAs, and more than 12 kb recoded. Genome sequencing of double-synthetic (synIII synVI, synIII synIXR, synVI synIXR) and triple-syn (synIII synVI synIXR) cells indicates that suppressor mutations are not required to enable coexistence of Sc2.0 chromosomes. Phenotypic analysis revealed a slightly slower growth rate for the triple-syn strain only; the combined effect of tRNA deletions on different chromosomes might underlie this result. Transcriptome and proteome analyses indicate that cellular networks are largely unperturbed by the existence of multiple synthetic chromosomes in a single cell. However, a second bug on synVI was discovered through proteomic analysis and is associated with alteration of the HIS2 transcription start as a consequence of tRNA deletion and loxPsym site insertion. Despite extensive genetic alterations across 6% of the genome, no major global changes were detected in the poly-syn strain “omics” analyses. CONCLUSION Analyses of phenotypes, transcriptomics, and proteomics of synVI and poly-syn strains reveal, in general, WT cell properties and the existence of rare bugs resulting from genome editing. Deletion of subtelomeres can lead to gene silencing, recoding deep within an ORF can yield a translational defect, and deletion of elements such as tRNA genes can lead to a complex transcriptional output. 
These results underscore the complementarity of transcriptomics and proteomics to identify bugs, the consequences of designer changes in Sc2.0 chromosomes. The consolidation of Sc2.0 designer chromosomes into a single strain appears to be exceptionally well tolerated by yeast. A predictable exception to this is the deletion of tRNAs, which will be restored on a separate neochromosome to avoid synthetic lethal genetic interactions between deleted tRNA genes as additional synthetic chromosomes are introduced. Debugging synVI and characterization of poly-synthetic yeast cells. (A) The second Sc2.0 chromosome to be constructed, synVI, encodes a “bug” that causes a variable colony size, dubbed a “glycerol-negative growth-suppression defect.” (B) Synonymous changes in the essential PRE4 ORF lead to a reduced protein level, which underlies the growth defect