Computational tools for drawing, building and displaying carbohydrates: a visual guide
Kanhaya Lal‡1,2, Rafael Bermeo‡1,2 and Serge Perez*1
Review, Open Access
Address: Univ. Grenoble Alpes, CNRS, CERMAV, 38000 Grenoble, France and Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, I-20133, Italy
Email: Serge Perez* - [email protected]
* Corresponding author ‡ Equal contributors
Keywords: bioinformatics; carbohydrate; glycan; glycobiology; nomenclature; oligosaccharide; polysaccharide; representation; structure
Beilstein J. Org. Chem.
Abstract
Drawing and visualisation of molecular structures are some of the most common tasks carried out in structural glycobiology, typically using various software. In this perspective article, we outline developments in the computational tools for the sketching, visualisation and modelling of glycans. The article also provides details on the standard representation of glycans and glycoconjugates, which helps the communication of structural details within the scientific community. We highlight a comparative analysis of the available tools, which could help researchers to perform various tasks related to structure representation and model building of glycans. These tools can be useful for glycobiologists or any researcher looking for a ready-to-use, simple program for the sketching or building of glycans.
Introduction
Glycoscience is a rapidly emerging and evolving scientific discipline. One of its current challenges is to keep up with and adapt to the increasing volume of data available in the present scientific environment. Indeed, the rise of accessible experimental data has changed the landscape of how research is performed. The accessibility of this information, coupled with the emergence of new platforms and technologies, has benefitted glycoscience to the point of enabling the detection and high-resolution determination and representation of complex glycans [1]. Increasing numbers of carbohydrate sequences have accumulated through extensive work on chemical and biochemical fragmentation followed by analysis using mass spectrometry, nuclear magnetic resonance, crystallography and computational modelling. Initiatives by independent research groups worldwide have pushed the development of visual tools to improve aspects of glycan identification, quantification and visualisation, some of which are discussed further in this article.
Biological molecules express their function through their three-dimensional structures. For this reason, structural biology places great emphasis on the three-dimensional structure as a central element in the characterisation of biological function. An adequate understanding of biomolecular mechanisms inherently requires the ability to model and visualise them. Visualisation of molecular structures is thus one of the most common tasks performed by structural biologists. As an essential part of the research process, data visualisation not only allows experimental results to be communicated but is also a crucial step in the integration of data derived from multiple sources, such as thermodynamic and kinetic analysis, glycan arrays, mutagenesis, etc. Data visualisation remains a challenge in glycoscience for both developers and end-users, even for the simple task of describing molecular structures. Progress in this area makes it possible to translate a static visualisation of single molecules into dynamic views of large, complex, interacting macromolecular assemblies, which increases our understanding of biological processes.
Representing the structures of carbohydrates has historically been considered a complicated task. Starting from the linear form of the Fischer projection, which is certainly not a realistic representation of a carbohydrate structure, there has been a continuous development and evolution of the description of monosaccharides [2]. Glycans are puzzles to many chemists and biologists, as well as to bioinformaticians. This complexity occurs at different levels (which makes it incremental). Amongst the most recognisable “sugars”, glucose is merely one of 60+ monosaccharides, all of which are, in truth, pairs of mirror-image enantiomers (ʟ and ᴅ). Moreover, monosaccharides occur in two ring forms: a 5-atom ring (furanose) and a 6-atom ring (pyranose). With the occurrence of a statistically rarer “open form”, we obtain at least 6 “correct” representations of glucose.
And yet, monosaccharides are only the chemical units and the individual building blocks of much more complex molecules: the carbohydrates, also referred to as glycans. The glycan family can be grouped into the following categories: (i) oligosaccharides (comprising two to ten monosaccharides linked together either linearly or branched); (ii) polysaccharides (for glycan chains composed of more than ten monosaccharides); (iii) glycoconjugates (where the glycan chains are covalently linked to proteins (glycoproteins) or lipids (glycolipids)). The complexity of glycans is a consequence of their branched structure and the range of building blocks available. Other levels of complexity include the nature of the glycosidic linkage (anomeric configuration, position and angles), the number of repeating units (polysaccharides) as well as the substitutions of the monosaccharides. Regardless of the different nomenclatures available to describe each monosaccharide, representing and encoding a glycan structure into a file is required for communication among scientists as well as for data processing.
As a consequence, glycobiologists have proposed different graphical representations, with symbols or chemical structures replacing monosaccharides. The description of carbohydrate structures using standard symbolic nomenclature enables easy understanding and communication within the scientific community. Research groups working on carbohydrates have developed schematic depictions with symbols [3] and expansions with greyscale colouring, such as the so-called Oxford nomenclature (UOXF) [4,5], and later even fully coloured schemes. Among these, some of the proposed representation forms have been accepted and implemented by several groups and initiatives, notably the Consortium for Functional Glycomics (CFG) [6].
Whereas the initial versions of such representations were limited to mammalian glycans, an extension of the graphical representation of glycans, called the Symbol Nomenclature for Glycans (SNFG) [7,8], resulted from a joint international agreement. The newly proposed nomenclature covers 67 monosaccharides represented in eleven shapes and ten colours. The hope is that it will cope better with the rapidly growing information on the structure and functions of glycans and polysaccharides from microbes, plants and algae. The rendering of glycan drawings and symbol representations motivated the development of several computer applications using a standardised notation. The earliest glycan editors allowed manual drawing similar to ChemDraw or used input files with the glycan sequence in KCF (KEGG Chemical Function) [9] text format for similarity searches against other structures deposited in the databases. Later developments supported the construction and representation of glycan structures in symbolic form by computational tools like GlycanBuilder [10]. Since then, several advancements have allowed the user either to draw glycans manually or to import and export the structure files in different text formats [11].
Along the same line, the development of various other applications allowed users to sketch 2D glycan structures by dragging and dropping monosaccharides onto a canvas and to generate 3D structures for further use. These depictions are provided in Protein Data Bank (PDB) [12] format or in the form of images [13,14]. In addition, these tools for representing glycans in 2D and 3D [15] allowed the integration of glycans into protein structures or complexes. The tools developed in the last few years have automated the sketching of glycans and glycopeptides, allowing rapid display of structures using the IUPAC format [16] as input. This article explores and illustrates the concepts of “sketching”, “building” and “viewing” glycans (Figure 1).
It provides a descriptive analysis of the tools available for such activities, which can be useful for researchers looking for a ready-to-use, simple program for sketching, building and 3D structure analysis of glycans and glycoconjugates. The scope of this work is relevant to N- and O-linked glycans, glycolipids, proteoglycans and glycosaminoglycans, lipopolysaccharides, and plant, algal and bacterial polysaccharides.
Figure 1: Levels of representation of glycans: from sketching to virtual reality.
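To make the idea of a symbol nomenclature concrete, the following minimal sketch stores a small subset of the SNFG shape and colour assignments as a lookup table. The table name, the function name and the choice of eight monosaccharides are illustrative assumptions; the colours are given by name only, whereas the SNFG reference defines exact colour values, and the full set covers many more residues.

```python
# Illustrative subset of SNFG assignments (shape, colour name).
# Names of the table and function are hypothetical, not part of any tool.
SNFG_SYMBOLS = {
    "Glc": ("circle", "blue"),
    "Gal": ("circle", "yellow"),
    "Man": ("circle", "green"),
    "GlcNAc": ("square", "blue"),
    "GalNAc": ("square", "yellow"),
    "Fuc": ("triangle", "red"),
    "Neu5Ac": ("diamond", "purple"),
    "Xyl": ("star", "orange"),
}


def snfg_symbol(residue: str) -> tuple:
    """Return (shape, colour) for a monosaccharide abbreviation."""
    try:
        return SNFG_SYMBOLS[residue]
    except KeyError:
        # Only a subset is tabulated here; unknown names are reported clearly.
        raise KeyError(f"no SNFG entry tabulated for {residue!r}") from None


if __name__ == "__main__":
    for name in ("Gal", "Glc"):  # the two residues of lactose
        print(name, snfg_symbol(name))
```

A drawing tool that renders from such a table, rather than from ad hoc choices, would give the same shape and colour for a given residue on every platform.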
Review
Methods
To facilitate glycoscience research, we have identified the tools and databases that are freely available on the internet and are regularly updated and improved [1]. The variety and complexity of glycan structures make their interpretation challenging. Consequently, in the past few years, several sketching, building and visualisation tools have been developed to better depict and understand complex glycan structures. In this study, the freely available tools were visited (April 2020) and analysed to highlight their core features and to explore their unique advancements in facilitating glycan research. Each of the computational tools was inspected for general features related to sketching, representing and model building, all of which could be further used as input for translation into other formats, searches of glycan databases or complex calculations such as molecular simulations. Several tools feature an interactive interface which allows for manual editing of the structures. Examples of such tools are DrawRINGS [17], KegDraw [18], Glycano (available at http://glycano.cs.uct.ac.za/), GlycoEditor [19], GlycanBuilder [20], etc. These tools (except KegDraw) are provided with the list of CFG symbols to freely build glycan structures using the mouse on the canvas. In addition to manual sketching, some of these tools can also import text formats, including IUPAC-condensed, GlycoCT and KEGG Chemical Function (KCF), to display the glycan structures. Some applications also facilitate glycan searches in various databases. Another category of tools included in this study is that of glycan viewers, which can only depict structures using the IUPAC three-letter code or IUPAC-condensed nomenclature as input. These tools convert the input into a 2D image or 3D representation using SNFG symbols or the 3D-SNFG illustration.
Additionally, 3D representation of structures is provided by tools such as Visual Molecular Dynamics (VMD) [21] and LiteMol [22], which allow for quick analysis of structural features in 3D space. All the tools mentioned were evaluated against a set of pre-selected criteria relating to ease of use, scientific precision and content, among others.
Table S1 (Supporting Information File 1) schematically summarises how these criteria are fulfilled. The analysis of the tools for input and output formats also provided information about their versatility in converting results into the standard or desired format. The tools have been assigned to categories such as “sketcher”, “builder” and “viewer”, with possible overlaps. A brief analysis of each application, ordered by category, is given in the next section.
Sketching with the free hand
As a preview of the following parts of this study, we performed an initial test of the tools available for the representation of a simple disaccharide: lactose (β-D-Galp-(1→4)-D-Glcp). Figure 2 shows how different web-available platforms rendered it. On the one hand, thanks to the unified nomenclature, there is no ambiguity regarding the nature of the carbohydrate represented. On the other hand, small differences between sketches appear. Such variations will multiply with the increasing complexity of the carbohydrates. It is, therefore, essential to choose which tools to use before starting an hour-long “drawing-spree”. The colour code used to represent the monosaccharides shows striking differences across platforms even though the appropriate colours are strictly defined (https://www.ncbi.nlm.nih.gov/glycans/snfg.html).
Building with scientific accuracy
The necessity for precision is what, at some point, turns carbohydrate sketching into building. What defines this turning point (besides a certain level of accuracy) is the intended purpose of the produced figures/images. Scientific communication, comparison between similar yet different structures, or merely showcasing the complexity of carbohydrates: none of these three cases can rely on a sketching tool to convey its message. Consequently, a new set of considerations appears. The requirement for accurate depiction comes from the above-mentioned complexity of carbohydrates: anomeric configuration, substitution, glycosidic bond position, and repeating units (as well as tethering to larger macromolecules, and more). For the sake of accuracy, only the right combination of characteristics should be depicted, leaving no ambiguity: every relevant piece of data should be detailed.
Figure 2: Depiction of lactose by various glycan sketching tools.
The glycosidic linkage is a perfect example to illustrate the necessity for accuracy in building, as opposed to sketching. While a simple line is enough to link two monosaccharides, it is necessary to define the linkage as alpha or beta (or unknown) and to state the positions of the glycosyl acceptor and even the donor. Cellulose and amylose are two glucose-based polysaccharides that differ only in the nature of their glycosidic bond, and yet they have entirely different shapes and thus different biological roles. For the sake of completeness, the full description of a monosaccharide should obey the following rules:
Force fields for carbohydrates, 3D model building and beyond
Carbohydrates present various challenges to the development of force fields [23]. The tertiary structures of monosaccharides usually have a high number of chiral centres, which increases the structural diversity and complexity. The structural diversity changes the electrostatic landscape of molecules; thus, it poses challenges in the development of force fields for accurate modelling of such variations in charge distributions. The monosaccharides can further form a large number of oligosaccharides, which can enormously increase the conformational space due to a high number of rotatable bonds. Nonetheless, recent developments in carbohydrate force fields make it possible to model and reproduce the energies associated with minute geometrical changes. The currently available force fields which are parameterised for carbohydrates are also capable of carrying out simulations of oligosaccharides containing additional groups like sulfates, phosphates, etc. [24]. Commonly used force fields for the Molecular Dynamics (MD) simulation of carbohydrates are CHARMM [25], GLYCAM [26], and GROMOS [27]. The structural complexity increases the computational cost, which makes simulations of large systems more challenging. Therefore, coarse-grained models [28] for carbohydrates are generally used for molecular modelling of large systems.
In terms of 3D model building, the complex topologies of glycans require dedicated molecular building procedures to convert sequence information into reliable 3D models. These tools generally use 3D molecular templates of monosaccharides to reconstruct a 3D model. Energy minimisation methods can further refine the models. These models are essential for structure-based studies and complex calculations like Molecular Dynamics simulations. Therefore, accurate model building requires the use of reliable databases to generate the atomic coordinates and topology of an acceptable model.
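The rotatable bonds mentioned above are usually described by torsion angles about the glycosidic linkage. As a minimal sketch, the function below computes a signed torsion angle from four atomic positions in plain Python. One common convention (an assumption here, since several are in use) defines φ over O5-C1-O-Cx and ψ over C1-O-Cx-C(x+1); the coordinates in the example are made-up points chosen to give an easily checked angle, not atoms from a real structure.

```python
import math


def dihedral(p1, p2, p3, p4):
    """Signed torsion angle in degrees for four points (x, y, z).

    The sign is positive when, looking along p2 -> p3, the front bond
    (p2-p1) must be turned clockwise to eclipse the rear bond (p3-p4).
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1 = cross(b1, b2)          # normal to the plane p1-p2-p3
    n2 = cross(b2, b3)          # normal to the plane p2-p3-p4
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical points forming a right-angle torsion.
    print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                   (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```

Applied to the linkage atoms of each pair of residues in a model, such a function gives the φ/ψ values that a builder sets and that a minimisation or simulation subsequently adjusts.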
Some of the computational tools contain atom coordinates of commonly used monosaccharides (as templates) and also use libraries of bond and angle parameters from various force fields dedicated to carbohydrates. Accurately predicted oligosaccharide conformations are good starting points for further investigations. Of particular interest are the evaluations of the dynamics of glycans and their interactions with proteins, which are a major concern in glycoscience.
The joint need to better perceive and manipulate the three-dimensional objects that make up molecular structures is leading to a rapid appropriation of Virtual Reality (VR) techniques by the molecular biology community. Generic definitions describe VR as immersion in an interactive, reactive virtual world. The computer-generated graphics provide a realistic rendering of an immersive and dynamic environment that responds to the user's requests. One finds in these definitions the three pillars that define VR: Immersion, Interaction, Information. Although it is difficult to extract a single, simple definition of VR, the main idea is to put the user at the centre of a dynamic and reactive, artificially created VR environment which supplants the real world for the duration of the experiment.
Input and output for sketching, building and displaying applications
The variety and complexity of carbohydrate structures hamper the use of a unique nomenclature. The choice of notation depends on whether the study is focused on chemistry or has a more biological approach. The IUPAC-IUBMB (International Union for Pure and Applied Chemistry and International Union for Biochemistry and Molecular Biology) terminologies, in their extended and condensed forms [16], govern the naming of the primary structure or sequence.
Further down the line, the complexity of the existing nomenclatures for carbohydrate-containing molecules remains a significant hurdle to their practical use and exchange within and outside the glycoscience community. Linearising the description of the structure is one way to cope with the structural complexity. The proposed formats provide rules to extract the structure of the branches and create a unique sequence for the carbohydrate. The most commonly used formats are IUPAC [16], GlycoCT [29], KCF [9], and WURCS [30]. The sketching of carbohydrates using computational tools generally requires textual input and output in at least one of these formats (Figure 3). An alternative input method involves manual sketching of 2D glycan structures by dragging and dropping monosaccharide symbols on a canvas (with or without grids) and then connecting them. This method makes the sketching tools more friendly and interactive, as it does not require a long text code as input. Both input methods are compliant with the Symbol Nomenclature for Glycans (SNFG). Another symbolic representation, which can clearly distinguish monosaccharides in monochrome, is the Oxford notation [5]. In this method, dashed and solid lines represent the alpha and beta glycosidic linkages, respectively. A few tools have implemented this method, while other tools use text to highlight this information in the structures. In addition to sketching tools, some applications specific to the field of carbohydrates provide the possibility to visualise and display 3D structures. These visualisation tools accept strings or files in text formats (GlycoCT, IUPAC-condensed, KCF) to display the structure via a graphical user interface. For instance, the DrawGlycan-SNFG [31] tool uses IUPAC-condensed nomenclature for the input string and converts it into a 2D image represented in SNFG symbols. Similarly, 3D-SNFG [15] can generate glycan structures by incorporating SNFG symbols in 3D space for further visualisation using computational tools like Visual Molecular Dynamics (VMD) [21], LiteMol [22] and Sweet Unity Mol [32].
Figure 3: Examples of different glycan structure text formats for the same glycan. Data in these formats are generally used as input/output in glycan drawing and 3D structure building tools.
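As an illustration of how a linearised text format carries the linkage information, the sketch below parses an unbranched sequence written in an ASCII rendering of the IUPAC-condensed style, such as Gal(b1-4)Glc. The exact string syntax accepted (parenthesised linkages, a/b/? for the anomer, single-digit positions) and all names in the code are assumptions made for this example; branched structures and the other formats discussed above are deliberately out of scope and are rejected.

```python
import re
from typing import List, NamedTuple, Optional


class Residue(NamedTuple):
    name: str                    # e.g. "Gal", "GlcNAc"
    anomer: Optional[str]        # "a", "b" or "?"; None at the reducing end
    donor_pos: Optional[int]     # anomeric carbon of this residue
    acceptor_pos: Optional[int]  # linked position on the next residue


# One residue name, optionally followed by a linkage such as "(b1-4)".
_TOKEN = re.compile(r"([A-Za-z0-9]+)(?:\(([ab?])(\d)-(\d)\))?")


def parse_linear(sequence: str) -> List[Residue]:
    """Parse an unbranched condensed string into residues, left to right."""
    residues = []
    pos = 0
    while pos < len(sequence):
        match = _TOKEN.match(sequence, pos)
        if match is None:
            raise ValueError(f"cannot parse {sequence!r} at offset {pos}")
        name, anomer, donor, acceptor = match.groups()
        residues.append(Residue(
            name,
            anomer,
            int(donor) if donor else None,
            int(acceptor) if acceptor else None,
        ))
        pos = match.end()
    return residues


if __name__ == "__main__":
    print(parse_linear("Gal(b1-4)Glc"))   # lactose
    # Cellulose-like and amylose-like repeats differ only in the anomer.
    print(parse_linear("Glc(b1-4)Glc")[0].anomer,
          parse_linear("Glc(a1-4)Glc")[0].anomer)
```

Keeping the anomer and both positions as explicit fields is what separates a "built" structure from a sketch: two strings that would draw as the same pair of circles remain distinguishable in the data.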
Glycan sketchers
SugarSketcher.
SugarSketcher [14] is a JavaScript interface module currently included in the tool collection of Glycomics@ExPASy (available at https://glycoproteome.expasy.org/sugarsketcher/) for online drawing of glycan structures. The interactive graphical interface (Figure 4, top) allows glycan drawing by glycobiologists and non-expert users. In particular, a “Quick Mode” helps users with limited knowledge of glycans to build up a structure quickly as compared to the normal mode, which offers options related to the structural features of complex carbohydrates (for example additional monosaccharides, isomers, ring types, etc.). Glycan structures are built with the mouse by selecting monosaccharides, substituents and linkages from the list of symbols. However, some wrong combinations of choices can block the interface, resulting in the need to restart the process (SugarSketcher is on version beta 1.3). Alternatively, SugarSketcher also accepts GlycoCT or a native template library as input. A list of pre-built core N- and O-linked carbohydrate moieties, which are usually present in glycoprotein structures, can be used as templates for further modification. A shortlist of glycan epitopes is also included, providing templates for drawing more complex molecules. The software uses the Symbol Nomenclature for Glycans (SNFG) notation for structure representation and exports the obtained sketch to text format (GlycoCT) or image (.svg) files. SugarSketcher is featured in the web portal GlyCosmos (https://glycosmos.org/glytoucans/graphic) [33]. Under the name “SugarDrawer”, it provides an interface for generating carbohydrate structures to query the database included in GlyCosmos: GlyTouCan [34].
GlyCosmos is a web portal that integrates resources linking glycosciences with life sciences.
Besides elements such as “SugarDrawer” and GlyTouCan (carbohydrate database), the platform GlyCosmos assembles data resources ranging from glycoscience standard ontologies to pathologies associated with glycans. GlyCosmos is recognized as the official portal of the Japanese Society for Carbohydrate Research and provides information about genes, proteins, lipids, pathways and diseases.
GlyTouCan (Figure 5) is a freely available repository for the registration of glycan structures. The repository can register structures ranging from monosaccharide compositions to fully defined structures of glycans. It assigns a unique accession number to any glycan to identify its structure and also lets users look up its ID number in other databases. Alternatively, users can search and retrieve information about the glycan structures and motifs that have already been registered in the repository. The structures can be searched simply by browsing through the list of already registered glycans or by specifying a particular sub-structure to retrieve structurally similar glycans (https://glytoucan.org/Structures/graphical). The software tool featured in the GlyTouCan website is called GlycanBuilder and is presented in a later section of our analysis.
Recapitulating, SugarSketcher can be an efficient tool for non-glycobiologists or glycobiologists to sketch glycans. However, it does not accept other input or output formats such as IUPAC or WURCS (Web3 Unique Representation of Carbohydrate Structures); supporting them would make the tool more versatile.
Figure 4: From top to bottom: SugarSketcher [36] interface with a glycan structure drawn using the “Quick Mode”. LiGraph interface showing input and output options for glycan structure representation. GlycoGlyph [37] interface with a text input (modified IUPAC condensed) converted into its glycan image.
Figure 5: GlyTouCan [38] interface allows searching for glycan structures in the database. Data contained in the GlyCosmos portal (https://glycosmos.org/) and in the GlyTouCan repository home page (https://glytoucan.org/), including their logos, are licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
LiGraph.
GlycoGlyph.
GlycoGlyph [39] is a web-based application (available at https://glycotoolkit.com/Tools/GlycoGlyph/) built using JavaScript which allows users to draw structures dynamically using a graphical user interface or via a text string in the CFG linear (also known as modified IUPAC condensed) nomenclature. The interface (Figure 4, bottom) is equipped with templates for N- and O-linked glycans and terminals. It also provides 80+ monosaccharide (SNFG) symbols and a selection of substituents. The selected template or text string (in CFG linear nomenclature) is converted directly into an image on the canvas and also appears as text in GlycoCT format. The output can be saved as a .svg file or as GlycoCT text. The interface also provides additional options to add, replace or delete each monosaccharide, modify the sizes of symbols and text fonts, and turn off the linkage annotations or change their orientation, all of which increase the usability of the software. The input structure can be further used to search the GlyTouCan [34] database to explore the literature details related to it.
GlycoGlyph is an efficient tool for sketching or building glycans, with a highly usable interface that can significantly help researchers to improve the uniformity of glycan formats in literature/manuscripts. It can also be a tool of choice for text mining for the query structure.
GlycanBuilder2. , GLYcan structural Data Exchange using Connection Tables (GLYDE-II), Bacterial Carbohydrate Structures DataBase (BCSDB) [41], carbohydrate sequence markup language (CabosML) [42], CarbBank [43], LinearCode [44], LINUCS, IUPAC-condensed and GlycosuiteDB [45]. The output yields structures in the following formats: GlycoCT, LinearCode, GLYDE-II and LINUCS. Thus, GlycanBuilder2 is a versatile tool which can be used for glycan sketching or building and also as a glycan sequence converter from one format to another.
Original GlycanBuilder
DrawRINGS.
DrawGlycan-SNFG.
DrawGlycan-SNFG [31] is an open-source program available with a web interface (Figure 7, top) at
Figure 6: From top to bottom: GlycanBuilder2 [46] interface with a glycan image in SNFG notation. Original GlycanBuilder [47] interface with some of the available templates rendered as images. DrawRINGS [48] interface featuring a glycan and its KCF text output.
Figure 7: From top to bottom: DrawGlycan-SNFG [51] web interface with a glycan text input and the resulting image output. Glycano [52] interface with a glycan structure. GlycoEditor [53] interface; linkage selection is triggered by adding a new monosaccharide.
Glycano.
Glycano (available at http://glycano.cs.uct.ac.za) is a software tool for drawing glycans. The tool is written in JavaScript and can be used without any server or browser dependency. The interactive interface allows sketching via the drag-and-drop method on a canvas (with or without grid). The software is provided with two interchangeable interfaces, “UCT” and “ESN” (Figure 7, middle), with different symbols for monosaccharides. These names correspond to the University of Cape Town, South Africa, where Glycano was developed, and to the “Essentials of Glycobiology Symbol Nomenclature”, precursor of the SNFG symbol set [55]. The interface provides a wide choice of monosaccharides and substituents represented in SNFG symbols but lacks the standard colour scheme. The user can easily modify the structure by click and drag, which allows a portion of the structure to be cut/copied, deleted or moved. The drawn structure can be saved in text format, in .gly format or as an image (PNG and SVG formats). A drawback to note is that linking the monosaccharides at specific positions is only possible in the UCT mode, which means that switching back and forth between the two symbol systems is necessary to define the linkages correctly. Despite some drawbacks, this is an excellent tool due to its ease of use, tenable degree of freedom, and functionalities/options for sketching and building glycan structures.
GlycoEditor.
GlycoEditor [19] (available at https://jcggdb.jp/idb/flash/GlycoEditor.jsp) is an online software for drawing glycans. Through a straightforward interface, three ways of input are possible: by JCGGDB ID, through a library of common oligosaccharides and by direct input. A list of the most common monosaccharides is presented, and the rest can be found categorised by family. The click-and-drag addition of a new monosaccharide triggers the selection of linkage type and configuration (Figure 7, bottom). The tool provides an option to create repeating units. Additionally, several functionalisation options are available. Once the structure is ready, the user can save it as an .xml file. GlycoEditor allows searching for a given structure across many databases in four ways: exact structure match (with or without anomer and linkage specifics) and the same for substructure match. The central database featured is the JCGGDB, to which can be added, among others: Glaxy, GlycomeDB, GlycoEpitope, GMDB, KEGG, etc. Searching by ID is also possible. GlycoEditor is a now dated tool that nevertheless allows glycans to be built and database searches to be performed efficiently.
GLYCO.ME (SugarBuilder).
Glyco.me-SugarBuilder (available at https://beta.glyco.me/sugarbuilder) is online software for drawing glycans. The interface leads to rapid carbohydrate construction. A panel of monosaccharide templates complements the drawing interface (one pre-built oligosaccharide is available) (Figure 8, top). The user can start a chain from the amino acid residues Asn, Ser or Thr; structure building is then limited by a set of “rules” (limiting building options to known carbohydrates). These rules may be deactivated with a switch button to draw freely. A list of 13 monosaccharides is displayed, and sequential clicking allows their addition to the existing structure and the definition of the associated glycosidic bond (the relative sizes of the options available relate to their real statistical value for that particular linkage). Upon building some specific motifs, if they are recognised, an option for repeating units appears. Other switch buttons allow the user to change the orientation of the drawing, show/hide linkage information, etc. The Oxford notation can be enabled for glycosidic bonds only. The structure obtained can be rendered as .png or .svg images. Glyco.me-SugarBuilder is still under development: more monosaccharides/substitutions/templates will complete an already very functional platform. The quick and easy options put forward offer natural building and liberty for tailoring the rendered image.
Figure 8: From top to bottom: Glyco.me SugarBuilder [56] interface with a glycan structure showing options to define anomericity and monosaccharide linkage position. KegDraw [57] interface with a glycan structure and available options to save the structure file in different formats.
KegDraw.
Glycan builders
Sweet II. sweet2/doc/index.php (Figure 9, top). This tool is available as a part of the glycosciences.de website, which also provides other options for analysing glycans in three-dimensional space. This program uses a glycan sequence in a standard format and generates a 3D model in the form of a .pdb file. The glycan input can come from a library of relevant oligosaccharides, available through one of the sub-menus. Alternatively, manual input is possible in three platforms adapted for increasing complexity. The model can be further minimised using MM2 [59] and MM3 [60] methods. The 3D models can be viewed using molecular viewers like JMol, WebMol-applet, Chemis3D-applet, etc. In addition, the program generates additional files which can be used for molecular mechanics and molecular dynamics with molecular modelling tools such as Tinker [61]. This is a versatile tool for generating 3D models of glycans.
GLYCAM-web (Carbohydrate Builder).
Carbohydrate builder [65] is an online tool (at http://glycam.org/) for carbohydrate structure drawing and subsequent 3D structure building. With a flexible interface, it offers three methods for glycan building. The first method is manual building (“Carbohydrate Builder” button). It allows the selection of monosaccharides, as well as the definition of linkages, branching and substitution (Figure 9, middle). The second method uses a template library (“Oligosaccharide libraries” button) containing commonly relevant structures (http://glycam.org/Pre-builtLibraries.jsp). The third option (direct input from a text sequence) becomes relevant when the glycan structure does not exist in the library or is challenging to build because of its structural complexity. In this case, a text for the oligosaccharide in GLYCAM-Web’s condensed notation can be entered as input to create the glycan structure. Once the glycan is generated, the options include the solvation of the structure and the manual input of the glycosidic linkages. The tool allows structure minimisation and generates rotamers, which can be visualised using the JSmol viewer. Information about the force field used to build the structure is also provided. The multiple structures can be downloaded compressed as .tar, .gz or .zip files containing .pdb files. Similarly, the 2D image can be saved in GIF format. GLYCAM-Web Carbohydrate Builder can be used to prepare the system for MD simulation, as it solvates the glycans and also generates the topology and coordinate files. In addition to its carbohydrate builder, GLYCAM-Web includes further tools such as a glycoprotein builder and a glycosaminoglycan (GAG) builder.
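To make the idea of text input more concrete, the following Python sketch splits a linear sequence written in the style of a condensed notation into its residues. It is only an illustration: the grammar encoded in the regular expression (configuration letter, three-letter sugar code, ring form, optional NAc substituent, anomer, linkage positions) is a simplified assumption that covers only a small subset of residues, the function name parse_linear_sequence is hypothetical, and the example strings are written for illustration rather than taken from the GLYCAM-Web documentation. Branched sequences are deliberately rejected.

```python
import re

# Simplified, assumed grammar for one residue, e.g. "DGlcpNAcb1-4" or "DManpa1-OH".
# This is NOT the full GLYCAM-Web grammar; it handles only a small subset.
RESIDUE = re.compile(
    r"(?P<config>[DL])"           # absolute configuration
    r"(?P<sugar>[A-Z][a-z]{2})"   # three-letter sugar code, e.g. Glc, Man, Gal
    r"(?P<ring>[pf])"             # pyranose or furanose
    r"(?P<substituent>NAc)?"      # optional N-acetyl group
    r"(?P<anomer>[ab])"           # alpha or beta
    r"(?P<from_pos>\d)-"          # anomeric carbon position
    r"(?P<to>\d|OH)"              # position on the next residue, or free OH
)


def parse_linear_sequence(sequence):
    """Return a list of residue dictionaries for a linear (unbranched) sequence."""
    residues = []
    pos = 0
    while pos < len(sequence):
        match = RESIDUE.match(sequence, pos)
        if match is None:
            # Brackets (branches) and unsupported residues end up here.
            raise ValueError(f"cannot parse {sequence!r} at position {pos}")
        residues.append(match.groupdict())
        pos = match.end()
    return residues


if __name__ == "__main__":
    for residue in parse_linear_sequence("DGlcpNAcb1-4DGlcpNAcb1-OH"):
        print(residue)
```

A parser of this kind is mainly useful for checking a sequence before submitting it to a builder; any production use would need the complete grammar, including branching and derivatives.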
CHARMM-GUI (Glycan Reader and Modeler).
doGlycans.
doGlycans [69] is a compilation of tools dedicated to preparing carbohydrate structures for atomistic simulations of glycoproteins, carbohydrate polymers and glycolipids using GROMACS [70,71]. The tools, provided as Python scripts, are used to prepare the system, which generally includes the processing of a .pdb file using the pdb2gmx tool. Subsequently, a glycosylation model can be prepared for carbohydrate polymer simulation using the prepreader.py script. Similarly, the doglycans.py script can be used to develop models for glycoproteins and glycolipids. Together, these tools are called the doGlycans toolset. Although doGlycans is highly flexible, it only uses the sugar units that are defined in GLYCAM. The topologies generated for glycosylated proteins and glycolipids are compatible with the OPLS [72] and AMBER [73] force fields. The topology for carbohydrate polymers is based on the GLYCAM force field. The user needs to provide the ceramide topology as input to generate the topologies for glycolipids. The tools contained in doGlycans create 3D models and simulation files as a starting point for more complex molecular simulation studies.
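The pdb2gmx preprocessing step mentioned above can be sketched as a small Python helper that assembles the GROMACS command line. This is a hedged illustration only: the helper names, the file names, and the choice of the oplsaa force field and tip3p water model are assumptions made for the example, and the sketch does not reproduce the doGlycans scripts or their options.

```python
import shutil
import subprocess


def build_pdb2gmx_command(pdb_path, prefix, force_field="oplsaa", water="tip3p"):
    """Assemble a 'gmx pdb2gmx' call that converts a .pdb file to GROMACS files.

    The default force field and water model are illustrative assumptions.
    """
    return [
        "gmx", "pdb2gmx",
        "-f", pdb_path,            # input structure
        "-o", f"{prefix}.gro",     # processed coordinates
        "-p", f"{prefix}.top",     # generated topology
        "-ff", force_field,        # force field selection
        "-water", water,           # water model selection
    ]


def run_if_available(command):
    """Run the command only when the executable is found on the PATH."""
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # "protein.pdb" is a placeholder file name.
    print(" ".join(build_pdb2gmx_command("protein.pdb", "protein")))
```

Building the argument list separately from running it keeps the command easy to inspect before any files are written.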
Figure 9:
From top to bottom: Sweet II [62] web interface with a text input to generate a 3D model. GLYCAM Carbohydrate Builder [63] interface, which accepts a text input for glycans and generates 3D models. CHARMM-GUI (Glycan Reader and Modeler) [64] interface with a 3D structure output generated using a glycan sequence as input.
Figure 10:
PolysGlycanBuilder [77] interface illustrating glycan drawing using SNFG symbols. The glycan can be further converted into a 3D model.
RosettaCarbohydrate.
PolysGlycanBuilder.
PolysGlycanBuilder [76] is a web-based tool (http://glycan-builder.cermav.cnrs.fr/) with an interactive and user-friendly interface (Figure 10). The software translates a glycan sequence or polysaccharide repeat unit into the coordinate set of the corresponding tertiary structure, in one or several of its low energy conformations. The construction follows an intuitive scheme which is as close as possible to the way glycoscientists draw the sequence of their structures. The simplest method for model building involves dragging and dropping monosaccharide units onto the canvas or workspace grid. The software displays rows of monosaccharides in the form of standard SNFG symbols with 3D information (furanose/pyranose shape, configuration, anomericity, and ring conformation). Glycosidic linkages can be easily defined through the values of the dihedral angles (Φ, Ψ, Ω), which can be set manually or extracted from a database of low energy conformations of 600 disaccharide segments. The monosaccharides have been subjected to geometry optimisation using a molecular mechanics approach. For a given input sequence, the corresponding 3D coordinates are generated in PDB format. During construction, the structure is displayed via LiteMol and eventually optimised to remove any steric clashes. The image of the glycan can be downloaded and saved in SVG format. Keeping the glycan/polysaccharide structure in text format (condensed IUPAC, GlycoCT, SNFG and INP) offers several ways to connect to other applications. In addition to the drag-and-drop method, PolysGlycanBuilder also accepts input files in INP, IUPAC and GlycoCT formats. An interactive interface accompanies the application, which makes it more versatile for glycan drawing and 3D model building.
Displaying 3D structures of glycans
The recently introduced 3D-Symbol Nomenclature for Glycans (3D-SNFG) [15] allows the representation of carbohydrates in an unusual way: the SNFG symbols are added to a three-dimensional structure. The 3D-SNFG script must be integrated into the Visual Molecular Dynamics (VMD) [21,78] viewer software to enable the representation of glycans as large SNFG-matching 3D shapes that can either replace the atomic representation of the monosaccharides or stay lodged at the geometric centre of the ring (Figure 11, top left). Upon the input of a glycan-containing structure (in PDB format), the integrated script in VMD automatically recognises the common monosaccharide names and generates the 3D shapes. The embedded script also provides keyboard shortcuts to switch quickly between large and small 3D-SNFG shapes and to label the reducing terminus. The 3D structure displayed in VMD can be saved as a .bmp image file. Thanks to 3D-SNFG, the standardised representation of glycan structures can finally take a step into 3D space. The images obtained can be very useful for the quick assessment of 3D glycan models.
In addition to the 3D-SNFG script, PaperChain and Twister [83] are two visualisation algorithms available with the Visual Molecular Dynamics (VMD) package. These algorithms are useful for visualising complex cyclic molecules and multi-branched polysaccharides. PaperChain displays the rings in a molecular structure as polygons and colours them according to the ring pucker. The other algorithm (Twister) traces glycosidic bonds in a ribbon representation that twists and changes its orientation according to the relative position of successive sugar residues, and hence provides important conformational detail in polysaccharides. Combining these algorithms with other visualisation features available in VMD can enhance the flexibility of displaying structural details of glycoconjugate, glycoprotein and cyclic structures.
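The two steps that 3D-SNFG relies on, recognising monosaccharides by their PDB residue names and locating the geometric centre of each ring, can be illustrated with a short Python sketch. It is not the 3D-SNFG script itself, which runs inside VMD. The lookup table is a deliberately partial selection of common residue names paired with their 2D SNFG symbols, the ring is assumed to be a pyranose with atoms named C1 to C5 and O5, and the names SNFG_BY_RESNAME and ring_centres are hypothetical.

```python
from collections import defaultdict

# Partial, illustrative lookup: PDB residue name -> (monosaccharide, SNFG symbol, colour).
SNFG_BY_RESNAME = {
    "GLC": ("Glc", "circle", "blue"),
    "BGC": ("Glc", "circle", "blue"),
    "MAN": ("Man", "circle", "green"),
    "BMA": ("Man", "circle", "green"),
    "GAL": ("Gal", "circle", "yellow"),
    "GLA": ("Gal", "circle", "yellow"),
    "NAG": ("GlcNAc", "square", "blue"),
    "NDG": ("GlcNAc", "square", "blue"),
    "FUC": ("Fuc", "triangle", "red"),
}

# Assumed atom names of a pyranose ring.
PYRANOSE_RING_ATOMS = {"C1", "C2", "C3", "C4", "C5", "O5"}


def ring_centres(pdb_text):
    """Map (chain, residue number, residue name) to the ring's geometric centre."""
    coords = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom_name = line[12:16].strip()   # PDB columns 13-16
        res_name = line[17:20].strip()    # PDB columns 18-20
        if res_name in SNFG_BY_RESNAME and atom_name in PYRANOSE_RING_ATOMS:
            key = (line[21], int(line[22:26]), res_name)
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            coords[key].append(xyz)
    centres = {}
    for key, points in coords.items():
        if len(points) == 6:  # skip residues with an incomplete ring
            centres[key] = tuple(sum(axis) / 6 for axis in zip(*points))
    return centres
```

A viewer would then draw the symbol returned by the lookup at the computed centre. Residues whose names are missing from the table, or misannotated in the file, receive no symbol in this sketch.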
LiteMol.
The LiteMol [22] viewer is a freely available web application (Figure 11, top right) for 3D visualisation of macromolecules and other related data. LiteMol enables standard visualisation of macromolecules in different representation modes such as surface, cartoons, ball-and-stick, etc. The software can be accessed at v.litemol.org and is also available for integration in a webpage from GitHub (https://github.com/dsehnal/LiteMol). LiteMol is compatible with all modern browsers without the support of additional plugins. The viewer automatically detects any carbohydrate residues and displays 3D structures of carbohydrates with 3D-SNFG symbols, which allows the user to identify the monosaccharides readily. The presented structure can be saved as a .png image file. Any monosaccharide with a residue name in PDB can be visualised using 3D-SNFG in LiteMol. However, a significant portion of the carbohydrates may contain some form of annotation error, which would result in either no symbol or an incorrect symbol. Although LiteMol is an efficient and rapid 3D viewer for glycans, the 3D representation does not provide any information about the glycosidic linkage type (e.g. α, β).
PyMOL-Azahar plugin.
UnityMol/SweetUnityMol.
Sweet UnityMol [32] is a molecular structure viewer (Figure 11, middle) developed from the game engine Unity3D. The software is available for free download (https://sourceforge.net/projects/unitymol/files/UnityMol_1.0.37/) from the SourceForge project website. It can be installed on Mac, Windows and Linux platforms. The program reads files in PDB, mmCIF, Mol2, GRO, XYZ, and SDF formats, as well as OpenDX potential maps and XTC trajectory files. It efficiently displays specific structural features of carbohydrate-containing biomolecules, from the simplest to the most complex. Sweet UnityMol displays 3D carbohydrate structures with different modes of representation, such as liquorice, ball-and-stick, hyperBalls, RingBlending, hydrophilic/hydrophobic character of the sugar faces, etc. The most recent version is fully compatible with the SNFG colour coding and also uses the accepted pictorial representations generally used in carbohydrate chemistry, biochemistry and glycobiology.
Figure 11:
From top to bottom: 3D-SNFG representation of a glycan using the 3D-SNFG script integrated in VMD [79]. LiteMol [80] interface with 3D-SNFG representation of a glycan in a protein–glycan complex. SweetUnityMol [81]: among the several types of representation, a ribbon-like display of polysaccharides maintains the SNFG colour coding of monosaccharides. UnityMol [82] within an immersive virtual reality context.
SweetUnityMol provides a continuum from the conventional ways of depicting the primary structures of complex carbohydrates all the way to visualising their 3D structures. Several options are offered to the user to select the most relevant type of depiction, including new features such as the “Coarse-Grain” representation, while keeping the option to display the details of the atomic representation. Powerful rendering methods produce high-quality images of molecular structures, bio-macromolecular surfaces and molecular interactions.
A recently developed version of UnityMol has been implemented in an immersive virtual reality context using head-mounted displays [87]. It offers high-quality visual representations, ease of interaction with multiple molecular objects, and powerful tools for visual manipulation, accompanied by the evaluation of intermolecular interactions. Consequently, simultaneous investigations of multiple objects, such as macromolecular interactions, gain in efficiency and accuracy (Figure 11, bottom).
Conclusion
The set of computational tools presented above illustrates the rich contributions of a community devoted to enabling the accurate representation of complex carbohydrates via the development and implementation of a versatile informatics toolbox. These efforts aim at facilitating communication within the scientific community. To establish a comparative analysis of the several available applications, we evaluated 17 selected items that best characterise their availability, implementation, maintenance and field of use. The comparative analysis of tools could be useful for glycobiologists or any researcher looking for a ready-to-use, simple application for the sketching, building and display of glycans.
This article provides an overview of the computational tools and resources available for sketching, building and representing glycans. It also provides a descriptive analysis of the recently developed software tools dedicated explicitly to glycans and glycoconjugates. The newly developed tools are more advanced and use the standard nomenclature and symbols for glycan representation. These tools can further help to standardise the description of glycans in research, communication and databases.
Supporting Information
Supporting Information File 1
Acknowledgements
Appreciation is extended to Drs. A. Imberty, A. Varrot, L. Belvisi and A. Bernardi for their support.
Funding
This research was performed within the framework of the PhD4GlycoDrug Innovative Training Network and was funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 765581. The work was supported by the Cross-Disciplinary Program Glyco@Alps, within the framework of the “Investissement d’Avenir” program [ANR-15IDEX-02].
ORCID® iDs
Kanhaya Lal - https://orcid.org/0000-0001-8555-7948
Rafael Bermeo - https://orcid.org/0000-0002-4451-878X
Serge Perez - https://orcid.org/0000-0003-3464-5352
References
1. Alocci, D.; Lisacek, F.; Perez, S.
A Traveler’s Guide to Complex Carbohydrates in the Cyber Space
Development of Carbohydrate Nomenclature and Representation;
Springer, 2017; pp 7–25. doi:10.1007/978-4-431-56454-6_2
3. Kornfeld, S.; Li, E.; Tabas, I.
J. Biol. Chem.
Curr. Protoc. Protein Sci.
Proteomics Proteomics Glycobiology
Glycobiology
Nucleic Acids Res.
W267–W272. doi:10.1093/nar/gkh473
Source Code Biol. Med. No. 3. doi:10.1186/1751-0473-2-3
11. Damerell, D.; Ceroni, A.; Maass, K.; Ranzinger, R.; Dell, A.; Haslam, S. M. Annotation of Glycomics MS and MS/MS Spectra Using the GlycoWorkbench Software Tool. In Glycoinformatics. Methods in Molecular Biology; Lütteke, T.; Frank, M., Eds.; Humana Press: New York, NY, 2015; Vol. 1273, pp 3–15. doi:10.1007/978-1-4939-2343-4_1
12. Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E.
Nucleic Acids Res.
Biopolymers ř eková, R.; Toukach, P.; Lisacek, F. Molecules
Glycobiology
Adv. Carbohydr. Chem. Biochem.
OMICS
Glycobiology
Glycoinformatics. Methods in Molecular Biology; Lütteke, T.; Frank, M., Eds.; Humana Press: New York, NY, 2015; Vol. 1273, pp 161–179. doi:10.1007/978-1-4939-2343-4_12
20. Damerell, D.; Ceroni, A.; Maass, K.; Ranzinger, R.; Dell, A.; Haslam, S. M.
Biol. Chem.
J. Mol. Graphics
J. Proteome Res.
Wiley Interdiscip. Rev.: Comput. Mol. Sci. J. Chem. Theory Comput. J. Chem. Theory Comput. J. Comput. Chem.
J. Comput. Chem.
J. Phys. Chem. B
Carbohydr. Res.
J. Chem. Inf. Model.
Glycobiology
Glycobiology
Nat. Methods
Glycobiology
Glycobiology
36. Sugar Sketcher. https://glycoproteome.expasy.org/sugarsketcher/ (accessed April 2020).
37. GlycoGlyph. https://glycotoolkit.com/Tools/GlycoGlyph/ (accessed April 2020).
38. GlyTouCan. https://glytoucan.org/ (accessed April 2020).
39. Mehta, A. Y.; Cummings, R. D.
Bioinformatics
Carbohydr. Res.
Nucleic Acids Res.
D1229–D1236. doi:10.1093/nar/gkv840
42. Kikuchi, N.; Kameyama, A.; Nakaya, S.; Ito, H.; Sato, T.; Shikanai, T.; Takahashi, Y.; Narimatsu, H.
Bioinformatics
Trends Biochem. Sci.
Trends Glycosci. Glycotechnol.
Nucleic Acids Res.
GlycanBuilder2
47. SugarBind GlycanBuilder. https://sugarbind.expasy.org/builder (accessed April 2020).
48. DrawRINGS
J. Proteome Res. Nucleic Acids Res.
D1243–D1250. doi:10.1093/nar/gkv1247
51. DrawGlycan-SNFG
52. Glycano. http://glycano.cs.uct.ac.za/ (accessed April 2020).
53. GlycoEditor. https://jcggdb.jp/idb/flash/GlycoEditor.jsp (accessed April 2020).
54. Cheng, K.; Pawlowski, G.; Yu, X.; Zhou, Y.; Neelamegham, S.
Bioinformatics. doi:10.1093/bioinformatics/btz819
Essentials of Glycobiology [Internet].
56. Glyco.me SugarBuilder. https://beta.glyco.me/sugarbuilder (accessed April 2020).
57. KegDraw.
Bioinformatics
J. Am. Chem. Soc.
J. Am. Chem. Soc.
J. Chem. Theory Comput.
Sweet.
63. GLYCAM Web. (2005–2020) Complex Carbohydrate Research Center, University of Georgia, Athens, GA. (http://glycam.org).
64. CHARMM-GUI Glycan Reader & Modeler
J. Comput. Chem.
J. Comput. Chem.
Glycobiology
Nucleic Acids Res.
D470–D474. doi:10.1093/nar/gks987
69. Danne, R.; Poojari, C.; Martinez-Seara, H.; Rissanen, S.; Lolicato, F.; Róg, T.; Vattulainen, I.
J. Chem. Inf. Model.
J. Chem. Theory Comput. J. Comput. Chem.
J. Chem. Theory Comput.
AMBER 2018;
University of California: San Francisco, 2018.
74. Labonte, J. W.; Adolf-Bryfogle, J.; Schief, W. R.; Gray, J. J.
J. Comput. Chem.
Structure
Methods in Molecular Biology, Glycoinformatics,Methods and Protocols.
77. PolysGlycanBuilder. http://glycan-builder.cermav.cnrs.fr/ (accessed April 2020).
78. Kuttel, M.; Gain, J.; Burger, A.; Eborn, I.
J. Mol. Graphics Modell.
Downloaded from http://glycam.org/3d-snfg (accessed April 2020).
80. LiteMol. https://v.litemol.org/ (accessed April 2020).
81. SweetUnityMol. https://sourceforge.net/projects/unitymol/files/OtherVersions/UnityMol-r676-SweetUnityMol/ (accessed April 2020).
82. UnityMol. https://sourceforge.net/projects/unitymol/files/ (accessed April 2020).
83. Cross, S.; Kuttel, M. M.; Stone, J. E.; Gain, J. E.
J. Mol. Graphics Modell.
J. Comput.-Aided Mol. Des.
PyMOL: An open-source molecular graphics tool;
Proc. Natl. Acad. Sci. U. S. A.
Biochem. Soc. Trans.
License and Terms
This is an Open Access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0). Please note that the reuse, redistribution and reproduction in particular requires that the authors and source are credited. The license is subject to the