Publication


Featured research published by Stefan Canzar.


Bioinformatics | 2013

GAGE-B: an evaluation of genome assemblers for bacterial organisms.

Tanja Magoc; Stephan Pabinger; Stefan Canzar; Xinyue Liu; Qi Su; Daniela Puiu; Luke J. Tallon

Motivation: A large and rapidly growing number of bacterial organisms have been sequenced by the newest sequencing technologies. Cheaper and faster sequencing technologies make it easy to generate very high coverage of bacterial genomes, but these advances mean that DNA preparation costs can exceed the cost of sequencing for small genomes. The need to contain costs often results in the creation of only a single sequencing library, which in turn introduces new challenges for genome assembly methods. Results: We evaluated the ability of multiple genome assembly programs to assemble bacterial genomes from a single, deep-coverage library. For our comparison, we chose bacterial species spanning a wide range of GC content and measured the contiguity and accuracy of the resulting assemblies. We compared the assemblies produced by this very high-coverage, one-library strategy to the best assemblies created by two-library sequencing, and we found that remarkably good bacterial assemblies are possible with just one library. We also measured the effect of read length and depth of coverage on assembly quality and determined the values that provide the best results with current algorithms. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Journal of Computational Biology | 2013

Charge Group Partitioning in Biomolecular Simulation

Stefan Canzar; Mohammed El-Kebir; René Pool; Khaled M. Elbassioni; Alan E. Mark; Daan P. Geerke; Leen Stougie; Gunnar W. Klau

Molecular simulation techniques are increasingly being used to study biomolecular systems at an atomic level. Such simulations rely on empirical force fields to represent the intermolecular interactions. There are many different force fields available--each based on a different set of assumptions and thus requiring different parametrization procedures. Recently, efforts have been made to fully automate the assignment of force-field parameters, including atomic partial charges, for novel molecules. In this work, we focus on a problem arising in the automated parametrization of molecules for use in combination with the GROMOS family of force fields: namely, the assignment of atoms to charge groups such that for every charge group the sum of the partial charges is ideally equal to its formal charge. In addition, charge groups are required to have size at most k. We show NP-hardness and give an exact algorithm that solves practical problem instances to provable optimality in a fraction of a second.
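The partitioning problem described in this abstract can be illustrated with a small brute-force sketch. This is not the exact algorithm from the paper: it simply enumerates every partition of the atoms into groups of size at most k and keeps the one with the smallest total deviation between the summed partial charges and the summed formal charges. The function names, the absolute-deviation objective, and the toy charges are assumptions made for illustration, and any requirement that atoms in a group be bonded to each other is ignored.

```python
from itertools import combinations


def partitions(items, k):
    """Yield every partition of `items` into groups of size at most k."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    # The group containing `first` takes up to k - 1 companions from `rest`.
    for r in range(min(k - 1, len(rest)) + 1):
        for companions in combinations(rest, r):
            remaining = [x for x in rest if x not in companions]
            for tail in partitions(remaining, k):
                yield [(first, *companions)] + tail


def best_charge_groups(partial, formal, k):
    """Return the partition minimizing total |partial sum - formal sum| (assumed objective)."""
    atoms = list(range(len(partial)))

    def error(groups):
        return sum(
            abs(sum(partial[a] for a in g) - sum(formal[a] for a in g))
            for g in groups
        )

    return min(partitions(atoms, k), key=error)


# Hypothetical four-atom molecule, all formal charges zero, groups of size <= 2.
print(best_charge_groups([0.4, -0.4, 0.25, -0.25], [0, 0, 0, 0], 2))
# → [(0, 1), (2, 3)]
```

Exhaustive enumeration grows very quickly with the number of atoms, which is consistent with the NP-hardness result mentioned in the abstract; the sketch is only meant to make the objective concrete.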


Cell | 2017

Temporal control of mammalian cortical neurogenesis by m6A methylation

Ki Jun Yoon; Francisca Rojas Ringeling; Caroline Vissers; Fadi Jacob; Michael Pokrass; Dennisse Jimenez-Cyrus; Yijing Su; Nam Shik Kim; Yunhua Zhu; Lily Zheng; Sunghan Kim; Xinyuan Wang; Louis C. Doré; Peng Jin; Sergi Regot; Xiaoxi Zhuang; Stefan Canzar; Chuan He; Guo Li Ming; Hongjun Song

N6-methyladenosine (m6A), installed by the Mettl3/Mettl14 methyltransferase complex, is the most prevalent internal mRNA modification. Whether m6A regulates mammalian brain development is unknown. Here, we show that m6A depletion by Mettl14 knockout in embryonic mouse brains prolongs the cell cycle of radial glia cells and extends cortical neurogenesis into postnatal stages. m6A depletion by Mettl3 knockdown also leads to a prolonged cell cycle and maintenance of radial glia cells. m6A sequencing of embryonic mouse cortex reveals enrichment of mRNAs related to transcription factors, neurogenesis, the cell cycle, and neuronal differentiation, and m6A tagging promotes their decay. Further analysis uncovers previously unappreciated transcriptional prepatterning in cortical neural stem cells. m6A signaling also regulates human cortical neurogenesis in forebrain organoids. Comparison of m6A-mRNA landscapes between mouse and human cortical neurogenesis reveals enrichment of human-specific m6A tagging of transcripts related to brain-disorder risk genes. Our study identifies an epitranscriptomic mechanism in heightened transcriptional coordination during mammalian cortical neurogenesis.


Bioinformatics | 2012

CLEVER: clique-enumerating variant finder

Tobias Marschall; Ivan G. Costa; Stefan Canzar; Markus Bauer; Gunnar W. Klau; Alexander Schliep; Alexander Schönhuth

Motivation: Next-generation sequencing techniques have facilitated a large-scale analysis of human genetic variation. Despite the advances in sequencing speed, the computational discovery of structural variants is not yet standard. It is likely that many variants have remained undiscovered in most sequenced individuals. Results: Here, we present a novel internal segment size based approach, which organizes all, including concordant, reads into a read alignment graph, where max-cliques represent maximal contradiction-free groups of alignments. A novel algorithm then enumerates all max-cliques and statistically evaluates them for their potential to reflect insertions or deletions. For the first time in the literature, we compare a large range of state-of-the-art approaches using simulated Illumina reads from a fully annotated genome and present relevant performance statistics. We achieve superior performance, in particular, for deletions or insertions (indels) of length 20-100 nt. This has been previously identified as a remaining major challenge in structural variation discovery, in particular, for insert size based approaches. In this size range, we even outperform split-read aligners. We achieve competitive results also on biological data, where our method is the only one to make a substantial amount of correct predictions, which, additionally, are disjoint from those by split-read aligners. Availability: CLEVER is open source (GPL) and available from http://clever-sv.googlecode.com. Contact: [email protected] or [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.
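To make the notion of maximal cliques in a read alignment graph concrete, here is a minimal sketch that enumerates them with the textbook Bron-Kerbosch recursion. This is a swapped-in standard technique, not CLEVER's own enumeration algorithm, and the tiny graph (alignments `a` to `d`, with edges standing for "mutually compatible") is a hypothetical example.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques of an undirected graph (basic Bron-Kerbosch).

    `adj` maps each vertex to the set of its neighbours.
    """
    cliques = []

    def expand(current, candidates, excluded):
        # No vertex can extend the clique and none was excluded: it is maximal.
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical alignment graph: a, b, c are pairwise compatible; d is compatible only with c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
for clique in maximal_cliques(graph):
    print(sorted(clique))
```

In this toy graph the two maximal cliques are {a, b, c} and {c, d}; in the setting of the abstract, each such group would then be evaluated statistically for evidence of an insertion or deletion.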


BMC Bioinformatics | 2010

Computing H/D-Exchange rates of single residues from data of proteolytic fragments

Ernst Althaus; Stefan Canzar; Carsten Ehrler; Mark R. Emmett; Andreas Karrenbauer; Alan G. Marshall; Anke Meyer-Bäse; Jeremiah D. Tipton; Hui Min Zhang

Background: Protein conformation and protein/protein interaction can be elucidated by solution-phase Hydrogen/Deuterium exchange (sHDX) coupled to high-resolution mass analysis of the digested protein or protein complex. In sHDX experiments mutant proteins are compared to wild-type proteins or a ligand is added to the protein and compared to the wild-type protein (or mutant). The number of deuteriums incorporated into the polypeptides generated from the protease digest of the protein is related to the solvent accessibility of amide protons within the original protein construct. Results: In this work, sHDX data was collected on a 14.5 T FT-ICR MS. An algorithm was developed based on combinatorial optimization that predicts deuterium exchange with high spatial resolution based on the sHDX data of overlapping proteolytic fragments. Often the algorithm assigns deuterium exchange with single residue resolution. Conclusions: With our new method it is possible to automatically determine deuterium exchange with higher spatial resolution than the level of digested fragments.


ACM Symposium on Applied Computing | 2008

Computing H/D-exchange speeds of single residues from data of peptic fragments

Ernst Althaus; Stefan Canzar; Mark R. Emmett; Andreas Karrenbauer; Alan G. Marshall; Anke Meyer-Baese; Hui-Min Zhang

Determining the hydrogen-deuterium exchange speeds of single residues from data for peptic fragments obtained by FT-ICR MS is currently mainly done by manual interpretation. We provide an automated method based on combinatorial optimization. More precisely, we present an algorithm that enumerates all possible exchange speeds for single residues that explain the observed data of the peptic fragments.


Bioinformatics | 2016

BASIC: BCR assembly from single cells

Stefan Canzar; Karlynn E. Neu; Qingming Tang; Patrick C. Wilson; Aly A. Khan

Motivation: The B‐cell receptor enables individual B cells to identify diverse antigens, including bacterial and viral proteins. While advances in RNA‐sequencing (RNA‐seq) have enabled high throughput profiling of transcript expression in single cells, the unique task of assembling the full‐length heavy and light chain sequences from single cell RNA‐seq (scRNA‐seq) in B cells has been largely unstudied. Results: We developed a new software tool, BASIC, which allows investigators to use scRNA‐seq for assembling BCR sequences at single‐cell resolution. To demonstrate the utility of our software, we subjected nearly 200 single human B cells to scRNA‐seq, assembled the full‐length heavy and the light chains, and experimentally confirmed these results by using single‐cell primer‐based nested PCRs and Sanger sequencing. Availability and Implementation: http://ttic.uchicago.edu/~aakhan/BASIC Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Scandinavian Workshop on Algorithm Theory | 2008

Approximating the Interval Constrained Coloring Problem

Ernst Althaus; Stefan Canzar; Khaled M. Elbassioni; Andreas Karrenbauer; Julián Mestre

We consider the interval constrained coloring problem, which appears in the interpretation of experimental data in biochemistry. Monitoring hydrogen-deuterium exchange rates via mass spectroscopy experiments is a method used to obtain information about protein tertiary structure. The output of these experiments provides data about the exchange rate of residues in overlapping segments of the protein backbone. These segments must be re-assembled in order to obtain a global picture of the protein structure. The interval constrained coloring problem is the mathematical abstraction of this re-assembly process. The objective of the interval constrained coloring problem is to assign a color (exchange rate) to a set of integers (protein residues) such that a set of constraints is satisfied. Each constraint is made up of a closed interval (protein segment) and requirements on the number of elements that belong to each color class (exchange rates observed in the experiments). We show that the problem is NP-complete for an arbitrary number of colors and we provide algorithms that given a feasible instance find a coloring that satisfies all the coloring requirements within ±1 of the prescribed value. In light of our first result, this is essentially the best one can hope for. Our approach is based on polyhedral theory and randomized rounding techniques. Furthermore, we develop a quasi-polynomial-time approximation scheme for a variant of our problem where we are asked to find a coloring satisfying as many fragments as possible.
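The problem statement in this abstract can be made concrete with a short sketch. It checks how far a coloring deviates from the per-interval color counts and finds a feasible coloring by exhaustive search. This is only an illustration of the problem definition, not the polyhedral or randomized-rounding approach of the paper; the function names, the 0-based residue indices, and the toy instance are assumptions.

```python
from collections import Counter
from itertools import product


def violation(coloring, constraints):
    """Largest gap between observed and required color counts over all intervals."""
    worst = 0
    for (lo, hi), required in constraints:
        counts = Counter(coloring[lo:hi + 1])  # closed interval [lo, hi]
        for color, need in required.items():
            worst = max(worst, abs(counts[color] - need))
    return worst


def find_coloring(n, colors, constraints):
    """Brute-force search for a coloring of n residues meeting every requirement exactly."""
    for coloring in product(colors, repeat=n):
        if violation(coloring, constraints) == 0:
            return coloring
    return None


# Hypothetical instance: 4 residues, two exchange-rate classes, three overlapping segments.
constraints = [
    ((0, 1), {"slow": 1, "fast": 1}),
    ((1, 3), {"slow": 1, "fast": 2}),
    ((0, 3), {"slow": 2, "fast": 2}),
]
print(find_coloring(4, ("slow", "fast"), constraints))
```

The `violation` function returns 0 for a coloring that satisfies every requirement; a value of 1 corresponds to the ±1 guarantee mentioned in the abstract.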


Genome Biology | 2016

CIDANE: comprehensive isoform discovery and abundance estimation

Stefan Canzar; Sandro Andreotti; David Weese; Knut Reinert; Gunnar W. Klau

We present CIDANE, a novel framework for genome-based transcript reconstruction and quantification from RNA-seq reads. CIDANE assembles transcripts efficiently with significantly higher sensitivity and precision than existing tools. Its algorithmic core not only reconstructs transcripts ab initio, but also allows the use of the growing annotation of known splice sites, transcription start and end sites, or full-length transcripts, which are available for most model organisms. CIDANE supports the integrated analysis of RNA-seq and additional gene-boundary data and recovers splice junctions that are invisible to other methods. CIDANE is available at http://ccb.jhu.edu/software/cidane/.


Proceedings of the IEEE | 2017

Short Read Mapping: An Algorithmic Tour

Stefan Canzar

Ultra-high-throughput next-generation sequencing (NGS) technology allows us to determine the sequence of nucleotides of many millions of DNA molecules in parallel. Accompanied by a dramatic reduction in cost since its introduction in 2004, NGS technology has provided a new way of addressing a wide range of biological and biomedical questions, from the study of human genetic disease to the analysis of gene expression, protein–DNA interactions, and patterns of DNA methylation. The data generated by NGS instruments comprise huge numbers of very short DNA sequences, or “reads,” that carry little information by themselves. These reads therefore have to be pieced together by well-engineered algorithms to reconstruct biologically meaningful measurements, such as the level of expression of a gene. To solve this complex, high-dimensional puzzle, reads must be mapped back to a reference genome to determine their origin. Due to sequencing errors and to genuine differences between the reference genome and the individual being sequenced, this mapping process must be tolerant of mismatches, insertions, and deletions. Although optimal alignment algorithms to solve this problem have long been available, the practical requirements of aligning hundreds of millions of short reads to the 3-billion-base-pair-long human genome have stimulated the development of new, more efficient methods, which today are used routinely throughout the world for the analysis of NGS data.
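The mapping step tolerant of mismatches, insertions, and deletions can be sketched with the classical dynamic-programming formulation of approximate string matching. This is a minimal illustration of the long-available optimal alignment approach the abstract alludes to, not one of the efficient index-based methods it surveys; the function name, unit edit costs, and toy sequences are assumptions.

```python
def best_match(read, reference):
    """Return (edit distance, end position) of the best placement of `read` in `reference`.

    The whole read must be aligned, but it may start and end anywhere in the
    reference, so the first row of the table is all zeros.
    """
    prev = [0] * (len(reference) + 1)
    for i in range(1, len(read) + 1):
        cur = [i] + [0] * len(reference)
        for j in range(1, len(reference) + 1):
            mismatch = 0 if read[i - 1] == reference[j - 1] else 1
            cur[j] = min(
                prev[j - 1] + mismatch,  # match or mismatch
                prev[j] + 1,             # read base not present in the reference
                cur[j - 1] + 1,          # reference base skipped by the read
            )
        prev = cur
    end = min(range(len(prev)), key=prev.__getitem__)
    return prev[end], end


print(best_match("ACGT", "TTACGTTT"))  # → (0, 6)
print(best_match("ACCT", "TTACGTTT")[0])  # → 1
```

The running time is proportional to the product of the read and reference lengths, which is why this direct approach does not scale to hundreds of millions of reads against a 3-billion-base-pair genome.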

Collaboration


Dive into Stefan Canzar's collaborations.

Top Co-Authors


Mark R. Emmett

Florida State University
