Clemens Gröpl | Researchain

Publication

Featured research published by Clemens Gröpl.

BMC Bioinformatics | 2008

OpenMS – An open-source software framework for mass spectrometry

Marc Sturm; Andreas Bertsch; Clemens Gröpl; Andreas Hildebrandt; Rene Hussong; Eva Lange; Nico Pfeifer; Ole Schulz-Trieglaff; Alexandra Zerck; Knut Reinert; Oliver Kohlbacher

Background: Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups yielding ever increasing amounts of data. Consequently, analysis of the data is currently often the bottleneck for experimental studies. Although software tools for many data analysis tasks are available today, they are often hard to combine with each other or not flexible enough to allow for rapid prototyping of a new analysis workflow. Results: We present OpenMS, a software framework for rapid application development in mass spectrometry. OpenMS has been designed to be portable, easy-to-use and robust while offering a rich functionality ranging from basic data structures to sophisticated algorithms for data analysis. This has already been demonstrated in several studies. Conclusion: OpenMS is available under the Lesser GNU Public License (LGPL) from the project website at http://www.openms.de.

Pacific Symposium on Biocomputing | 2005

High-accuracy peak picking of proteomics data using wavelet techniques.

Eva Lange; Clemens Gröpl; Knut Reinert; Oliver Kohlbacher; Andreas Hildebrandt

A new peak picking algorithm for the analysis of mass spectrometric (MS) data is presented. It is independent of the underlying machine or ionization method, and is able to resolve highly convoluted and asymmetric signals. The method uses the multiscale nature of spectrometric data by first detecting the mass peaks in the wavelet-transformed signal before a given asymmetric peak function is fitted to the raw data. In an optional third stage, the resulting fit can be further improved using techniques from nonlinear optimization. In contrast to currently established techniques (e.g. SNAP, Apex) our algorithm is able to separate overlapping peaks of multiply charged peptides in ESI-MS data of low resolution. Its improved accuracy with respect to peak positions makes it a valuable preprocessing method for MS-based identification and quantification experiments. The method has been validated on a number of different annotated test cases, where it compares favorably in both runtime and accuracy with currently established techniques. An implementation of the algorithm is freely available in our open source framework OpenMS.

Intelligent Systems in Molecular Biology | 2007

A geometric approach for the alignment of liquid chromatography—mass spectrometry data

Eva Lange; Clemens Gröpl; Ole Schulz-Trieglaff; Andreas Leinenbach; Christian G. Huber; Knut Reinert

MOTIVATION Liquid chromatography coupled to mass spectrometry (LC-MS) and combined with tandem mass spectrometry (LC-MS/MS) have become a prominent tool for the analysis of complex proteomic samples. An important step in a typical workflow is the combination of results from multiple LC-MS experiments to improve confidence in the obtained measurements or to compare results from different samples. To do so, a suitable mapping or alignment between the data sets needs to be estimated. The alignment has to correct for variations in mass and elution time which are present in all mass spectrometry experiments. RESULTS We propose a novel algorithm to align LC-MS samples and to match corresponding ion species across samples. Our algorithm matches landmark signals between two data sets using a geometric technique based on pose clustering. Variations in mass and retention time are corrected by an affine dewarping function estimated from matched landmarks. We use the pairwise dewarping in an algorithm for aligning multiple samples. We show that our pose clustering approach is fast and reliable as compared to previous approaches. It is robust in the presence of noise and able to accurately align samples with only a few common ion species. In addition, we can easily handle different kinds of LC-MS data and adapt our algorithm to new mass spectrometry technologies. AVAILABILITY This algorithm is implemented as part of the OpenMS software library for shotgun proteomics and available under the Lesser GNU Public License (LGPL) at www.openms.de.
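
The idea of pose clustering can be illustrated with a deliberately simplified sketch. It is not the OpenMS implementation: it works on one dimension only (retention time), whereas the abstract describes landmarks in mass and retention time, and the function name, bin widths and data below are assumptions made for illustration. Every pair of landmarks in one map is matched against every pair in the other, each match proposes an affine map y = a*x + b, and the proposal that collects the most votes is taken as the estimate.

```python
from collections import Counter
from itertools import combinations


def pose_cluster_1d(ref, other, a_bin=0.05, b_bin=1.0):
    """Estimate (a, b) with other ~= a * ref + b by voting over landmark pairs.

    Hypothetical helper for illustration; a_bin and b_bin are the widths of
    the voting cells for the scale a and the shift b.
    """
    votes = Counter()
    for x1, x2 in combinations(sorted(ref), 2):
        if x2 == x1:
            continue  # a degenerate pair proposes no transformation
        for y1, y2 in combinations(sorted(other), 2):
            a = (y2 - y1) / (x2 - x1)   # scale proposed by this pair match
            b = y1 - a * x1             # shift proposed by this pair match
            votes[(round(a / a_bin), round(b / b_bin))] += 1
    (ka, kb), _ = votes.most_common(1)[0]
    return ka * a_bin, kb * b_bin       # centre of the winning cell


# Made-up retention times: the second map is a scaled and shifted copy of
# the first, plus one spurious signal with no counterpart.
ref_rt = [10.0, 23.0, 31.0, 47.0, 60.0]
other_rt = [1.1 * x + 3.0 for x in ref_rt] + [40.0]
a, b = pose_cluster_1d(ref_rt, other_rt)
```

Correct pair matches all vote for the same cell, while wrong matches scatter, which is why the vote tolerates noise and maps that share only some of their signals. The quadratic number of pairs on each side makes this naive version suitable only for small landmark sets.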

Archive | 2001

Approximation Algorithms for the Steiner Tree Problem in Graphs

Clemens Gröpl; Stefan Hougardy; Till Nierhoff; Hans Jürgen Prömel

Given a graph G = (V, E), a set R ⊆ V, and a length function on the edges, a Steiner tree is a connected subgraph of G that spans all vertices in R. (It might use vertices in V \ R as well.) The Steiner tree problem in graphs is to find a shortest Steiner tree, i.e., a Steiner tree whose total edge length is minimum. This problem is well known to be NP-hard [19] and therefore we cannot expect to find polynomial time algorithms for solving it exactly. This motivates the search for good approximation algorithms for the Steiner tree problem in graphs, i.e., algorithms that have polynomial running time and return solutions that are not far from an optimum solution.
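
As a concrete example of such an approximation algorithm, the sketch below implements the classical minimum spanning tree heuristic: compute shortest-path distances between the vertices of R, build a minimum spanning tree on those distances, and replace each tree edge by the corresponding shortest path in G. The result has length at most twice the optimum. This is a textbook technique shown under simple assumptions (a connected, undirected graph stored as a dict of dicts, with comparable vertex labels); the function names are illustrative, and the abstract does not say which algorithms the chapter itself covers.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances and predecessors from source."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def steiner_mst_heuristic(graph, terminals):
    """Return (edges, total_length) of a connected subgraph spanning terminals.

    Assumes the graph is connected. The union of shortest paths may contain
    a cycle in general; dropping cycle edges then yields a Steiner tree that
    is no longer than the returned subgraph.
    """
    terminals = list(terminals)
    paths = {t: dijkstra(graph, t) for t in terminals}
    in_tree = {terminals[0]}
    edges = set()
    # Prim's algorithm on the shortest-path distances between terminals.
    while len(in_tree) < len(terminals):
        _, u, v = min(
            (paths[u][0][v], u, v)
            for u in in_tree
            for v in terminals
            if v not in in_tree
        )
        # Expand the chosen terminal-to-terminal edge into a path in G.
        node = v
        while node != u:
            parent = paths[u][1][node]
            edges.add(frozenset((node, parent)))
            node = parent
        in_tree.add(v)
    total = sum(graph[a][b] for a, b in (tuple(e) for e in edges))
    return edges, total


# Three terminals a, b, c around a non-terminal hub s (made-up lengths).
graph = {
    "s": {"a": 1.0, "b": 1.0, "c": 1.0},
    "a": {"s": 1.0, "b": 1.9, "c": 1.9},
    "b": {"s": 1.0, "a": 1.9, "c": 1.9},
    "c": {"s": 1.0, "a": 1.9, "b": 1.9},
}
edges, total = steiner_mst_heuristic(graph, ["a", "b", "c"])
```

On this small graph the heuristic picks two direct edges of length 1.9 (total 3.8), while the optimal Steiner tree routes all three terminals through s for a total of 3. The gap shows why the heuristic is only an approximation and why better ratios are worth pursuing.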

BMC Bioinformatics | 2008

LC-MSsim – a simulation software for liquid chromatography mass spectrometry data

Ole Schulz-Trieglaff; Nico Pfeifer; Clemens Gröpl; Oliver Kohlbacher; Knut Reinert

Background: Mass Spectrometry coupled to Liquid Chromatography (LC-MS) is commonly used to analyze the protein content of biological samples in large scale studies. The data resulting from an LC-MS experiment is huge, highly complex and noisy. Accordingly, it has sparked new developments in Bioinformatics, especially in the fields of algorithm development, statistics and software engineering. In a quantitative label-free mass spectrometry experiment, crucial steps are the detection of peptide features in the mass spectra and the alignment of samples by correcting for shifts in retention time. At the moment, it is difficult to compare the plethora of algorithms for these tasks. So far, curated benchmark data exists only for peptide identification algorithms but no data that represents a ground truth for the evaluation of feature detection, alignment and filtering algorithms. Results: We present LC-MSsim, a simulation software for LC-ESI-MS experiments. It simulates ESI spectra on the MS level. It reads a list of proteins from a FASTA file and digests the protein mixture using a user-defined enzyme. The software creates an LC-MS data set using a predictor for the retention time of the peptides and a model for peak shapes and elution profiles of the mass spectral peaks. Our software also offers the possibility to add contaminants, to change the background noise level and includes a model for the detectability of peptides in mass spectra. After the simulation, LC-MSsim writes the simulated data to mzData, a public XML format. The software also stores the positions (monoisotopic m/z and retention time) and ion counts of the simulated ions in separate files. Conclusion: LC-MSsim generates simulated LC-MS data sets and incorporates models for peak shapes and contaminations. Algorithm developers can match the results of feature detection and alignment algorithms against the simulated ion lists and meaningful error rates can be computed. We anticipate that LC-MSsim will be useful to the wider community to perform benchmark studies and comparisons between computational tools.
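
The digestion step mentioned in the abstract can be sketched as follows. This is not LC-MSsim code: it assumes trypsin as the enzyme, with the commonly used rule of cleaving after lysine (K) or arginine (R) unless the next residue is proline (P), and the function name, the `missed_cleavages` parameter and the example sequence are made up for illustration.

```python
def digest_trypsin(sequence, missed_cleavages=0):
    """Split a protein sequence into tryptic peptides.

    Illustrative helper: cleaves after K or R unless followed by P, and
    optionally also emits peptides spanning up to `missed_cleavages`
    uncleaved sites.
    """
    # Collect cleavage positions, including both ends of the sequence.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        last = min(start + 2 + missed_cleavages, len(sites))
        for end in range(start + 1, last):
            peptides.append(sequence[sites[start]:sites[end]])
    return peptides


# Made-up sequence; the R before P is not a cleavage site.
peptides = digest_trypsin("AKRPGKLR")
```

For the sequence above the call yields the peptides AK, RPGK and LR. A simulator would then assign each peptide a mass, a predicted retention time and a peak shape, as the abstract outlines.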

International Colloquium on Automata, Languages and Programming | 2003

Generating labeled planar graphs uniformly at random

Manuel Bodirsky; Clemens Gröpl; Mihyun Kang

We present an expected polynomial time algorithm to generate a labeled planar graph uniformly at random. To generate the planar graphs, we derive recurrence formulas that count all such graphs with n vertices and m edges, based on a decomposition into 1-, 2-, and 3-connected components. For 3-connected graphs we apply a recent random generation algorithm by Schaeffer and a counting formula by Mullin and Schellenberg.

Research in Computational Molecular Biology | 2007

A fast and accurate algorithm for the quantification of peptides from mass spectrometry data

Ole Schulz-Trieglaff; Rene Hussong; Clemens Gröpl; Andreas Hildebrandt; Knut Reinert

Liquid chromatography combined with mass spectrometry (LC-MS) has become the prevalent technology in high-throughput proteomics research. One of the aims of this discipline is to obtain accurate quantitative information about all proteins and peptides in a biological sample. Due to size and complexity of the data generated in these experiments, this problem remains a challenging task requiring sophisticated and efficient computational tools. We propose an algorithm that can quantify even low abundance peptides from LC-MS data. Our approach is flexible and can be applied to preprocessed and raw instrument data. It is based on a combination of the sweep line paradigm with a novel wavelet function tailored to detect isotopic patterns. We evaluate our technique on several data sets of varying complexity and show that we are able to rapidly quantify peptides with high accuracy in a sound algorithmic framework.

Lecture Notes in Computer Science | 2005

Algorithms for the automated absolute quantification of diagnostic markers in complex proteomics samples

Clemens Gröpl; Eva Lange; Knut Reinert; Oliver Kohlbacher; Marc Sturm; Christian G. Huber; Bettina M. Mayr; Christoph L. Klein

HPLC-ESI-MS is rapidly becoming an established standard method for shotgun proteomics. Currently, its major drawbacks are two-fold: quantification is mostly limited to relative quantification and the large amount of data produced by every individual experiment can make manual analysis quite difficult. Here we present a new, combined experimental and algorithmic approach to absolutely quantify proteins from samples with unprecedented precision. We apply the method to the analysis of myoglobin in human blood serum, which is an important diagnostic marker for myocardial infarction. Our approach was able to determine the absolute amount of myoglobin in a serum sample through a series of standard addition experiments with a relative error of 2.5%. Compared to a manual analysis of the same dataset we could improve the precision and conduct it in a fraction of the time needed for the manual analysis. We anticipate that our automatic quantitation method will facilitate further absolute or relative quantitation of even more complex peptide samples. The algorithm was developed using our publicly available software framework OpenMS (www.openms.de).
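
The arithmetic behind a standard addition experiment can be sketched in a few lines. This is the general textbook calculation, not the paper's pipeline: known amounts of the analyte are added to aliquots of the sample, the measured signal is fitted with a straight line against the added amount, and the unknown original amount is the intercept divided by the slope. The function name and all numbers are invented for illustration.

```python
def standard_addition(added, signal):
    """Estimate the analyte amount already present in the sample.

    Fits signal = slope * added + intercept by ordinary least squares and
    returns intercept / slope, i.e. the magnitude of the x-intercept.
    """
    n = len(added)
    mean_x = sum(added) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in added)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(added, signal))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept / slope


# Invented, noise-free data: the response is 2 signal units per unit amount
# and the sample already contains 0.5 units of analyte.
added_amount = [0.0, 1.0, 2.0, 3.0]
measured = [1.0, 3.0, 5.0, 7.0]
estimate = standard_addition(added_amount, measured)
```

With these idealized numbers the estimate is 0.5. Real measurements are noisy, so the quality of the line fit determines the achievable relative error.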

Bioinformatics | 2003

Accelerating screening of 3D protein data with a graph theoretical approach

Cornelius Frömmel; Christoph Gille; Andrean Goede; Clemens Gröpl; Stefan Hougardy; Till Nierhoff; Robert Preissner; Martin Thimm

MOTIVATION The Dictionary of Interfaces in Proteins (DIP) is a database collecting the 3D structure of interacting parts of proteins that are called patches. It serves as a repository, in which patches similar to given query patches can be found. The computation of the similarity of two patches is time consuming and traversing the entire DIP requires some hours. In this work we address the question of how the patches similar to a given query can be identified by scanning only a small part of DIP. The answer to this question requires the investigation of the distribution of the similarity of patches. RESULTS The score values describing the similarity of two patches can roughly be divided into three ranges that correspond to different levels of spatial similarity. Interestingly, the two iso-score lines separating the three classes can be determined by two different approaches. Applying a concept of the theory of random graphs reveals significant structural properties of the data in DIP. These can be used to accelerate scanning the DIP for patches similar to a given query. Searches for very similar patches could be accelerated by a factor of more than 25. Patches with a medium similarity could be found 10 times faster than by brute-force search.

Journal of Computational Biology | 2008

Computational Quantification of Peptides from LC-MS Data

Ole Schulz-Trieglaff; Rene Hussong; Clemens Gröpl; Andreas Leinenbach; Andreas Hildebrandt; Christian G. Huber; Knut Reinert

Liquid chromatography coupled to mass spectrometry (LC-MS) has become a major tool for the study of biological processes. High-throughput LC-MS experiments are frequently conducted in modern laboratories, generating an enormous amount of data per day. A manual inspection is therefore no longer a feasible task. Consequently, there is a need for computational tools that can rapidly provide information about mass, elution time, and abundance of the compounds in an LC-MS sample. We present an algorithm for the detection and quantification of peptides in LC-MS data. Our approach is flexible and independent of the MS technology in use. It is based on a combination of the sweep line paradigm with a novel wavelet function tailored to detect isotopic patterns of peptides. We propose a simple voting scheme to use the redundant information in consecutive scans for an accurate determination of monoisotopic masses and charge states. By explicitly modeling the instrument inaccuracy, we are also able to cope with data sets of different quality and resolution. We evaluate our technique on data from different instruments and show that we can rapidly estimate mass, centroid of retention time, and abundance of peptides in a sound algorithmic framework. Finally, we compare the performance of our method to several other techniques on three data sets of varying complexity.
