Thorsten Meinl | Researchain

Publication

Featured research published by Thorsten Meinl.

4th Annual Industrial Simulation Conference (ISC) | 2008

KNIME: The Konstanz Information Miner

Michael R. Berthold; Nicolas Cebron; Fabian Dill; Thomas R. Gabriel; Tobias Kötter; Thorsten Meinl; Peter Ohl; Christoph Sieb; Kilian Thiel; Bernd Wiswedel

The Konstanz Information Miner is a modular environment, which enables easy visual assembly and interactive execution of a data pipeline. It is designed as a teaching, research and collaboration platform, which enables simple integration of new algorithms and tools as well as data manipulation or visualization methods in the form of new modules or nodes. In this paper we describe some of the design aspects of the underlying architecture and briefly sketch how new nodes can be incorporated.

SIGKDD Explorations | 2009

KNIME - the Konstanz information miner: version 2.0 and beyond

Michael R. Berthold; Nicolas Cebron; Fabian Dill; Thomas R. Gabriel; Tobias Kötter; Thorsten Meinl; Peter Ohl; Kilian Thiel; Bernd Wiswedel

The Konstanz Information Miner is a modular environment, which enables easy visual assembly and interactive execution of a data pipeline. It is designed as a teaching, research and collaboration platform, which enables simple integration of new algorithms and tools as well as data manipulation or visualization methods in the form of new modules or nodes. In this paper we describe some of the design aspects of the underlying architecture, briefly sketch how new nodes can be incorporated, and highlight some of the new features of version 2.0.

European Conference on Machine Learning | 2005

A quantitative comparison of the subgraph miners MoFa, gSpan, FFSM, and Gaston

Marc Wörlein; Thorsten Meinl; Ingrid Fischer; Michael Philippsen

Several new miners for frequent subgraphs have been published recently. Whereas new approaches are presented in detail, the quantitative evaluations are often of limited value: only the performance on a small set of graph databases is discussed, and the new algorithm is often compared to only a single competitor based on an executable. It remains unclear how the algorithms work on bigger or other graph databases and which of their distinctive features is best suited for which database. We have re-implemented the subgraph miners MoFa, gSpan, FFSM, and Gaston within a common code base and with the same level of programming expertise and optimization effort. This paper presents the results of a comparative benchmarking that ran the algorithms on a comprehensive set of graph databases.

BMC Bioinformatics | 2013

KNIME-CDK: Workflow-driven cheminformatics

Stephan Beisken; Thorsten Meinl; Bernd Wiswedel; Luis F. de Figueiredo; Michael R. Berthold; Christoph Steinbeck

Background: Cheminformaticians have to routinely process and analyse libraries of small molecules. Among other things, that includes the standardization of molecules, calculation of various descriptors, visualisation of molecular structures, and downstream analysis. For this purpose, scientific workflow platforms such as the Konstanz Information Miner can be used if provided with the right plug-in. A workflow-based cheminformatics tool provides the advantage of ease-of-use and interoperability between complementary cheminformatics packages within the same framework, hence facilitating the analysis process.

Results: KNIME-CDK comprises functions for molecule conversion to/from common formats, generation of signatures, fingerprints, and molecular properties. It is based on the Chemistry Development Toolkit and uses the Chemical Markup Language for persistence. A comparison with the cheminformatics plug-in RDKit shows that KNIME-CDK supports a similar range of chemical classes and adds new functionality to the framework. We describe the design and integration of the plug-in, and demonstrate the usage of the nodes on ChEBI, a library of small molecules of biological interest.

Conclusions: KNIME-CDK is an open-source plug-in for the Konstanz Information Miner, a free workflow platform. KNIME-CDK is built on top of the open-source Chemistry Development Toolkit and allows for efficient cross-vendor structural cheminformatics. Its ease-of-use and modularity enable researchers to automate routine tasks and data analysis, bringing complementary cheminformatics functionality to the workflow environment.

Proceedings of the 1st International Workshop on Open Source Data Mining | 2005

MoSS: a program for molecular substructure mining

Christian Borgelt; Thorsten Meinl; Michael R. Berthold

Molecular substructure mining is currently an intensively studied research area. In this paper we present an implementation of an algorithm for finding frequent substructures in a set of molecules, which may also be used to find substructures that discriminate well between a focus and a complement group. In addition to the basic algorithm, we discuss advanced pruning techniques, demonstrating their effectiveness with experiments on two publicly available molecular data sets, and briefly mention some other extensions.
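
The idea of a substructure that "discriminates well between a focus and a complement group" can be illustrated with a small support filter. The sketch below is not the MoSS implementation: it assumes, purely for illustration, that each molecule has already been reduced to a set of fragment labels, so no subgraph search is performed. The function name `discriminative_fragments` and both threshold parameters are hypothetical.

```python
from typing import Dict, Sequence, Set, Tuple


def support(fragment: str, group: Sequence[Set[str]]) -> float:
    """Fraction of molecules in `group` that contain `fragment`."""
    if not group:
        return 0.0
    return sum(1 for molecule in group if fragment in molecule) / len(group)


def discriminative_fragments(
    focus: Sequence[Set[str]],
    complement: Sequence[Set[str]],
    min_focus_support: float,
    max_complement_support: float,
) -> Dict[str, Tuple[float, float]]:
    """Keep fragments that are frequent in the focus group and rare in the complement.

    Assumption: molecules are given as precomputed sets of fragment labels.
    A real miner would enumerate substructures by graph search instead.
    """
    # Only fragments that occur in the focus group can be frequent there.
    candidates: Set[str] = set().union(*focus)
    result: Dict[str, Tuple[float, float]] = {}
    for fragment in sorted(candidates):
        in_focus = support(fragment, focus)
        in_complement = support(fragment, complement)
        if in_focus >= min_focus_support and in_complement <= max_complement_support:
            result[fragment] = (in_focus, in_complement)
    return result
```

Called with a focus group in which every molecule contains a carbonyl label and a complement group in which none does, the function returns only that label together with its two support values.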

Electronic Communications of the European Association of Software Science and Technology | 2007

The ParMol Package for Frequent Subgraph Mining

Thorsten Meinl; Marc Wörlein; Olga Urzova; Ingrid Fischer; Michael Philippsen

Mining for frequent subgraphs in a graph database has become a popular topic in recent years. Algorithms to solve this problem are used in chemoinformatics to find common molecular fragments in a database of molecules represented as two-dimensional graphs. However, the search process in arbitrary graph structures includes costly graph and subgraph isomorphism tests. In our ParMol package we have implemented four of the most popular frequent subgraph miners using a common infrastructure: MoFa, gSpan, FFSM, and Gaston. Besides the pure re-implementation, we have added functionality to some of the algorithms, such as parallel search, mining directed graphs, and mining in one big graph instead of a graph database. A 2D visualizer for molecules has also been integrated.

Journal of Cheminformatics | 2013

Get your chemistry right with KNIME

Thorsten Meinl; Gregory A. Landrum

KNIME (Konstanz Information Miner, [1]) is a user-friendly and comprehensive open-source data integration, processing, analysis, and exploration platform. From day one, KNIME has been developed using rigorous software engineering practices and is used by professionals in both industry and academia in over 60 countries. In the presentation we will show some of the new features in KNIME 2.6 and give an outlook to KNIME 2.7. We also present recent developments in the KNIME Community Contributions [2], where research groups can easily provide their KNIME extensions to the community. We will focus on the freely available chemistry extensions - especially RDKit [3] - and demonstrate their usage in real-world workflows.

Systems, Man and Cybernetics | 2006

Mining Molecular Datasets on Symmetric Multiprocessor Systems

Thorsten Meinl; Marc Wörlein; Ingrid Fischer; Michael Philippsen

Although in the last few years about a dozen sophisticated algorithms for mining frequent fragments in molecular databases have been proposed, searching big databases with 100,000 compounds and more is still a time-consuming process. Even the currently fastest algorithms like gSpan, FFSM, Gaston, or MoFa require hours to complete their tasks. This paper presents thread-based parallel versions of MoFa [5] and gSpan [26] that achieve speedups up to 11 on a shared-memory SMP system using 12 processors. We discuss the design space of the parallelization, the results, and the obstacles that are caused by the irregular search space and by the current state of Java technology.

Journal of Chemical Information and Modeling | 2011

Maximum-Score Diversity Selection for Early Drug Discovery

Thorsten Meinl; Claude Ostermann; Michael R. Berthold

Diversity selection is a common task in early drug discovery. One drawback of current approaches is that usually only the structural diversity is taken into account; activity information is therefore ignored. In this article, we present a modified version of diversity selection, which we term Maximum-Score Diversity Selection, that additionally takes the estimated or predicted activities of the molecules into account. We show that finding an optimal solution to this problem is computationally very expensive (it is NP-hard), and therefore, heuristic approaches are needed. After a discussion of existing approaches, we present our new method, which is computationally far more efficient but at the same time produces comparable results. We conclude by validating these theoretical differences on several data sets.
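
To make the trade-off between activity score and diversity concrete, here is a minimal greedy sketch. It is not the method proposed in the article; it is one simple heuristic of the kind an NP-hard selection problem invites. The function name `greedy_score_diversity`, the `weight` parameter, and the choice of "distance to the nearest already selected item" as the diversity term are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def greedy_score_diversity(
    scores: Sequence[float],
    distance: Callable[[int, int], float],
    k: int,
    weight: float = 0.5,
) -> List[int]:
    """Greedily pick up to k item indices, balancing score against diversity.

    Assumption: the gain of a candidate is
        weight * score + (1 - weight) * distance to its nearest selected item.
    """
    if k <= 0 or not scores:
        return []
    remaining = set(range(len(scores)))
    # Seed with the highest-scoring item; ties go to the lowest index.
    first = max(remaining, key=lambda i: (scores[i], -i))
    selected = [first]
    remaining.remove(first)
    while remaining and len(selected) < k:
        def gain(i: int) -> float:
            nearest = min(distance(i, j) for j in selected)
            return weight * scores[i] + (1.0 - weight) * nearest
        # Take the candidate with the largest gain; ties go to the lowest index.
        best = max(remaining, key=lambda i: (gain(i), -i))
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `weight=1.0` the selection degenerates to a plain top-k by score, and with `weight=0.0` only the seed item depends on the scores, which shows the two extremes the combined objective sits between.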

Systems, Man and Cybernetics | 2004

Advanced pruning strategies to speed up mining closed molecular fragments

Christian Borgelt; Thorsten Meinl; Michael R. Berthold

In recent years, several algorithms for mining frequent subgraphs in graph databases have been proposed, with a major application area being the discovery of frequent substructures of biomolecules. Unfortunately, most of these algorithms still struggle with fairly long execution times if larger substructures or molecular fragments are desired. We describe two advanced pruning strategies, equivalent sibling pruning and perfect extension pruning, that can be used to speed up the MoFa algorithm (introduced in C. Borgelt and M.R. Berthold (2002)) in the search for closed molecular fragments, as we demonstrate with experiments on the NCI's HIV database.
