Johannes Ruhland
University of Jena
Publication
Featured research published by Johannes Ruhland.
Artificial Intelligence Review | 2009
Boris Delibasic; Kathrin Kirchner; Johannes Ruhland; Milos Jovanovic; Milan Vukicevic
Clustering algorithms are well-established and widely used for solving data-mining tasks. Every clustering algorithm is composed of several solutions for specific sub-problems in the clustering process. These solutions are linked together in a clustering algorithm, and they define the process and the structure of the algorithm. Frequently, many of these solutions occur in more than one clustering algorithm. Mostly, new clustering algorithms include frequently occurring solutions to typical sub-problems from clustering, as well as from other machine-learning algorithms. The problem is that these solutions are usually integrated in their algorithms, and that original algorithms are not designed to share solutions to sub-problems outside the original algorithm easily. We propose a way to design cluster algorithms, and to improve existing ones, based on reusable components. Reusable components are well-documented, frequently occurring solutions to specific sub-problems in a specific area. Thus we identify reusable components, first, as solutions to characteristic sub-problems in partitioning cluster algorithms, and, further, identify a generic structure for the design of partitioning cluster algorithms. We analyze some partitioning algorithms (K-means, X-means, MPCK-means, and Kohonen SOM), and identify reusable components in them. We give examples of how new cluster algorithms can be designed based on them.
Data and Knowledge Engineering | 2012
Boris Delibasic; Milan Vukicevic; Milos Jovanovic; Kathrin Kirchner; Johannes Ruhland; Milija Suknovic
We propose an architecture for the design of representative-based clustering algorithms based on reusable components. These components were derived from K-means-like algorithms and their extensions. With the suggested clustering design architecture, it is possible to reconstruct popular algorithms, but also to build new algorithms by exchanging components from original algorithms and their improvements. In this way, the design of a myriad of representative-based clustering algorithms and their fair comparison and evaluation are possible. In addition to the architecture, we show the usefulness of the proposed approach by providing experimental evaluation.
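The idea of exchanging components can be illustrated with a minimal sketch. It is not the architecture from the paper: the three component slots (`init`, `distance`, `update`), the two options offered for each, and all function names are assumptions chosen for illustration. Every combination of one option per slot yields a different K-means-like algorithm that runs through the same generic loop.

```python
import random
import statistics
from itertools import product

# --- Distance components (hypothetical selection) ---
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

# --- Initialization components ---
def init_random(points, k, rng):
    # Pick k distinct data points as initial representatives.
    return [list(p) for p in rng.sample(points, k)]

def init_farthest_first(points, k, rng):
    # Start from a random point, then repeatedly add the point that is
    # farthest (Euclidean) from all representatives chosen so far.
    reps = [list(rng.choice(points))]
    while len(reps) < k:
        farthest = max(points, key=lambda p: min(euclidean(p, r) for r in reps))
        reps.append(list(farthest))
    return reps

# --- Representative-update components ---
def update_mean(members):
    return [statistics.fmean(dim) for dim in zip(*members)]

def update_median(members):
    return [statistics.median(dim) for dim in zip(*members)]

def cluster(points, k, init, distance, update, max_iter=100, seed=0):
    """Generic representative-based loop; the three callables are the plug-in slots."""
    rng = random.Random(seed)
    reps = init(points, k, rng)
    labels = []
    for _ in range(max_iter):
        # Assign each point to its nearest representative.
        labels = [min(range(k), key=lambda j: distance(p, reps[j])) for p in points]
        new_reps = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Keep the old representative if a cluster becomes empty.
            new_reps.append(update(members) if members else reps[j])
        if new_reps == reps:  # representatives stopped moving
            break
        reps = new_reps
    return labels, reps

COMPONENTS = {
    "init": {"random": init_random, "farthest": init_farthest_first},
    "distance": {"euclidean": euclidean, "manhattan": manhattan},
    "update": {"mean": update_mean, "median": update_median},
}

def all_variants():
    """Yield every combination of one component per slot (2 * 2 * 2 = 8 here)."""
    slots = ("init", "distance", "update")
    for names in product(*(COMPONENTS[s] for s in slots)):
        yield names, {s: COMPONENTS[s][n] for s, n in zip(slots, names)}

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
for names, parts in all_variants():
    labels, _ = cluster(points, k=2, **parts)
    print("+".join(names), labels)
```

Because every variant shares the same loop and differs only in the plugged-in components, differences in results can be attributed to the components themselves, which is what makes a like-for-like comparison possible in this kind of design.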
Knowledge and Information Systems | 2013
Milan Vukicevic; Kathrin Kirchner; Boris Delibasic; Milos Jovanovic; Johannes Ruhland; Milija Suknovic
The analysis of microarray data is fundamental to microbiology. Although clustering has long been recognized as central to the discovery of gene functions and disease diagnostics, researchers have found the construction of good algorithms a surprisingly difficult task. In this paper, we address this problem by using a component-based approach for clustering algorithm design, for class retrieval from microarray data. The idea is to break up existing algorithms into independent building blocks for typical sub-problems, which are in turn reassembled in new ways to generate yet unexplored methods. As a test, 432 algorithms were generated and evaluated on published microarray data sets. We found their top performers to be better than the original, component-providing ancestors and also competitive with a set of new algorithms recently proposed. Finally, we identified components that showed consistently good performance for clustering microarray data and that should be considered in further development of clustering algorithms.
GfKl | 2008
Boris Delibasic; Kathrin Kirchner; Johannes Ruhland
Most data mining systems follow a data flow and toolbox paradigm. While this modular approach delivers ultimate flexibility, it gives the user almost no guidance on the issue of choosing an efficient combination of algorithms in the current problem context. In the field of Software Engineering the Pattern Based development process has empirically proven its high potential. Patterns provide a broad and generic framework for the solution process in its entirety and are based on equally broad characteristics of the problem. Details of the individual steps are filled in at later stages. Basic research on pattern based thinking has provided us with a list of generally applicable and proven patterns. User interaction in a pattern based approach to data mining will be divided into two steps: (1) choosing a pattern from a generic list based on a handful of characteristics of the problem and later (2) filling in data mining algorithms for the subtasks.
European Conference on Principles of Data Mining and Knowledge Discovery | 1999
Thomas Wittmann; Johannes Ruhland; Matthias Eichholz
Data Mining Algorithms extract patterns from large amounts of data. But these patterns will yield knowledge only if they are interesting, i.e. valid, new, potentially useful, and understandable. Unfortunately, during pattern search most Data Mining Algorithms focus on validity only, which also holds true for Neuro-Fuzzy Systems. In this paper we introduce a method to enhance the interestingness of a rule base as a whole. In the first step, we aggregate the rule base through amalgamation of adjacent rules and elimination of redundant attributes. Supplementing this rather technical approach, we next sort rules with regard to their performance, as measured by their evidence. Finally, we compute reduced evidences, which penalize rules that are very similar to rules with a higher evidence. Rules sorted on reduced evidence are fed into an integrated rulebrowser, to allow for manual rule selection according to personal and situation-dependent preference. This method was applied successfully to two real-life classification problems, the target group selection for a retail bank, and fault diagnosis for a large car manufacturer. Explicit reference is made to the NEFCLASS algorithm, but the procedure is easily generalized to other systems.
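The ranking step can be sketched as follows. The abstract does not give the penalty formula, so everything specific here is an assumption made for illustration: rule similarity is taken as the Jaccard overlap of antecedent conditions, and the reduced evidence is the evidence scaled by one minus the highest similarity to any rule ranked above it. The `Rule` class and the example rules are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    name: str
    antecedent: frozenset  # set of (attribute, fuzzy_term) pairs
    evidence: float

def similarity(a, b):
    # Assumed similarity measure: Jaccard overlap of antecedent conditions.
    union = a.antecedent | b.antecedent
    return len(a.antecedent & b.antecedent) / len(union) if union else 0.0

def reduced_evidences(rules):
    """Return (rule, reduced_evidence) pairs, best first.

    Assumed penalty: evidence * (1 - max similarity to a rule ranked
    earlier by evidence). Rules with equal evidence are penalized
    against whichever of them happens to sort first.
    """
    ranked = sorted(rules, key=lambda r: r.evidence, reverse=True)
    result = []
    for i, rule in enumerate(ranked):
        penalty = max((similarity(rule, better) for better in ranked[:i]), default=0.0)
        result.append((rule, rule.evidence * (1.0 - penalty)))
    result.sort(key=lambda pair: pair[1], reverse=True)
    return result

rules = [
    Rule("r1", frozenset({("age", "high"), ("income", "high")}), 0.9),
    Rule("r2", frozenset({("age", "high"), ("income", "high"), ("region", "urban")}), 0.8),
    Rule("r3", frozenset({("balance", "low")}), 0.5),
]
for rule, score in reduced_evidences(rules):
    print(rule.name, round(score, 3))
```

In this toy rule base, `r2` has high evidence but largely repeats `r1`, so it drops below the weaker but unrelated `r3`. That is the intended effect of the penalty: the top of the list shown to the user covers different regions of the problem rather than near-duplicates.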
Advances in Social Networks Analysis and Mining | 2012
Marek Opuszko; Johannes Ruhland
The Semantic Web enables people and computers to interact and exchange information. Based on Semantic Web technologies, different machine learning applications have been designed. Particularly important is the possibility to create complex metadata descriptions for any problem domain, based on pre-defined ontologies. In this paper we evaluate the use of a semantic similarity measure based on pre-defined ontologies as an input for a classification analysis in the context of social network analysis. A link prediction between actors of two real world social networks is performed, which could serve as a recommendation system. The social networks involve different types of relations and nodes. We measure the prediction performance based on a semantic similarity measure as well as traditional approaches. The findings demonstrate that the prediction accuracy based on the semantic similarity is comparable to traditional approaches and show that data mining on complex social networks using ontology-based metadata can be considered as a very promising approach.
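For orientation, a topology-based link-prediction baseline might look like the sketch below. The abstract does not name the traditional approaches it compares against, so the Jaccard coefficient of neighbor sets is an assumption here, and the toy edge list and function names are hypothetical. A semantic similarity score could be supplied as an additional feature for each candidate pair in the same way.

```python
from itertools import combinations

def neighbors(edges):
    """Build an undirected adjacency map from (u, v) edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def jaccard(adj, u, v):
    # Shared neighbors divided by all neighbors of either node.
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def rank_candidate_links(edges):
    """Score every currently unlinked pair; highest score first."""
    adj = neighbors(edges)
    candidates = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
    scored = [(jaccard(adj, u, v), u, v) for u, v in candidates]
    # Sort by descending score, then by node names for a stable order.
    return sorted(scored, key=lambda s: (-s[0], s[1], s[2]))

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
for score, u, v in rank_candidate_links(edges):
    print(u, v, round(score, 3))
```

The highest-ranked unlinked pairs are the ones that would be suggested as new connections if the scores were used for recommendation.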
International Journal on Digital Libraries | 2018
Lisa Wenige; Johannes Ruhland
This paper investigates how Linked Open Data (LOD) can be used for recommendations and information retrieval within digital libraries. While numerous studies on both research paper recommender systems and Linked Data-enabled recommender systems have been conducted, no previous attempt has been undertaken to explore opportunities of LOD in the context of search and discovery interfaces. We identify central advantages of Linked Open Data with regard to scientific search and propose two novel recommendation strategies, namely flexible similarity detection and constraint-based recommendations. These strategies take advantage of key characteristics of data that adheres to LOD principles. The viability of Linked Data recommendations was extensively evaluated within the scope of a web-based user experiment in the domain of economics. Findings indicate that the proposed methods are well suited to enhance established search functionalities and are thus offering novel ways of resource access. In addition to that, RDF triples from LOD repositories can complement local bibliographic records that are sparse or of poor quality.
Business Information Systems | 2016
Lisa Wenige; Johannes Ruhland
Recommender systems help consumers to find products online. But because many content-based systems work with insufficient data, recent research has focused on enhancing item feature information with data from the Linked Open Data cloud. Linked Data recommender systems are usually bound to a predefined set of item features and offer limited opportunities to tune the recommendation model to individual needs. The paper addresses this research gap by introducing the prototype SKOS Recommender (SKOSRec), which produces scalable on-the-fly recommendations through SPARQL-like queries from Linked Data repositories. The SKOSRec query language enables users to obtain constraint-based, aggregation-based and cross-domain recommendations, such that results can be adapted to specific business or customer requirements.
Conference on the Future of the Internet | 2015
Lisa Wenige; Johannes Ruhland
Recommender systems are an integral part of today's internet landscape. Recently the enhancement of recommendation services through Linked Open Data (LOD) became a new research area. The ever-growing amount of structured data on the web can be used as additional background information for recommender systems. But current approaches in Linked Data recommender systems (LDRS) miss out on an adequate item feature representation in their prediction model and an efficient processing of LOD resources. In this paper, we present a scalable Linked Data recommender system that calculates preferences on multiple property dimensions. The system achieves scalability through parallelization of property-specific rating prediction on a MapReduce framework. Separate prediction results are summarized through a stacking technique. Evaluation results show an increased performance both in terms of accuracy and scalability.
Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2013
Thomas Fischer; Johannes Ruhland
Planning is an important reasoning capability for the Semantic Web (SW); however, it is currently not fully integrated into existing SW standards. We present a novel query language to represent objective functions and constraints through combinations of conjunctive queries which are written in terms of roles and concepts of a Description Logic (DL). Furthermore, we consider a compiler that instantiates satisfiability modulo theories problems from such a query and returns the final solution as DL role assertions. We outline our approach on an NP-hard production planning problem that requires computation of the optimal assignment of tasks to stations based on different numerical and logical restrictions.