Łukasz Bolikowski | Researchain

Publication

Featured research published by Łukasz Bolikowski.

International Conference on Theory and Practice of Digital Libraries | 2013

Large Scale Citation Matching Using Apache Hadoop

Mateusz Fedoryszak; Dominika Tkaczyk; Łukasz Bolikowski

During citation matching, links are created from bibliography entries to the publications they reference. Such links indicate topical similarity between the linked texts, are used in assessing the impact of the referenced document, and improve navigation in the user interfaces of digital libraries. In this paper we present a citation matching method and show how to scale it up to handle large amounts of data using appropriate indexing and the MapReduce paradigm in the Hadoop environment.
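The abstract does not spell out the matching method, so the following is only a minimal in-memory sketch of the general idea of index-then-match in a MapReduce style. The blocking key (shared tokens), the Jaccard similarity, the threshold, and all function names are assumptions made for illustration, not the authors' algorithm or any Hadoop API.

```python
import re
from collections import defaultdict


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def map_phase(records):
    """Emit (token, record) pairs so records sharing a token land in one group."""
    for kind, rec_id, text in records:
        for tok in tokens(text):
            if len(tok) > 3:  # assumption: short tokens are too common to index
                yield tok, (kind, rec_id, text)


def shuffle(pairs):
    """Group emitted values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Within each group, score citation/document pairs by Jaccard overlap."""
    best = {}
    for values in groups.values():
        citations = [v for v in values if v[0] == "citation"]
        documents = [v for v in values if v[0] == "document"]
        for _, cid, ctext in citations:
            for _, did, dtext in documents:
                a, b = tokens(ctext), tokens(dtext)
                score = len(a & b) / len(a | b)
                if score > best.get(cid, (None, 0.0))[1]:
                    best[cid] = (did, score)
    return best


def match_citations(citations, documents, threshold=0.3):
    """Link each citation id to its best-scoring document id, if good enough."""
    records = [("citation", i, t) for i, t in citations.items()]
    records += [("document", i, t) for i, t in documents.items()]
    best = reduce_phase(shuffle(map_phase(records)))
    return {cid: did for cid, (did, score) in best.items() if score >= threshold}
```

Only pairs that share at least one indexed token are ever compared, which is what keeps the number of comparisons far below the full citation-by-document product on large collections.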

Semantic Web Evaluation Challenges | 2015

Extracting Contextual Information from Scientific Literature Using CERMINE System

Dominika Tkaczyk; Łukasz Bolikowski

CERMINE is a comprehensive open-source system for extracting structured metadata and references from born-digital scientific literature. Among other information, the system can extract information about the context in which an article was written, such as the authors and their affiliations, the relations between them, and references to other articles. Extracted information is presented in a structured, machine-readable form. CERMINE is based on a modular workflow whose loosely coupled architecture allows individual components to be evaluated and adjusted, makes it easy to improve or replace independent parts of the algorithm, and facilitates future expansion of the architecture. The implementation of the workflow is based mostly on supervised and unsupervised machine-learning techniques, which simplifies adapting the system to new document layouts and styles. In this paper we outline the overall workflow architecture, describe key aspects of the system implementation, provide details about the training and adjustment of individual algorithms, and finally report how CERMINE was used to extract contextual information from scientific articles in PDF format in the ESWC 2015 Semantic Publishing Challenge. The CERMINE system is available under an open-source licence and can be accessed at http://cermine.ceon.pl.

Intelligent Tools for Building a Scientific Information Platform | 2013

Data Model for Analysis of Scholarly Documents in the MapReduce Paradigm

Adam Kawa; Łukasz Bolikowski; Artur Czeczko; Piotr Jan Dendek; Dominika Tkaczyk

At CeON ICM UW we hold a large collection of scholarly documents that we store and process using the MapReduce paradigm. One of the main challenges is to design a simple but effective data model that fits various data access patterns and allows us to perform diverse analyses efficiently. In this paper we describe the organization of our data and explain how this data is accessed and processed by open-source tools from the Apache Hadoop ecosystem.

Intelligent Tools for Building a Scientific Information Platform | 2013

Author disambiguation in the YADDA2 software platform

Piotr Jan Dendek; Mariusz Wojewódzki; Łukasz Bolikowski

The SYNAT platform, powered by the YADDA2 architecture, has been extended with the Author Disambiguation Framework and the Query Framework. The former clusters occurrences of contributor names into author identities; the latter answers queries about authors and the documents they have written. This paper presents an outline of the disambiguation algorithms, the implementation of the query framework, its integration into the platform, and a performance evaluation of the solution.

Archive | 2014

Theory and Practice of Digital Libraries -- TPDL 2013 Selected Workshops

Łukasz Bolikowski; Vittore Casarosa; Paula Goodale; Nikos Houssos; Paolo Manghi; Jochen Schirrwagen

This article describes a case study of a small research group collecting and managing data from a pair of long-running experimental campaigns, detailing the data management and publication processes in place at the time of the experiments. It highlights the reasons why publications became disconnected from their underlying data in the past, and identifies the new processes and principles which aim to address these issues.

Intelligent Tools for Building a Scientific Information Platform | 2014

Content Analysis of Scientific Articles in Apache Hadoop Ecosystem

Piotr Jan Dendek; Artur Czeczko; Mateusz Fedoryszak; Adam Kawa; Piotr Wendykier; Łukasz Bolikowski

Content Analysis System (CoAnSys) is a research framework for mining scientific publications using Apache Hadoop. This article describes the algorithms currently implemented in CoAnSys, including classification, categorization and citation matching of scientific publications. The size of the input data places these algorithms among big data problems, which can be solved efficiently on Hadoop clusters.

Intelligent Tools for Building a Scientific Information Platform | 2013

Hierarchical, Multi-label Classification of Scholarly Publications: Modifications of ML-KNN Algorithm

Michał Łukasik; Tomasz Kuśmierczyk; Łukasz Bolikowski; Hung Son Nguyen

One of the common problems when dealing with digital libraries is the lack of classification codes in some of the documents. In this paper we address this problem in the multi-label, hierarchical case of the Mathematics Subject Classification system. We develop modifications of the ML-KNN algorithm and show, using Springer textual data as an example, how they improve the results given by the algorithm.

Intelligent Tools for Building a Scientific Information Platform | 2013

Methodology for evaluating citation parsing and matching

Mateusz Fedoryszak; Łukasz Bolikowski; Dominika Tkaczyk; Krzysztof Wojciechowski

Bibliographic references between scholarly publications contain valuable information for researchers and developers involved with digital repositories. They indicate topical similarity between linked texts, reflect the impact of the referenced document, and improve navigation in the user interfaces of digital libraries. Consequently, several approaches to extracting, parsing and resolving such references have been proposed to date. In this paper we develop a methodology for evaluating parsing and matching algorithms and choosing the most appropriate one for the document collection at hand. We apply the methodology to evaluate the reference parsing and matching module of the YADDA2 software platform.
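The abstract does not list the measures used in the methodology. As a hedged illustration of how a matching algorithm is commonly scored against a hand-labelled gold standard, the sketch below computes precision, recall and F1 over citation-to-document links. The function name and the dictionary input format are assumptions, not part of YADDA2.

```python
def evaluate_matching(predicted, gold):
    """Score predicted links against gold links.

    Both arguments map a citation id to the id of the document it resolves to
    (a hypothetical input format chosen for this sketch).
    Returns a (precision, recall, f1) tuple.
    """
    predicted_links = set(predicted.items())
    gold_links = set(gold.items())
    correct = len(predicted_links & gold_links)

    precision = correct / len(predicted_links) if predicted_links else 0.0
    recall = correct / len(gold_links) if gold_links else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Running the same function over the output of several candidate algorithms on one labelled sample gives directly comparable figures, which is the kind of comparison needed to choose an algorithm for a given collection.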

Interfaces and Free Boundaries | 2010

The Neumann problem in an irregular domain

Łukasz Bolikowski; Maria Gokieli; Nicolas Varchon

We study the stability of patterns for the reaction-diffusion equation with Neumann boundary conditions in an irregular domain in R^N, N ≥ 2, the model example being two convex regions connected by a small 'hole' in their boundaries. By patterns we mean solutions having an interface, i.e. a transition layer between two constants. It is well known that in 1D domains and in many 2D domains patterns are unstable for this equation. We show that, unlike the 1D case, but as in 2D dumbbell domains, stable patterns exist. More generally, we prove invariance of stability properties for steady states when a sequence of domains Ω_n converges to our limit domain Ω in the sense of Mosco. We illustrate the theoretical results by numerical simulations of evolving and persisting interfaces.

Italian Research Conference on Digital Library Management Systems | 2014

Information inference in scholarly communication infrastructures: the OpenAIREplus project experience

Mateusz Kobos; Łukasz Bolikowski; Marek Horst; Paolo Manghi; Natalia Manola; Jochen Schirrwagen

The Information Inference Framework presented in this paper provides a general-purpose suite of tools enabling the definition and execution of flexible and reliable data processing workflows whose nodes offer application-specific processing capabilities. The IIF is designed for the purpose of processing big data, and it is implemented on top of Apache Hadoop-related technologies to cope with scalability and high-performance execution requirements. As a proof of concept we will describe how the framework is used to support linking and contextualization services in the context of the OpenAIRE infrastructure for scholarly communication.
