Karol Wieloch

Poznań University of Economics

Publication

Featured research published by Karol Wieloch.

Information Retrieval | 2009

On knowledge-poor methods for person name matching and lemmatization for highly inflectional languages

Jakub Piskorski; Karol Wieloch; Marcin Sydow

Web person search is one of the most common activities of Internet users. Recently, a vast amount of work on applying various NLP techniques for person name disambiguation in large web document collections has been reported, where the main focus was on English and a few other major languages. This article reports on knowledge-poor methods for tackling person name matching and lemmatization in Polish, a highly inflectional language with a complex person name declension paradigm. These methods mainly apply well-established string distance metrics, some new variants thereof, automatically acquired simple suffix-based lemmatization patterns, and some combinations of these techniques. Furthermore, we carried out some initial experiments on deploying techniques that utilize the context in which person names appear. Results of numerous experiments are presented. The evaluation, carried out on a data set extracted from a corpus of on-line news articles, revealed that achieving lemmatization accuracy figures greater than 90% seems to be difficult, whereas combining string distance metrics with suffix-based patterns results in 97.6–99% accuracy for the name matching task. Interestingly, no significant additional gain could be achieved by integrating some basic techniques that try to exploit the local context in which the names appear. Although our explorations focused on Polish, we believe that the work presented in this article constitutes practical guidelines for tackling the same problem in other highly inflectional languages with similar phenomena.
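As a rough illustration of the two ingredients this abstract names (a string distance metric and suffix-based patterns), the sketch below matches an inflected Polish surname to a list of base forms. It is not the authors' implementation: plain Levenshtein distance stands in for the metrics they evaluated, the suffix rules are hand-written examples rather than automatically acquired patterns, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hand-written, hypothetical suffix rules (inflected ending -> base ending).
# The article acquires such patterns automatically; these are only examples.
SUFFIX_RULES: List[Tuple[str, str]] = [
    ("skiego", "ski"),  # genitive/accusative, e.g. Kowalskiego -> Kowalski
    ("skiemu", "ski"),  # dative
    ("skim", "ski"),    # instrumental/locative
]


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance with unit-cost insert, delete and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalized to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def apply_suffix_rules(token: str) -> str:
    """Rewrite the first matching inflected suffix; otherwise return token."""
    for suffix, replacement in SUFFIX_RULES:
        if token.endswith(suffix):
            return token[: -len(suffix)] + replacement
    return token


def best_match(inflected: str, base_forms: Sequence[str]) -> str:
    """Pick the base form most similar to the suffix-normalized token."""
    candidate = apply_suffix_rules(inflected).lower()
    return max(base_forms, key=lambda base: similarity(candidate, base.lower()))


if __name__ == "__main__":
    print(best_match("Kowalskiego", ["Kowalski", "Nowak", "Kowalczyk"]))
```

Applying the suffix rules before computing the distance is one simple way to combine the two techniques; the article evaluates several combinations, and this sketch does not claim to reproduce any of them.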

Business Information Systems | 2011

Autocompletion for Business Process Modelling

Karol Wieloch; Agata Filipowska; Monika Kaczmarek

This paper presents the idea and a prototype of a semantics-based autocompletion mechanism that supports the development of business process models. Currently available process modelling tools support business analysts by suggesting elements that may be incorporated in the process, validating modelled processes, providing additional descriptions that ease automation, etc. However, these solutions, based mainly on syntactic data, disregard proper identification and usage of previously modelled process fragments. The mechanism described in this paper analyses the context and annotations of process tasks (also on the semantic level) in order to deliver a list of suggestions for possible successor tasks: process fragments that may complete the model being developed.

Language and Technology Conference | 2009

Comparison of String Distance Metrics for Lemmatisation of Named Entities in Polish

Jakub Piskorski; Marcin Sydow; Karol Wieloch

This paper presents the results of recent experiments on the application of string distance metrics to the problem of named entity lemmatisation in Polish. It extends our work in [1] by introducing new results for organisation names. Furthermore, the results presented here and in [2,3], which centre around the same topic, were used to make a comparative study of the average usefulness of the numerous examined string distance metrics for the lemmatisation of Polish named entities of various types. In particular, we focus on lemmatisation of country names, organisation names and person names.

Meeting of the Association for Computational Linguistics | 2007

Unsupervised Methods of Topical Text Segmentation for Polish

Dominik Flejter; Karol Wieloch; Witold Abramowicz

This paper describes a study on the performance of existing unsupervised algorithms for topical segmentation of text documents when applied to Polish plain text documents. For performance measurement, five existing topical segmentation algorithms were selected, three different Polish test collections were created and seven approaches to text pre-processing were implemented. Based on quantitative results (Pk and WindowDiff metrics), the use of a specific algorithm was recommended and the impact of pre-processing strategies was assessed. Thanks to the use of standardized metrics and the application of a previously described methodology for test collection development, comparative results for Polish and English were also obtained.
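For readers unfamiliar with the two metrics named in this abstract, the sketch below shows how Pk and WindowDiff are commonly computed. It assumes a segmentation is given as one segment label per text unit (for example, per sentence) and that the window size defaults to half the average reference segment length; the function names and the input representation are choices made for this illustration, not taken from the paper.

```python
from typing import Optional, Sequence


def _window(reference: Sequence[int], hypothesis: Sequence[int],
            k: Optional[int]) -> int:
    """Validate the inputs and return the window size to use."""
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same number of units")
    if k is None:
        # Conventional default: half the average reference segment length.
        segments = 1 + sum(a != b for a, b in zip(reference, reference[1:]))
        k = max(1, round(len(reference) / (2 * segments)))
    if k < 1 or k >= len(reference):
        raise ValueError("window size must satisfy 1 <= k < number of units")
    return k


def _boundaries(labels: Sequence[int], start: int, k: int) -> int:
    """Count segment boundaries between units start and start + k."""
    return sum(labels[j] != labels[j + 1] for j in range(start, start + k))


def pk(reference: Sequence[int], hypothesis: Sequence[int],
       k: Optional[int] = None) -> float:
    """Share of windows where the two segmentations disagree on whether
    the units at both window ends belong to the same segment."""
    k = _window(reference, hypothesis, k)
    windows = len(reference) - k
    errors = sum(
        (reference[i] == reference[i + k]) != (hypothesis[i] == hypothesis[i + k])
        for i in range(windows)
    )
    return errors / windows


def window_diff(reference: Sequence[int], hypothesis: Sequence[int],
                k: Optional[int] = None) -> float:
    """Share of windows where the two segmentations contain a different
    number of boundaries."""
    k = _window(reference, hypothesis, k)
    windows = len(reference) - k
    errors = sum(
        _boundaries(reference, i, k) != _boundaries(hypothesis, i, k)
        for i in range(windows)
    )
    return errors / windows


if __name__ == "__main__":
    ref = [0, 0, 0, 1, 1, 1]  # reference: boundary after the third unit
    hyp = [0, 0, 1, 1, 1, 1]  # hypothesis: boundary one unit too early
    print(pk(ref, hyp, k=2), window_diff(ref, hyp, k=2))
```

Both metrics are error rates, so lower is better and 0.0 means the hypothesis agrees with the reference in every window.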

IFIP World Computer Congress (WCC) | 2006

Service interdependencies: insight into use cases for service composition

Witold Abramowicz; Agata Filipowska; Monika Kaczmarek; Tomasz Kaczmarek; Marek Kowalkiewicz; Wojciech Rutkowski; Karol Wieloch; Dominik Zyskowski

The paper analyses several of the most appealing use cases for Semantic Web services and their composition. They are considered from the perspective of service types, QoS parameters, semantic description and user preferences. We introduce different levels of service composition and discuss their implications.

Language Resources and Evaluation | 2006