Publication


Featured research published by Maciej Piasecki.


text, speech and dialogue | 2007

Automatic selection of heterogeneous syntactic features in semantic similarity of Polish nouns

Maciej Piasecki; Stanisław Szpakowicz; Bartosz Broda

We present experiments with a variety of corpus-based measures applied to the problem of constructing semantic similarity functions for Polish nouns. Rich inflection in Polish allows us to acquire useful syntactic features without parsing; morphosyntactic restrictions checked in a large enough window provide sufficiently useful data. A novel feature selection method achieves an accuracy of 86% on the WordNet-based synonymy test, an improvement of 5% over the previous results.


canadian conference on artificial intelligence | 2009

Rank-Based Transformation in Measuring Semantic Relatedness

Bartosz Broda; Maciej Piasecki; Stanisław Szpakowicz

Rank weight functions had been shown to increase the accuracy of measures of semantic relatedness for Polish. We present a generalised ranking principle and demonstrate its effect on a range of established measures of semantic relatedness, and on a different language. The results confirm that the generalised transformation method based on ranking brings an improvement over several well-known measures.
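The idea of a rank-based transformation can be illustrated with a minimal sketch. It assumes one simple scheme, chosen here for illustration and not necessarily the function defined in the paper: keep the `k` highest-weighted features of a word and replace each weight with a score derived only from its rank position. The function name `rank_transform` and the feature names are hypothetical.

```python
def rank_transform(weights, k):
    """Keep the k highest-weighted features and replace each weight by k - position.

    Assumed scheme for illustration only: position 0 is the highest weight,
    so the top feature gets the score k and the k-th feature gets 1.
    Ties are broken alphabetically by feature name to keep the result deterministic.
    """
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return {feature: k - position for position, (feature, _) in enumerate(ranked)}


# Hypothetical feature weights for one noun (e.g. association scores).
weights = {"adj:czarny": 7.5, "adj:duży": 0.4, "verb:pije": 3.1, "verb:goni": 3.0}
transformed = rank_transform(weights, k=3)
```

After the transformation only the order of the features matters, so two nearly equal weights (3.1 and 3.0 above) end up one rank apart, and the lowest-weighted feature is dropped.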


text, speech and dialogue | 2006

Effective architecture of the Polish tagger

Maciej Piasecki; Grzegorz Godlewski

The large tagset of the IPI PAN Corpus of Polish and the limited size of the learning corpus make construction of a tagger especially demanding. The goal of this work is to decompose the overall process of tagging Polish into subproblems of partial disambiguation. Moreover, an architecture of a tagger facilitating this decomposition is proposed. The proposed architecture enables easy integration of hand-written tagging rules with the rest of the tagger, and it is open to different types of classifiers. A complete tagger for Polish, called TaKIPI, is also presented. Its configuration, the achieved results (92.55% accuracy for all tokens, 84.75% for ambiguous tokens in a ten-fold test), and the considered variants of the architecture are also discussed.


international multiconference on computer science and information technology | 2008

SuperMatrix: a general tool for lexical semantic knowledge acquisition

Bartosz Broda; Maciej Piasecki

The paper presents the SuperMatrix system, which was designed as a general tool supporting automatic acquisition of lexical semantic relations from corpora. The construction of the system is discussed, and examples of different applications showing the potential of SuperMatrix are given. The core of the system is the construction of co-incidence matrices from corpora written in any natural language, as the system works with UTF-8 encoding and has a modular construction. SuperMatrix follows the general scheme of distributional methods. Many different matrix transformations and similarity computation methods were implemented in the system. As a result, the majority of existing measures of semantic relatedness were re-implemented in it. The system also supports evaluation of the extracted measures by tests originating from the idea of the WordNet-Based Synonymy Test. In the case of Polish, SuperMatrix includes an implementation of a language of lexico-syntactic constraints, which provides a means of shallow syntactic processing. SuperMatrix also processes multiword expressions, both as lexical units being described and as elements of the description. Processing can be distributed, as a number of matrix operations were implemented. The system handles huge matrices.
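The general distributional scheme mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the SuperMatrix implementation: it assumes a plain token window as context and cosine similarity as the measure, and the function names `cooccurrence_matrix` and `cosine` as well as the toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_matrix(sentences, window=2):
    """For each word, count the words seen within `window` tokens of it."""
    rows = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    rows[word][tokens[j]] += 1
    return rows


def cosine(u, v):
    """Cosine similarity of two sparse count vectors stored as Counters."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy tokenised corpus (assumed data, for illustration only).
corpus = [
    ["kot", "pije", "mleko"],
    ["pies", "pije", "wodę"],
    ["kot", "goni", "mysz"],
]
matrix = cooccurrence_matrix(corpus)
sim_kot_pies = cosine(matrix["kot"], matrix["pies"])
```

Words that share contexts (here "kot" and "pies" both occur near "pije") receive a non-zero similarity. A real system would typically transform the raw counts before comparing rows.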


intelligent information systems | 2006

Reductionistic, Tree and Rule Based Tagger for Polish

Maciej Piasecki; Grzegorz Godlewski

The paper presents an approach to tagging of Polish based on the combination of handmade reduction rules and selecting rules acquired by Induction of Decision Trees. The general open architecture of the tagger is presented, where the overall process of tagging is divided into subsequent steps and the overall problem is reduced to subproblems of ambiguity classes. A special language of constraints and the use of constraints as elements of decision trees are described. The results of the experiments performed with the tagger are also presented.


international multiconference on computer science and information technology | 2008

Towards Word Sense Disambiguation of Polish

Dominik Bas; Bartosz Broda; Maciej Piasecki

We compare three different methods of word sense disambiguation applied to a selected set of 13 Polish words. The selected words pose different problems for sense disambiguation. As it is hard to find works for Polish in this area, our goal was to analyse the applicability and limitations of known methods in relation to Polish and to Polish language resources and tools. The obtained results are very positive: using limited resources, we achieved a sense disambiguation accuracy greatly exceeding the baseline of the most frequent sense. For the needs of the experiments, a small corpus of representative examples was manually collected and annotated with senses drawn from plWordNet. Different representations of the context of word occurrences were also experimentally tested. Examples of limitations and advantages of the applied methods are discussed.
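The most-frequent-sense baseline mentioned in the abstract can be sketched as below. This is a generic illustration under stated assumptions: the function name `most_frequent_sense_baseline`, the example words, and the sense labels are hypothetical and are not plWordNet identifiers or data from the paper.

```python
from collections import Counter


def most_frequent_sense_baseline(train, test):
    """Accuracy of always predicting each word's most frequent training sense."""
    counts = {}
    for word, sense in train:
        counts.setdefault(word, Counter())[sense] += 1
    # For every word, remember the sense seen most often in the training data.
    most_frequent = {word: c.most_common(1)[0][0] for word, c in counts.items()}
    correct = sum(1 for word, sense in test if most_frequent.get(word) == sense)
    return correct / len(test) if test else 0.0


# Hypothetical annotated occurrences: (word, sense label).
train = [
    ("zamek", "castle"), ("zamek", "castle"), ("zamek", "lock"),
    ("język", "tongue"), ("język", "language"), ("język", "language"),
]
test = [
    ("zamek", "castle"), ("zamek", "lock"),
    ("język", "language"), ("język", "language"),
]
baseline_accuracy = most_frequent_sense_baseline(train, test)
```

A disambiguation method is only informative if its accuracy on the same test occurrences exceeds this number, since the baseline ignores context entirely.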


intelligent information systems | 2005

A Rule-Based Tagger for Polish Based on Genetic Algorithm

Maciej Piasecki; Bartlomiej Gawel

In the paper, an approach to the construction of a rule-based morphosyntactic tagger for Polish is proposed. The core of the tagger consists of modules of rules (classification systems) acquired from the IPI PAN corpus by application of Genetic Algorithms. Each module is specialised in making decisions concerning different parts of a tag (a structure of attributes). The acquired rules are combined with hand-made linguistic rules and with memory-based rules also acquired from the corpus. The construction of the tagger and experiments concerning its properties are also presented in the paper.


international multiconference on computer science and information technology | 2008

Automatic acquisition of wordnet relations by the morpho-syntactic patterns extracted from the corpora in Polish

Roman Kurc; Maciej Piasecki

In the paper we present an adaptation of the Espresso algorithm for the extraction of lexical semantic relations to the specific requirements of Polish. Some of the introduced changes are of a more technical character, such as the adaptation to the existing Polish language tools, but we also investigate a structure of patterns that takes into account specific features of Polish as an inflectional language. A new method of computing the reliability measure is proposed. The modified version of the algorithm, called Estratto, was compared with a more direct reimplementation of Espresso on several corpora of Polish. We tested the influence of different algorithm parameters and different corpora on the obtained results.


intelligent data engineering and automated learning | 2007

Correction of medical handwriting OCR based on semantic similarity

Bartosz Broda; Maciej Piasecki

In the paper, a method for correcting handwriting Optical Character Recognition (OCR) based on semantic similarity is presented. Different versions of the extraction of semantic similarity measures from a corpus are analysed, with the best results achieved for the combination of the text window context and the Rank Weight Function. An algorithm for selecting a word sequence with high internal similarity is proposed. The method was trained and applied to a corpus of real medical documents written in Polish.


international conference natural language processing | 2008

Classification-Based Filtering of Semantic Relatedness in Hypernymy Extraction

Maciej Piasecki; Stanisław Szpakowicz; Michał Marcińczuk; Bartosz Broda

Manual construction of a wordnet can be facilitated by a system that suggests semantic relations acquired from corpora. Such systems tend to produce many wrong suggestions. We propose a method of filtering a raw list of noun pairs potentially linked by hypernymy, and test it on Polish. The method aims for good recall and sufficient precision. The classifiers work with complex features that give clues on the relation between the nouns. We apply a corpus-based measure of semantic relatedness enhanced with a Rank Weight Function. The evaluation is based on the data in Polish WordNet. The results compare favourably with similar methods applied to English, despite the small size of Polish WordNet.

Collaboration


Top Co-Authors

Bartosz Broda

Wrocław University of Technology

Grzegorz Godlewski

Wrocław University of Technology

Michał Marcińczuk

Wrocław University of Technology

Adam Radziszewski

Wrocław University of Technology

Agnieszka Indyka-Piasecka

Wrocław University of Technology

Bartlomiej Gawel

Wrocław University of Technology

Dominik Bas

Wrocław University of Technology
