Stanisław Szpakowicz

Publication

Featured research published by Stanisław Szpakowicz.

text speech and dialogue | 2007

Automatic selection of heterogeneous syntactic features in semantic similarity of Polish nouns

Maciej Piasecki; Stanisław Szpakowicz; Bartosz Broda

We present experiments with a variety of corpus-based measures applied to the problem of constructing semantic similarity functions for Polish nouns. Rich inflection in Polish allows us to acquire useful syntactic features without parsing; morphosyntactic restrictions checked in a large enough window provide sufficiently useful data. A novel feature selection method achieves an accuracy of 86% on the WordNet-based synonymy test, an improvement of 5% over the previous results.

language resources and evaluation | 2013

The chicken-and-egg problem in wordnet design: synonymy, synsets and constitutive relations

Marek Maziarz; Maciej Piasecki; Stanisław Szpakowicz

Wordnets are built of synsets, not of words. A synset consists of words. Synonymy is a relation between words. Words go into a synset because they are synonyms. Later, a wordnet treats words as synonymous because they belong in the same synset … Such circularity, a well-known problem, poses a practical difficulty in wordnet construction, notably when it comes to maintaining consistency. We propose to make a wordnet a net of words or, to be more precise, lexical units. We discuss our assumptions and present their implementation in a steadily growing Polish wordnet. A small set of constitutive relations allows us to construct synsets automatically out of groups of lexical units with the same connectivity. Our analysis includes a thorough comparative overview of systems of relations in several influential wordnets. The additional synset-forming mechanisms include stylistic registers and verb aspect.

canadian conference on artificial intelligence | 2009

Rank-Based Transformation in Measuring Semantic Relatedness

Bartosz Broda; Maciej Piasecki; Stanisław Szpakowicz

Rank weight functions had been shown to increase the accuracy of measures of semantic relatedness for Polish. We present a generalised ranking principle and demonstrate its effect on a range of established measures of semantic relatedness, and on a different language. The results confirm that the generalised transformation method based on ranking brings an improvement over several well-known measures.

international conference natural language processing | 2008

Classification-Based Filtering of Semantic Relatedness in Hypernymy Extraction

Maciej Piasecki; Stanisław Szpakowicz; Michał Marcińczuk; Bartosz Broda

Manual construction of a wordnet can be facilitated by a system that suggests semantic relations acquired from corpora. Such systems tend to produce many wrong suggestions. We propose a method of filtering a raw list of noun pairs potentially linked by hypernymy, and test it on Polish. The method aims for good recall and sufficient precision. The classifiers work with complex features that give clues on the relation between the nouns. We apply a corpus-based measure of semantic relatedness enhanced with a Rank Weight Function. The evaluation is based on the data in Polish WordNet. The results compare favourably with similar methods applied to English, despite the small size of Polish WordNet.

international multiconference on computer science and information technology | 2008

Sense-based clustering of Polish nouns in the extraction of semantic relatedness

Bartosz Broda; Maciej Piasecki; Stanisław Szpakowicz

The construction of a wordnet from scratch requires intelligent software support. An accurate measure of semantic relatedness can be used to extract groups of semantically close words from a corpus. Such groups help a lexicographer make decisions about synset membership and synset placement in the network. We have adapted to Polish the well-known algorithm of Clustering by Committee, and tested it on the largest Polish corpus available. The evaluation by way of a plWordNet-based synonymy test used Polish WordNet, a resource still under development. The results are consistent with a few benchmarks, but not encouraging enough yet to make a wordnet writer's support tool immediately useful.

Natural Language Communication with Computers | 1978

Syntactic Analysis of Written Polish

Stanisław Szpakowicz

The aim of the paper is to give an idea of the methodology and technical solutions used in the design of an experimental syntax-oriented program to process Polish texts; the program is currently being developed by the author. A classification of Polish words is presented. It is based on the notion of syntactic category and it covers in principle most of the inflexional and syntactic features of words. Polish syntax is to be described by means of a formal grammar; the description takes into account some newer results concerning the syntactic function of particular word classes. The formalism used to describe syntax is Colmerauer's metamorphic grammar. The program will be implemented in PROLOG, the powerful programming language in which metamorphic grammars are directly available. The output of the program will be the surface syntactic structure of each sentence. Next, a subset of Polish is specified. The subset consists of sentences to be processed by the program. Finally, some details of the program are given.

text speech and dialogue | 2010

Automatic acquisition of wordnet relations by distributionally supported morphological patterns extracted from Polish corpora

Roman Kurc; Maciej Piasecki; Stanisław Szpakowicz

canadian conference on artificial intelligence | 2009

The WordNet Weaver: Multi-criteria Voting for Semi-automatic Extension of a Wordnet

Maciej Piasecki; Bartosz Broda; Michał Marcińczuk; Stanisław Szpakowicz

international conference on computational linguistics | 1982

Toward a parsing method for free word order languages

Janusz S. Bień; Stanisław Szpakowicz

Computational Linguistics - Applications | 2013

Automatic Construction of a Dynamic Thesaurus for Proper Names

Roman Kurc; Maciej Piasecki; Stanisław Szpakowicz