Stanisław Szpakowicz
University of Ottawa
Publications
Featured research published by Stanisław Szpakowicz.
text speech and dialogue | 2007
Maciej Piasecki; Stanisław Szpakowicz; Bartosz Broda
We present experiments with a variety of corpus-based measures applied to the problem of constructing semantic similarity functions for Polish nouns. Rich inflection in Polish allows us to acquire useful syntactic features without parsing; morphosyntactic restrictions checked in a large enough window provide sufficiently useful data. A novel feature selection method yields an accuracy of 86% on the WordNet-based synonymy test, an improvement of 5% over the previous results.
language resources and evaluation | 2013
Marek Maziarz; Maciej Piasecki; Stanisław Szpakowicz
Wordnets are built of synsets, not of words. A synset consists of words. Synonymy is a relation between words. Words go into a synset because they are synonyms. Later, a wordnet treats words as synonymous because they belong in the same synset.
canadian conference on artificial intelligence | 2009
Bartosz Broda; Maciej Piasecki; Stanisław Szpakowicz
international conference natural language processing | 2008
Maciej Piasecki; Stanisław Szpakowicz; Michał Marcińczuk; Bartosz Broda
…
international multiconference on computer science and information technology | 2008
Bartosz Broda; Maciej Piasecki; Stanisław Szpakowicz
Natural Language Communication with Computers | 1978
Stanisław Szpakowicz
… Such circularity, a well-known problem, poses a practical difficulty in wordnet construction, notably when it comes to maintaining consistency. We propose to make a wordnet a net of words or, to be more precise, lexical units. We discuss our assumptions and present their implementation in a steadily growing Polish wordnet. A small set of constitutive relations allows us to construct synsets automatically out of groups of lexical units with the same connectivity. Our analysis includes a thorough comparative overview of systems of relations in several influential wordnets. The additional synset-forming mechanisms include stylistic registers and verb aspect.
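The idea of forming synsets from lexical units with the same connectivity can be illustrated with a minimal sketch. It assumes that each lexical unit is described by a set of (relation, target) links and that units with identical link sets fall into one group. The data, the relation names, and the function `group_by_connectivity` are hypothetical; this is not the plWordNet implementation.

```python
from collections import defaultdict

# Hypothetical toy data: each lexical unit maps to its set of
# (relation, target) links. Relation names and words are illustrative only.
links = {
    "car": {("hypernym", "vehicle"), ("meronym", "wheel")},
    "automobile": {("hypernym", "vehicle"), ("meronym", "wheel")},
    "bicycle": {("hypernym", "vehicle")},
}


def group_by_connectivity(links):
    """Group lexical units whose link sets are identical."""
    groups = defaultdict(list)
    for unit, relations in links.items():
        # A frozenset is hashable, so it can serve as the grouping key.
        groups[frozenset(relations)].append(unit)
    return [sorted(units) for units in groups.values()]


synsets = group_by_connectivity(links)
```

In this toy example "car" and "automobile" share every link and end up in one group, while "bicycle" forms a group of its own.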
text speech and dialogue | 2010
Roman Kurc; Maciej Piasecki; Stanisław Szpakowicz
Rank weight functions have been shown to increase the accuracy of measures of semantic relatedness for Polish. We present a generalised ranking principle and demonstrate its effect on a range of established measures of semantic relatedness, and on a different language. The results confirm that the generalised transformation method based on ranking brings an improvement over several well-known measures.
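A ranking-based transformation of feature weights might look like the sketch below. It assumes one simple variant: keep the k highest-weighted features of a word and replace each weight with k minus its rank, then compare words by cosine similarity. The abstract does not give the exact function, so the names `rank_transform` and `cosine`, the parameter `k`, and the sample vectors are all hypothetical.

```python
import math


def rank_transform(vector, k):
    """Keep the k highest-weighted features and replace each weight
    with k - rank, where rank 0 is the highest-weighted feature.
    Ties are broken alphabetically by feature name."""
    top = sorted(vector.items(), key=lambda item: (-item[1], item[0]))[:k]
    return {feature: float(k - rank) for rank, (feature, _) in enumerate(top)}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v[feature] for feature, weight in u.items() if feature in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical feature weights for two words; the raw magnitudes differ
# widely, but the order of the features is the same.
a = {"x": 10.0, "y": 0.5, "z": 0.1}
b = {"x": 0.3, "y": 0.2, "z": 0.1}

similarity = cosine(rank_transform(a, 2), rank_transform(b, 2))
```

Because only the order of the features survives the transformation, the two sample vectors become identical after ranking, whatever the raw weights were.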
canadian conference on artificial intelligence | 2009
Maciej Piasecki; Bartosz Broda; Michał Marcińczuk; Stanisław Szpakowicz
Manual construction of a wordnet can be facilitated by a system that suggests semantic relations acquired from corpora. Such systems tend to produce many wrong suggestions. We propose a method of filtering a raw list of noun pairs potentially linked by hypernymy, and test it on Polish. The method aims for good recall and sufficient precision. The classifiers work with complex features that give clues on the relation between the nouns. We apply a corpus-based measure of semantic relatedness enhanced with a Rank Weight Function. The evaluation is based on the data in Polish WordNet. The results compare favourably with similar methods applied to English, despite the small size of Polish WordNet.
international conference on computational linguistics | 1982
Janusz S. Bień; Stanisław Szpakowicz
The construction of a wordnet from scratch requires intelligent software support. An accurate measure of semantic relatedness can be used to extract groups of semantically close words from a corpus. Such groups help a lexicographer make decisions about synset membership and synset placement in the network. We have adapted to Polish the well-known algorithm of Clustering by Committee, and tested it on the largest Polish corpus available. The evaluation used a synonymy test based on Polish WordNet (plWordNet), a resource still under development. The results are consistent with a few benchmarks, but not yet encouraging enough to make a wordnet writer's support tool immediately useful.
Computational Linguistics - Applications | 2013
Roman Kurc; Maciej Piasecki; Stanisław Szpakowicz
The aim of the paper is to give an idea of the methodology and technical solutions used in the design of an experimental syntax-oriented program to process Polish texts; the program is currently being developed by the author. A classification of Polish words is presented. It is based on the notion of syntactic category and it covers in principle most of the inflexional and syntactic features of words. Polish syntax is to be described by means of a formal grammar; the description takes into account some newer results concerning the syntactic function of particular word classes. The formalism used to describe syntax is Colmerauer's metamorphic grammar. The program will be implemented in PROLOG, a powerful programming language in which metamorphic grammars are directly available. The output of the program will be the surface syntactic structure of each sentence. Next, a subset of Polish is specified. The subset consists of sentences to be processed by the program. Finally, some details of the program are given.