Tommi Pirinen
University of Helsinki
Publication
Featured research published by Tommi Pirinen.
Systems and Frameworks for Computational Morphology | 2009
Krister Lindén; Miikka Silfverberg; Tommi Pirinen
Morphological analysis of a wide range of languages can be implemented efficiently using finite-state transducer technologies. Over the last 30 years, a number of attempts have been made to create tools for computational morphologies. The two main competing approaches have been parallel vs. cascaded rule application. The parallel rule application was originally introduced by Koskenniemi [1] and implemented in tools like TwolC and LexC. Currently many applications of morphologies could use dictionaries encoding the a priori likelihoods of words and expressions as well as the likelihood of relations to other representations or languages. We have made the choice to create open-source tools and language descriptions in order to let as many as possible participate in the effort. The current article presents some of the main tools that we have created such as HFST-LexC, HFST-TwolC and HFST-Compose-Intersect. We evaluate their efficiency in comparison to some similar tools and libraries. In particular, we evaluate them using several full-fledged morphological descriptions. Our tools compare well with similar open source tools, even if we still have some challenges ahead before we can catch up with the commercial tools. We demonstrate that for various reasons a parallel rule approach still seems to be more efficient than a cascaded rule approach when developing finite-state morphologies.
Systems and Frameworks for Computational Morphology | 2011
Krister Lindén; Erik Axelson; Sam Hardwick; Tommi Pirinen; Miikka Silfverberg
HFST – Helsinki Finite-State Technology (hfst.sf.net) is a framework for compiling and applying linguistic descriptions with finite-state methods. HFST currently connects some of the most important finite-state tools for creating morphologies and spellers into one open-source platform and supports extending and improving the descriptions with weights to accommodate the modeling of statistical information. HFST offers a path from language descriptions to efficient language applications in key environments and operating systems. HFST also provides an opportunity to exchange transducers between different software providers in order to get the best out of each finite-state library.
Systems and Frameworks for Computational Morphology | 2013
Krister Lindén; Erik Axelson; Senka Drobac; Sam Hardwick; Juha Kuokkala; Jyrki Niemi; Tommi Pirinen; Miikka Silfverberg
The paper presents and evaluates various NLP tools that have been created using the open source library HFST – Helsinki Finite-State Technology and outlines the minimal extensions that this has required to a pure finite-state system. In particular, the paper describes an implementation and application of Pmatch presented by Karttunen at SFCM 2011.
Computational Linguistics - Applications | 2013
Krister Lindén; Erik Axelson; Senka Drobac; Sam Hardwick; Miikka Silfverberg; Tommi Pirinen
HFST – Helsinki Finite-State Technology (http://hfst.sf.net/) is a framework for compiling and applying linguistic descriptions with finite-state methods. HFST currently collects some of the most important finite-state tools for creating morphologies and spellcheckers into one open-source platform and supports extending and improving the descriptions with weights to accommodate the modeling of statistical information. HFST offers a path from language descriptions to efficient language applications. In this article, we focus on aspects of HFST that are new to the end user, i.e. new tools, new features in existing tools, or new language applications, in addition to some revised algorithms that increase performance.
International Conference on Computational Linguistics | 2014
Tommi Pirinen; Krister Lindén
The following claims can be made about finite-state methods for spell-checking: (1) finite-state language models provide support for morphologically complex languages that word lists, affix stripping and similar approaches do not provide; (2) weighted finite-state models have expressive power equal to other, state-of-the-art string algorithms used by contemporary spell-checkers; and (3) finite-state models are at least as fast as other string algorithms for lookup and error correction. In this article, we use some contemporary non-finite-state spell-checking methods as a baseline and perform tests in light of the claims, to evaluate state-of-the-art finite-state spell-checking methods. We verify that finite-state spell-checking systems outperform the traditional approaches for English. We also show that the models for morphologically complex languages can be made to perform on par with English systems.
International Multiconference on Computer Science and Information Technology | 2010
Tommi Pirinen; Krister Lindén
There are numerous formats for writing spell-checkers for open-source systems, and there are many descriptions for languages written in these formats. Similarly, for word hyphenation by computer there are TeX rules for many languages. In this paper we demonstrate a method for converting these spell-checking lexicons and hyphenation rule sets into finite-state automata, and present a new finite-state-based system for writer's tools used in current open-source software such as Firefox, OpenOffice.org and enchant via the spell-checking library voikko.
Archive | 2010
Tommi Pirinen; Krister Lindén
Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011) | 2011
Tommi Pirinen
Investigationes Linguisticae | 2010
Tommi Pirinen; Krister Lindén
Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16 | 2013
Sjur N. Moshagen; Tommi Pirinen; Trond Trosterud