Publication


Featured research published by Tommi Pirinen.


systems and frameworks for computational morphology | 2009

HFST Tools for Morphology – An Efficient Open-Source Package for Construction of Morphological Analyzers

Krister Lindén; Miikka Silfverberg; Tommi Pirinen

Morphological analysis of a wide range of languages can be implemented efficiently using finite-state transducer technologies. Over the last 30 years, a number of attempts have been made to create tools for computational morphologies. The two main competing approaches have been parallel vs. cascaded rule application. The parallel rule application was originally introduced by Koskenniemi [1] and implemented in tools like TwolC and LexC. Currently many applications of morphologies could use dictionaries encoding the a priori likelihoods of words and expressions as well as the likelihood of relations to other representations or languages. We have made the choice to create open-source tools and language descriptions in order to let as many as possible participate in the effort. The current article presents some of the main tools that we have created such as HFST-LexC, HFST-TwolC and HFST-Compose-Intersect. We evaluate their efficiency in comparison to some similar tools and libraries. In particular, we evaluate them using several full-fledged morphological descriptions. Our tools compare well with similar open source tools, even if we still have some challenges ahead before we can catch up with the commercial tools. We demonstrate that for various reasons a parallel rule approach still seems to be more efficient than a cascaded rule approach when developing finite-state morphologies.


systems and frameworks for computational morphology | 2011

HFST—Framework for Compiling and Applying Morphologies

Krister Lindén; Erik Axelson; Sam Hardwick; Tommi Pirinen; Miikka Silfverberg

HFST – Helsinki Finite-State Technology (hfst.sf.net) is a framework for compiling and applying linguistic descriptions with finite-state methods. HFST currently connects some of the most important finite-state tools for creating morphologies and spellers into one open-source platform and supports extending and improving the descriptions with weights to accommodate the modeling of statistical information. HFST offers a path from language descriptions to efficient language applications in key environments and operating systems. HFST also provides an opportunity to exchange transducers between different software providers in order to get the best out of each finite-state library.


systems and frameworks for computational morphology | 2013

HFST — A System for Creating NLP Tools

Krister Lindén; Erik Axelson; Senka Drobac; Sam Hardwick; Juha Kuokkala; Jyrki Niemi; Tommi Pirinen; Miikka Silfverberg

The paper presents and evaluates various NLP tools that have been created using the open source library HFST – Helsinki Finite-State Technology and outlines the minimal extensions that this has required to a pure finite-state system. In particular, the paper describes an implementation and application of Pmatch presented by Karttunen at SFCM 2011.


Computational Linguistics - Applications | 2013

Using HFST for Creating Computational Linguistic Applications

Krister Lindén; Erik Axelson; Senka Drobac; Sam Hardwick; Miikka Silfverberg; Tommi Pirinen

HFST – Helsinki Finite-State Technology (http://hfst.sf.net/) is a framework for compiling and applying linguistic descriptions with finite-state methods. HFST currently collects some of the most important finite-state tools for creating morphologies and spellcheckers into one open-source platform and supports extending and improving the descriptions with weights to accommodate the modeling of statistical information. HFST offers a path from language descriptions to efficient language applications. In this article, we focus on aspects of HFST that are new to the end user, i.e. new tools, new features in existing tools, or new language applications, in addition to some revised algorithms that increase performance.


international conference on computational linguistics | 2014

State-of-the-Art in Weighted Finite-State Spell-Checking

Tommi Pirinen; Krister Lindén

The following claims can be made about finite-state methods for spell-checking: (1) Finite-state language models provide support for morphologically complex languages that word lists, affix stripping and similar approaches do not provide; (2) Weighted finite-state models have expressive power equal to other, state-of-the-art string algorithms used by contemporary spell-checkers; and (3) Finite-state models are at least as fast as other string algorithms for lookup and error correction. In this article, we use some contemporary non-finite-state spell-checking methods as a baseline and perform tests in light of the claims, to evaluate state-of-the-art finite-state spell-checking methods. We verify that finite-state spell-checking systems outperform the traditional approaches for English. We also show that the models for morphologically complex languages can be made to perform on par with English systems.


international multiconference on computer science and information technology | 2010

Building and using existing hunspell dictionaries and TeX hyphenators as finite-state automata

Tommi Pirinen; Krister Lindén

There are numerous formats for writing spell-checkers for open-source systems and there are many descriptions for languages written in these formats. Similarly, for word hyphenation by computer there are TeX rules for many languages. In this paper we demonstrate a method for converting these spell-checking lexicons and hyphenation rule sets into finite-state automata, and present a new finite-state based system for writer's tools used in current open-source software such as Firefox, OpenOffice.org and enchant via the spell-checking library voikko.


Archive | 2010

Finite-State Spell-Checking with Weighted Language and Error Models

Tommi Pirinen; Krister Lindén


Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011) | 2011

Modularisation of Finnish Finite-State Language Description – Towards Wide Collaboration in Open Source Development of a Morphological Analyser

Tommi Pirinen


Investigationes Linguisticae | 2010

Creating and Weighting Hunspell Dictionaries as Finite-State Automata

Tommi Pirinen; Krister Lindén


Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013); May 22-24; 2013; Oslo University; Norway. NEALT Proceedings Series 16 | 2013

Building an open-source development infrastructure for language technology projects

Sjur N. Moshagen; Tommi Pirinen; Trond Trosterud

Collaboration


Dive into Tommi Pirinen's collaborations.

Top Co-Authors


Jyrki Niemi

University of Helsinki
