Publication


Featured research published by Pasi Tapanainen.


conference on applied natural language processing | 1997

A non-projective dependency parser

Pasi Tapanainen; Timo Järvinen

We describe a practical parser for unrestricted dependencies. The parser creates links between words and names the links according to their syntactic functions. We first describe the older Constraint Grammar parser where many of the ideas come from. Then we proceed to describe the central ideas of our new parser. Finally, the parser is evaluated.


conference on applied natural language processing | 2000

Unsupervised Discovery of Scenario-Level Patterns for Information Extraction

Roman Yangarber; Ralph Grishman; Pasi Tapanainen

Information Extraction (IE) systems are commonly based on pattern matching. Adapting an IE system to a new scenario entails the construction of a new pattern base, a time-consuming and expensive process. We have implemented a system for finding patterns automatically from un-annotated text. Starting with a small initial set of seed patterns proposed by the user, the system applies an incremental discovery procedure to identify new patterns. We present experiments with evaluations which show that the resulting patterns exhibit high precision and recall.
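The incremental discovery procedure can be illustrated with a small bootstrapping loop. This is only a sketch under stated assumptions, not the authors' algorithm: documents are represented as sets of pattern strings, a document counts as relevant if it matches any accepted pattern, and the scoring formula (share of relevant documents times a log of the hit count) is a hypothetical choice made for this example. The function name `discover_patterns` and the pattern names are likewise invented.

```python
import math
from typing import List, Set


def discover_patterns(docs: List[Set[str]], seeds: Set[str], rounds: int = 3) -> List[str]:
    """Grow a pattern set from seeds, one new pattern per round.

    docs   -- each document as the set of candidate patterns found in it
    seeds  -- initial patterns proposed by the user
    rounds -- maximum number of patterns to add
    """
    accepted = set(seeds)
    discovered: List[str] = []
    for _ in range(rounds):
        best, best_score = None, 0.0
        # Sorted so that ties are broken deterministically.
        candidates = sorted(set().union(*docs) - accepted)
        for pattern in candidates:
            matching = [d for d in docs if pattern in d]
            # A document is treated as relevant if it matches an accepted pattern.
            hits = sum(1 for d in matching if d & accepted)
            # Hypothetical score: precision-like ratio weighted by log support.
            score = (hits / len(matching)) * math.log(hits + 1)
            if score > best_score:
                best, best_score = pattern, score
        if best is None:  # no candidate co-occurs with the accepted patterns
            break
        accepted.add(best)
        discovered.append(best)
    return discovered


docs = [
    {"company-appoint-person", "person-resign"},
    {"company-appoint-person", "person-succeed-person"},
    {"person-resign", "person-succeed-person"},
    {"team-win-game"},
    {"team-win-game", "team-lose-game"},
]
found = discover_patterns(docs, {"company-appoint-person"})
print(found)
```

In this toy corpus the two management-succession patterns are picked up from the single seed, while the sports patterns never co-occur with an accepted pattern and are left out.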


conference on applied natural language processing | 1994

Tagging accurately - Don't guess if you know

Pasi Tapanainen; Atro Voutilainen

We discuss combining knowledge-based (or rule-based) and statistical part-of-speech taggers. We use two mature taggers, ENGCG and Xerox Tagger, to independently tag the same text and combine the results to produce a fully disambiguated text. In a 27,000-word test sample taken from a previously unseen corpus, we achieve 98.5% accuracy. This paper presents the data in detail. We describe the problems we encountered in the course of combining the two taggers and discuss the problem of evaluating taggers.
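One way to read the title is: trust the rule-based tagger wherever it has reached a single reading, and consult the statistical tagger only for the words it left ambiguous. The sketch below illustrates that idea. The combination rule, the fallback for disagreements, and the function name `combine_tags` are assumptions made for this example; the abstract does not spell out the actual procedure.

```python
from typing import List, Set


def combine_tags(rule_based: List[Set[str]], statistical: List[str]) -> List[str]:
    """Produce one tag per word from two tagger outputs.

    rule_based  -- per word, the set of tags the rule-based tagger left standing
    statistical -- per word, the single tag chosen by the statistical tagger
    """
    if len(rule_based) != len(statistical):
        raise ValueError("both taggers must cover the same words")
    combined: List[str] = []
    for candidates, guess in zip(rule_based, statistical):
        if len(candidates) == 1:
            # The rule-based tagger "knows": do not guess.
            combined.append(next(iter(candidates)))
        elif guess in candidates or not candidates:
            # Ambiguity remains (or nothing survived): use the statistical choice.
            combined.append(guess)
        else:
            # Assumed fallback: the guess was ruled out, so pick a surviving tag
            # deterministically (alphabetically first).
            combined.append(sorted(candidates)[0])
    return combined


result = combine_tags(
    [{"DET"}, {"N", "V"}, {"V", "ADJ"}],
    ["PRON", "N", "N"],
)
print(result)  # ['DET', 'N', 'ADJ']
```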


international conference on computational linguistics | 1992

Compiling and using finite-state syntactic rules

Kimmo Koskenniemi; Pasi Tapanainen; Atro Voutilainen

A language-independent framework for syntactic finite-state parsing is discussed. The article presents a framework, a formalism, a compiler and a parser for grammars written in this formalism. As a substantial example, fragments from a nontrivial finite-state grammar of English are discussed. The linguistic framework of the present approach is based on a surface syntactic tagging scheme by F. Karlsson. This representation is slightly less powerful than phrase structure tree notation, letting some ambiguous constructions be described more concisely. The finite-state rule compiler implements what was briefly sketched by Koskenniemi (1990). It is based on the calculus of finite-state machines. The compiler transforms rules into rule-automata. The run-time parser exploits one of certain alternative strategies in performing the effective intersection of the rule automata and the sentence automaton. Fragments of a fairly comprehensive finite-state grammar of English are presented here, including samples from non-finite constructions as a demonstration of the capacity of the present formalism, which goes far beyond plain disambiguation or part-of-speech tagging. The grammar itself is directly related to a parser and tagging system for English created as a part of project SIMPR using Karlsson's CG (Constraint Grammar) formalism.


conference of the european chapter of the association for computational linguistics | 1993

Ambiguity resolution in a reductionistic parser

Atro Voutilainen; Pasi Tapanainen

We are concerned with dependency-oriented morphosyntactic parsing of running text. While a parsing grammar should avoid introducing structurally unresolvable distinctions in order to optimise on the accuracy of the parser, it also is beneficial for the grammarian to have as expressive a structural representation available as possible. In a reductionistic parsing system this policy may result in considerable ambiguity in the input; however, even massive ambiguity can be tackled efficiently with an accurate parsing description and effective parsing technology.


international colloquium on grammatical inference | 1996

Inducing constraint grammars

Christer Samuelsson; Pasi Tapanainen; Atro Voutilainen

Constraint Grammar rules are induced from corpora. A simple scheme based on local information, i.e., on lexical biases and next-neighbour contexts, extended through the use of barriers, reached 87.3% precision (1.12 tags/word) at 98.2% recall. The results compare favourably with other methods that are used for similar tasks, although they are by no means as good as the results achieved using the original hand-written rules developed over several years' time.
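Because a reductionistic tagger may leave more than one tag on a word, precision, recall and tags/word are reported together. The sketch below computes all three on toy data. The definitions used (recall is the share of words whose correct tag survives, precision is the share of output tags that are correct) are the ones commonly used for ambiguous tagger output and are assumed here, not quoted from the paper; the function name `evaluate` is invented.

```python
from typing import Dict, List, Set


def evaluate(output: List[Set[str]], gold: List[str]) -> Dict[str, float]:
    """Score possibly ambiguous tagger output against one gold tag per word."""
    if not gold or len(output) != len(gold):
        raise ValueError("output and gold must be non-empty and of equal length")
    total_tags = sum(len(tags) for tags in output)
    if total_tags == 0:
        raise ValueError("the tagger output contains no tags")
    # A word counts as correct if the gold tag is among the remaining tags.
    correct = sum(1 for tags, g in zip(output, gold) if g in tags)
    return {
        "recall": correct / len(gold),
        "precision": correct / total_tags,
        "tags_per_word": total_tags / len(gold),
    }


scores = evaluate(
    [{"N"}, {"N", "V"}, {"ADJ"}, {"V"}],
    ["N", "V", "N", "V"],
)
print(scores)  # {'recall': 0.75, 'precision': 0.6, 'tags_per_word': 1.25}
```

Under these assumed definitions, precision equals recall divided by tags/word, which is roughly in line with the figures quoted above (98.2 / 1.12 ≈ 87.7).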


conference on applied natural language processing | 1997

Dependency parser demo

Timo Järvinen; Pasi Tapanainen

1 Introduction

We are concerned with surface-syntactic parsing of running text. Our main goal is to describe a syntactic analysis of sentences using dependency links that show the head-dependent relations between words. The new dependency parser (Tapanainen and Järvinen, 1997; Järvinen and Tapanainen, 1997) belongs to a continuous effort to apply rule-based methods to natural languages. It can be seen as a relative of the Constraint Grammar framework (Karlsson et al., 1995), for many features of the system have been derived from it. The syntactic description in the English Constraint Grammar (ENGCG) is implicitly dependency oriented; it contains tags for heads and modifiers but not explicit links between them (see Figure 2). Although the new syntactic formalism differs much from the Constraint Grammar formalism, the basic rule types of the older formalism have been preserved among the new ones. Also, the rules are independent, and they describe syntax in a piecemeal fashion.

The new dependency parser creates explicit links between the elements of the sentence (in Figure 1) while still retaining the shallower representation similar to ENGCG (in Figure 2). The parser applies the ENGTWOL lexicon designed originally by Juha Heikkilä and Atro Voutilainen. Also, the reliable parts of ENGCG's morphological disambiguator by Atro Voutilainen are applied.

The parser has been tested on a Sun workstation and on PCs under Linux. The syntactic analysis is modest in time and space requirements: the size of the process (the syntactic analysis only) is less than 2 MB, and it runs on a 90 MHz Pentium machine at a speed of 200 words per second. We have tested the parser on bigger texts to test its usability in corpus linguistic and lexicographic work. By now, some 30 million words have been parsed.

2 The dependency model

Our syntactic description can be seen as a formalisation of Tesnière's (1959) original dependency theory. The dependency model adopted in our description differs in various respects from the post-Tesnièrean development of dependency theory, though many of the features are recognised elsewhere. The main features of the parsing system and the adopted dependency theory are:

• The basic syntactic element is not a word, but a nucleus. This is related to the internal organisation of the grammar, though the default output shows the dependency links between surface words.
• Every element has one and only one head (uniqueness).
• The result is a tree.
• Functional dependencies …
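The uniqueness and tree conditions listed above can be checked mechanically on a parser's output. The sketch below is an illustration only: it assumes an analysis is stored as a list of head indices with `None` marking the single root, a representation chosen for this example and not the parser's own format. Storing one head slot per element makes uniqueness hold by construction, so the function only verifies that there is exactly one root and no cycle. Crossing links are deliberately not rejected, since the parser is described as non-projective.

```python
from typing import List, Optional


def is_dependency_tree(heads: List[Optional[int]]) -> bool:
    """Return True if the head list describes a single rooted tree.

    heads[i] is the index of the head of element i, or None for the root.
    """
    n = len(heads)
    roots = [i for i, h in enumerate(heads) if h is None]
    if len(roots) != 1:
        return False
    for start in range(n):
        seen = set()
        node = start
        # Follow head links upward; every path must end at the root.
        while heads[node] is not None:
            if node in seen:
                return False  # cycle
            seen.add(node)
            head = heads[node]
            if not 0 <= head < n:
                return False  # head index points outside the sentence
            node = head
    return True


# "The dog barked": The -> dog -> barked (root)
print(is_dependency_tree([1, 2, None]))     # True
# Crossing links (0 -> 2 and 1 -> 3) are still a valid tree
print(is_dependency_tree([2, 3, None, 2]))  # True
# Elements 0 and 1 head each other: a cycle, so not a tree
print(is_dependency_tree([1, 0, None]))     # False
```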


international conference on computational linguistics | 2000

Automatic acquisition of domain knowledge for Information Extraction

Roman Yangarber; Ralph Grishman; Pasi Tapanainen; Silja Huttunen


Archive | 1997

A dependency parser for English

Timo Järvinen; Pasi Tapanainen


arXiv: Computation and Language | 1998

Towards an implementable dependency grammar

Timo Järvinen; Pasi Tapanainen
