Anssi Yli-Jyrä
University of Helsinki
Publication
Featured research published by Anssi Yli-Jyrä.
international conference natural language processing | 2006
Anssi Yli-Jyrä; Kimmo Koskenniemi
New methods to compile morphophonological two-level rules into finite-state machines are presented. Compilation of the original and new two-level rules and grammars is formulated using an operation called the generalized restriction, which constructs a one-tape finite-state automaton over an input alphabet of symbol pairs. The generalized restriction is first used to compile the original two-level formalism, where the centers of the rules (i.e. their left-hand sides) were restricted to single symbol pairs. The solution also handles strings of symbol pairs (or regular expressions over the pair alphabet) as centers of two-level rules. Then, the treatment of context conditions is generalized with operations such as union and relative complement. Moreover, an extended rule type, the presence requirement, combines the generalized context conditions with center conditions on both sides of the rules. The left-hand side specifies where the rule applies, and the right-hand side specifies which of the applications are successful. The original two-level grammars were represented as a separate finite-state machine for each rule, with the whole grammar as their intersection. The new methods are used first to redefine this setup and then to implement a uniform conflict resolution scheme for all rules. The resolution scheme prefers successful applications and the longest embedded applications of rules, but it treats partially overlapping or explicitly independent applications of rules conjunctively. The composite rules of the original formalism have a marginal status in the new formalism because only identity pairs are allowed in locations where no rule is applicable.
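The original setup mentioned in the abstract, one automaton per rule with the grammar as their intersection, can be illustrated with a minimal product-construction sketch. This is not the generalized restriction itself. The pair alphabet, the two toy rules, and the names `make_dfa`, `intersect`, and `accepts` are all hypothetical and chosen only for illustration.

```python
# Toy pair alphabet (hypothetical): each symbol is a (lexical, surface) pair.
NM, NN, PP, AA = ("N", "m"), ("N", "n"), ("p", "p"), ("a", "a")

def make_dfa(start, finals, trans):
    """A partial DFA; a missing transition means the input is rejected."""
    return {"start": start, "finals": set(finals), "trans": dict(trans)}

# Hypothetical rule 1: N:m may occur only immediately before p:p.
RULE1 = make_dfa(0, {0}, {
    (0, NM): 1, (0, NN): 0, (0, PP): 0, (0, AA): 0,
    (1, PP): 0,
})

# Hypothetical rule 2: N:n may not occur immediately before p:p.
RULE2 = make_dfa(0, {0, 1}, {
    (0, NM): 0, (0, NN): 1, (0, PP): 0, (0, AA): 0,
    (1, NM): 0, (1, NN): 1, (1, AA): 0,
})

def intersect(a, b):
    """Product construction restricted to the reachable state pairs."""
    start = (a["start"], b["start"])
    symbols = {s for (_, s) in a["trans"]} & {s for (_, s) in b["trans"]}
    trans, finals, todo, seen = {}, set(), [start], {start}
    while todo:
        state = todo.pop()
        p, q = state
        if p in a["finals"] and q in b["finals"]:
            finals.add(state)
        for s in symbols:
            # A pair is allowed only if both rule automata allow it.
            if (p, s) in a["trans"] and (q, s) in b["trans"]:
                nxt = (a["trans"][(p, s)], b["trans"][(q, s)])
                trans[(state, s)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return {"start": start, "finals": finals, "trans": trans}

def accepts(dfa, word):
    """Run the partial DFA over a sequence of symbol pairs."""
    state = dfa["start"]
    for s in word:
        if (state, s) not in dfa["trans"]:
            return False
        state = dfa["trans"][(state, s)]
    return state in dfa["finals"]

GRAMMAR = intersect(RULE1, RULE2)
```

Under these assumptions, `accepts(GRAMMAR, [AA, NM, PP])` holds because both toy rules permit the pair string, while `[NN, PP]` is rejected by the second rule and `[NM, AA]` by the first.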
international conference on implementation and application of automata | 2007
Anssi Yli-Jyrä; Kimmo Koskenniemi
Kempe and Karttunen [1] have presented a method that compiles a set of parallel conditional replacement (rewriting) rules into a finite-state transducer. Other, simpler methods exist for single rules or for rules of a restricted type, but they can be used only in restricted situations.
meeting of the association for computational linguistics | 2017
Anssi Yli-Jyrä; Carlos Gómez-Rodríguez
We present a simple encoding for unlabeled noncrossing graphs and show how its latent counterpart helps us to represent several families of directed and undirected graphs used in syntactic and semantic parsing of natural language as context-free languages. The families are separated purely on the basis of forbidden patterns in the latent encoding, eliminating the need to differentiate the families of noncrossing graphs in inference algorithms: one algorithm works for all when the search space can be controlled in parser input.
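To make the idea of a bracket encoding for noncrossing graphs concrete, here is a minimal sketch. It is an illustrative encoding of undirected graphs, not necessarily the encoding defined in the paper: the `*` vertex marker and the function names `is_noncrossing`, `encode`, and `decode` are assumptions made for this example.

```python
def is_noncrossing(edges):
    """True if no two edges cross when the vertices are placed on a line."""
    norm = [tuple(sorted(e)) for e in edges]
    for a, b in norm:
        for c, d in norm:
            if a < c < b < d:  # the two edges interleave, so they cross
                return False
    return True

def encode(n, edges):
    """Encode a noncrossing graph on vertices 1..n as a bracket string.

    Each vertex contributes: one ']' per edge ending at it, the marker '*',
    then one '[' per edge starting at it.
    """
    norm = {tuple(sorted(e)) for e in edges}
    if not is_noncrossing(norm):
        raise ValueError("graph has crossing edges")
    parts = []
    for v in range(1, n + 1):
        closes = sum(1 for (_, b) in norm if b == v)
        opens = sum(1 for (a, _) in norm if a == v)
        parts.append("]" * closes + "*" + "[" * opens)
    return "".join(parts)

def decode(code):
    """Recover the edge set by matching brackets with a stack."""
    edges, stack, v = set(), [], 0
    for ch in code:
        if ch == "*":
            v += 1                        # marker of vertex v
        elif ch == "[":
            stack.append(v)               # edge opens at the current vertex
        elif ch == "]":
            edges.add((stack.pop(), v + 1))  # closes precede the next marker
    return edges
```

Because the edges do not cross, the brackets are properly nested, so stack-based matching recovers the original edge set; this nesting is what makes such encodings describable by context-free grammars.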
language and automata theory and applications | 2017
Mark-Jan Nederhof; Anssi Yli-Jyrä
The notion of latent-variable probabilistic context-free derivation of syntactic structures is enhanced to allow heads and unrestricted discontinuities. The chosen formalization covers both constituent parsing and dependency parsing. The derivational model is accompanied by an equivalent probabilistic automaton model. The new framework yields a probability distribution over the space of all discontinuous parses. This lends itself to intrinsic evaluation in terms of perplexity, as shown in experiments.
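As a reminder of how perplexity is computed from model probabilities, here is a minimal sketch. It assumes one probability per held-out parse and normalizes per parse; the paper may normalize differently (for example per token), and the function name `perplexity` is hypothetical.

```python
import math

def perplexity(parse_probs):
    """Perplexity 2 ** (-(1/N) * sum(log2 p)) over N held-out items.

    Assumption: each entry is the probability the model assigns to one
    held-out parse; normalization is per parse, not per token.
    """
    if not parse_probs:
        raise ValueError("need at least one probability")
    if any(p <= 0.0 for p in parse_probs):
        return math.inf  # a zero-probability item gives infinite perplexity
    avg_log2 = sum(math.log2(p) for p in parse_probs) / len(parse_probs)
    return 2.0 ** (-avg_log2)
```

Lower values indicate that the model assigns higher probability to the held-out parses.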
Shall We Play the Festschrift Game? | 2012
Anssi Yli-Jyrä
Arc contractions in syntactic dependency graphs can be used to decide which graphs are trees. The paper observes that these contractions can be expressed with weighted finite-state transducers (weighted FSTs) that operate on string-encoded trees. The observation gives rise to a finite-state parsing algorithm that computes the parse forest and extracts the best parses from it. The algorithm is customizable to functional and bilexical dependency parsing, and it can be extended to non-projective parsing via a multi-planar encoding with prior results on high recall. Our experiments support an analysis of projective parsing according to which the worst-case time complexity of the algorithm is quadratic in the sentence length and linear in the overlapping arcs and in the number of functional categories of the arcs. The results suggest several interesting directions towards efficient and high-precision dependency parsing that takes advantage of the flexibility and the demonstrated ambiguity-packing capacity of such a parser.
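The contraction idea can be sketched in plain Python on an explicit arc set. This is only an illustration of deciding tree-ness by repeatedly contracting leaf arcs; it is not the weighted-FST formulation over string-encoded trees described in the abstract. The artificial root 0 and the function name `is_dependency_tree` are assumptions.

```python
def is_dependency_tree(n, arcs):
    """Decide whether arcs form a tree over words 1..n rooted at node 0.

    arcs is a collection of (head, dependent) pairs. Leaf arcs are
    contracted one at a time; the graph is a tree exactly when every word
    has one head and all words can be contracted away.
    """
    heads = {}
    for h, d in arcs:
        if not (0 <= h <= n and 1 <= d <= n):
            return False              # node out of range, or arc into root
        if d in heads:
            return False              # a word with two heads
        heads[d] = h
    if len(heads) != n:
        return False                  # some word has no head

    dependents = [0] * (n + 1)        # number of uncontracted dependents
    for h in heads.values():
        dependents[h] += 1

    remaining = set(range(1, n + 1))
    leaves = [d for d in remaining if dependents[d] == 0]
    while leaves:
        d = leaves.pop()
        h = heads[d]
        remaining.discard(d)          # contract the arc (h, d) into h
        dependents[h] -= 1
        if h != 0 and dependents[h] == 0:
            leaves.append(h)          # the head has become a leaf itself
    return not remaining              # words on a cycle never become leaves
```

Words that lie on a cycle always keep at least one dependent, so they are never contracted and the function returns False for such graphs.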
finite-state methods and natural language processing | 2005
Anssi Yli-Jyrä; Jyrki Niemi
We propose pivotal synchronization languages (PSLs) that represent alignments of parallel processes. PSLs are closely related to synchronization languages [10], but the strings in PSLs are partitioned into sequences of pivots. In the partitioned representation, each pivot gathers and aligns simultaneous process boundaries (starts and terminations). The paper demonstrates that PSLs (and new join operators) provide a unified framework for implementing some independent formalisms. In particular, we show that at least two existing formalisms, generalized synchronization expressions [10] and interleave-disjunction-lock expressions [8], have PSL-based counterparts. Furthermore, we tentatively sketch a new formalism that adapts the ideas of the operator of generalized restriction [11] to PSLs. All this suggests that the union of these formalisms might be implementable.
Archive | 2005
Antti Arppe; Lauri Carlson; Krister Lindén; Jussi Piitulainen; Mickael Suominen; Martti Vainio; Hanna Westerlund; Anssi Yli-Jyrä
finite-state methods and natural language processing | 2013
Anssi Yli-Jyrä
Archive | 2005
Anssi Yli-Jyrä
Archive | 2004
Anssi Yli-Jyrä