Jens Nilsson
Uppsala University
Publication
Featured research published by Jens Nilsson.
Natural Language Engineering | 2005
Joakim Nivre; Johan Hall; Jens Nilsson; Atanas Chanev; Gülşen Eryiğit; Sandra Kübler; Svetoslav Marinov; Erwin Marsi
Parsing unrestricted text is useful for many language technology applications but requires parsing methods that are both robust and efficient. MaltParser is a language-independent system for data-driven dependency parsing that can be used to induce a parser for a new language from a treebank sample in a simple yet flexible manner. Experimental evaluation confirms that MaltParser can achieve robust, efficient and accurate parsing for a wide range of languages without language-specific enhancements and with rather limited amounts of training data.
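A minimal sketch may help make the underlying idea concrete: a deterministic, classifier-guided transition system builds the dependency tree in a single left-to-right pass. The code below is a simplified shift-reduce variant in the spirit of the algorithms MaltParser supports; the `guide` callback stands in for the treebank-induced classifier, and none of the names correspond to MaltParser's actual API.

```python
# Sketch of deterministic, classifier-guided dependency parsing (simplified
# arc-standard transitions). Purely illustrative; not MaltParser's API.

def parse(words, guide):
    """Parse a sentence into a dict {dependent_index: head_index}."""
    stack = [0]                              # 0 is an artificial ROOT token
    buffer = list(range(1, len(words) + 1))  # token indices, left to right
    heads = {}

    while buffer or len(stack) > 1:
        action = guide(stack, buffer, heads, words)
        if action == "SHIFT" and buffer:
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC" and len(stack) >= 2:
            dep = stack.pop(-2)              # second-from-top becomes dependent of the top
            heads[dep] = stack[-1]
        elif action == "RIGHT-ARC" and len(stack) >= 2:
            dep = stack.pop()                # top becomes dependent of the element below it
            heads[dep] = stack[-1]
        else:                                # illegal action: attach leftovers to ROOT and stop
            for i in stack[1:] + buffer:
                heads.setdefault(i, 0)
            break
    return heads

# Toy guide: reduce whenever two items are on the stack (a real system would
# consult a treebank-trained classifier here).
toy_guide = lambda stack, buf, heads, words: "RIGHT-ARC" if len(stack) >= 2 else "SHIFT"
print(parse(["Economic", "news", "had", "little", "effect"], toy_guide))
```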
meeting of the association for computational linguistics | 2005
Joakim Nivre; Jens Nilsson
In order to realize the full potential of dependency-based syntactic parsing, it is desirable to allow non-projective dependency structures. We show how a data-driven deterministic dependency parser, in itself restricted to projective structures, can be combined with graph transformation techniques to produce non-projective structures. Experiments using data from the Prague Dependency Treebank show that the combined system can handle non-projective constructions with a precision sufficient to yield a significant improvement in overall parsing accuracy. This leads to the best reported performance for robust non-projective parsing of Czech.
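The projectivization step can be sketched as follows: a non-projective arc is "lifted" so that the dependent is reattached to its head's head, repeatedly, until the tree is projective; in the full method the arc labels are augmented so the lifts can be undone on parser output. The sketch below omits that label encoding and uses illustrative names only.

```python
# Sketch of the lifting step used to projectivize a dependency tree before
# training. Label bookkeeping for the inverse transformation is omitted.

def is_projective(dep, head, heads):
    """An arc head->dep is projective if every token between them is
    (transitively) dominated by head."""
    lo, hi = sorted((dep, head))
    for k in range(lo + 1, hi):
        h = k
        while h != 0 and h != head:
            h = heads[h]
        if h != head:
            return False
    return True

def projectivize(heads):
    """Repeatedly lift the shortest non-projective arc to its head's head."""
    heads = dict(heads)
    changed = True
    while changed:
        changed = False
        for dep, head in sorted(heads.items(), key=lambda a: abs(a[0] - a[1])):
            if head != 0 and not is_projective(dep, head, heads):
                heads[dep] = heads[head]   # lift: reattach to the grandparent
                changed = True
                break
    return heads

# Toy tree with one crossing arc (token ids 1..4, 0 = ROOT).
tree = {1: 3, 2: 4, 3: 0, 4: 3}
print(projectivize(tree))   # -> {1: 3, 2: 3, 3: 0, 4: 3}
```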
Conference on Computational Natural Language Learning | 2006
Joakim Nivre; Johan Hall; Jens Nilsson; Gülşen Eryiğit; Svetoslav Marinov
We use SVM classifiers to predict the next action of a deterministic parser that builds labeled projective dependency graphs in an incremental fashion. Non-projective dependencies are captured indirectly by projectivizing the training data for the classifiers and applying an inverse transformation to the output of the parser. We present evaluation results and an error analysis focusing on Swedish and Turkish.
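As a rough illustration of the classifier side, the sketch below trains an SVM on categorical features extracted from parser configurations and queries it for the next transition. scikit-learn stands in for the LIBSVM setup actually used, and the feature names and training instances are invented for the example.

```python
# Sketch: configurations become sparse feature vectors (top-of-stack /
# next-input attributes) and an SVM predicts the next parser transition.

from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

# Each training instance pairs a parser configuration (categorical features)
# with the oracle transition derived from the (projectivized) treebank.
X = [
    {"stack0.pos": "ROOT", "input0.pos": "NOUN", "input0.form": "news"},
    {"stack0.pos": "NOUN", "input0.pos": "VERB", "input0.form": "had"},
    {"stack0.pos": "VERB", "input0.pos": "NOUN", "input0.form": "effect"},
]
y = ["SHIFT", "LEFT-ARC", "RIGHT-ARC"]

model = make_pipeline(DictVectorizer(), SVC(kernel="poly", degree=2))
model.fit(X, y)

# At parsing time the deterministic parser queries the model at every step.
print(model.predict([{"stack0.pos": "NOUN", "input0.pos": "VERB", "input0.form": "ran"}]))
```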
Empirical Methods in Natural Language Processing | 2007
Johan Hall; Jens Nilsson; Joakim Nivre; Gülşen Eryiğit; Beáta Megyesi; Mattias Nilsson; Markus Saers
We describe a two-stage optimization of the MaltParser system for the ten languages in the multilingual track of the CoNLL 2007 shared task on dependency parsing. The first stage consists in tuning a single-parser system for each language by optimizing parameters of the parsing algorithm, the feature model, and the learning algorithm. The second stage consists in building an ensemble system that combines six different parsing strategies, extrapolating from the optimal parameter settings for each language. When evaluated on the official test sets, the ensemble system significantly outperformed the single-parser system and achieved the highest average labeled attachment score of all systems participating in the shared task.
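One common way to combine the output of several dependency parsers is weighted arc voting, sketched below; production ensembles typically add a maximum-spanning-tree step so the result is guaranteed to be a well-formed tree. The code is illustrative and not the system's actual implementation.

```python
# Sketch of arc voting for parser ensembles: each component parser proposes a
# head per token, votes are pooled (optionally weighted), and the combined
# analysis keeps the most-voted head for every token.

from collections import Counter, defaultdict

def combine(parses, weights=None):
    """parses: list of {dependent: head} dicts, one per component parser."""
    weights = weights or [1.0] * len(parses)
    votes = defaultdict(Counter)
    for parse, w in zip(parses, weights):
        for dep, head in parse.items():
            votes[dep][head] += w
    return {dep: heads.most_common(1)[0][0] for dep, heads in votes.items()}

p1 = {1: 2, 2: 0, 3: 2}
p2 = {1: 2, 2: 0, 3: 1}
p3 = {1: 3, 2: 0, 3: 2}
print(combine([p1, p2, p3]))   # -> {1: 2, 2: 0, 3: 2}
```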
Meeting of the Association for Computational Linguistics | 2006
Johan Hall; Joakim Nivre; Jens Nilsson
Deterministic parsing guided by treebank-induced classifiers has emerged as a simple and efficient alternative to more complex models for data-driven parsing. We present a systematic comparison of memory-based learning (MBL) and support vector machines (SVM) for inducing classifiers for deterministic dependency parsing, using data from Chinese, English and Swedish, together with a variety of different feature models. The comparison shows that SVM gives higher accuracy for richly articulated feature models across all languages, albeit with considerably longer training times. The results also confirm that classifier-based deterministic parsing can achieve parsing accuracy very close to the best results reported for more complex parsing models.
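A rough sketch of such a comparison is given below, with k-nearest neighbours standing in for memory-based learning (the core of TiMBL-style MBL) and synthetic data standing in for the treebank-derived training instances used in the actual experiments.

```python
# Sketch: train an MBL-style learner and an SVM on the same feature vectors
# and compare their accuracy at predicting parser transitions.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for configuration/transition data.
X, y = make_classification(n_samples=2000, n_features=40, n_informative=20,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

for name, clf in [("MBL (k-NN)", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM (poly)", SVC(kernel="poly", degree=2))]:
    clf.fit(X_tr, y_tr)
    print(name, round(clf.score(X_te, y_te), 3))
```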
Meeting of the Association for Computational Linguistics | 2006
Jens Nilsson; Joakim Nivre; Johan Hall
Transforming syntactic representations in order to improve parsing accuracy has been exploited successfully in statistical parsing systems using constituency-based representations. In this paper, we show that similar transformations can give substantial improvements also in data-driven dependency parsing. Experiments on the Prague Dependency Treebank show that systematic transformations of coordinate structures and verb groups result in a 10% error reduction for a deterministic data-driven dependency parser. Combining these transformations with previously proposed techniques for recovering non-projective dependencies leads to state-of-the-art accuracy for the given data set.
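As an illustration of the kind of transformation involved, the sketch below rewrites Prague-style coordination (conjuncts attached to the coordinating conjunction) into a form where the first conjunct acts as the head. This is a deliberately simplified variant: labels, nested coordination, and the inverse mapping applied to parser output are all ignored, and the function names are illustrative.

```python
# Simplified sketch of a coordination transformation on a dependency tree.

def transform_coordination(heads, conjunctions):
    """heads: {dependent: head}; conjunctions: token ids of coordinating conjunctions."""
    new_heads = dict(heads)
    for conj in conjunctions:
        conjuncts = sorted(d for d, h in heads.items() if h == conj)
        if not conjuncts:
            continue
        first = conjuncts[0]
        new_heads[first] = heads[conj]   # first conjunct takes over the conjunction's head
        new_heads[conj] = first          # conjunction now depends on the first conjunct
        for other in conjuncts[1:]:
            new_heads[other] = first     # remaining conjuncts attach to the first one
    return new_heads

# "apples and oranges": 1 = apples, 2 = and, 3 = oranges, 0 = ROOT.
prague = {1: 2, 2: 0, 3: 2}
print(transform_coordination(prague, conjunctions={2}))   # -> {1: 0, 2: 1, 3: 1}
```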
International Conference on Parsing Technologies | 2009
Jens Nilsson; Welf Löwe; Johan Hall; Joakim Nivre
Program analysis tools used in software maintenance must be robust and ought to be accurate. Many data-driven parsing approaches developed for natural languages are robust and achieve quite high accuracy when applied to parsing software. We show this for the programming languages Java, C/C++, and Python. Further studies indicate that post-processing can almost completely remove the remaining errors. Finally, the training data for instantiating the generic data-driven parser can be generated automatically for formal languages, as opposed to the manual development of treebanks required for natural languages. Hence, our approach could improve the robustness of software maintenance tools, probably without a significant negative effect on their accuracy.
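The point about automatic training data can be illustrated with a conventional grammar-based parser: its syntax trees can be flattened into head-dependent pairs and used as gold annotations for a data-driven parser. The sketch below uses Python's ast module; the child-to-parent "dependency" scheme is invented purely for illustration.

```python
# Sketch: derive dependency-style gold annotations automatically from the
# syntax trees produced by a grammar-based parser for a formal language.

import ast

source = "total = price * quantity + tax"

def tree_to_dependencies(node, parent=None, deps=None):
    """Flatten an AST into (dependent_node_type, head_node_type) pairs."""
    deps = [] if deps is None else deps
    label = type(node).__name__
    if parent is not None:
        deps.append((label, parent))
    for child in ast.iter_child_nodes(node):
        tree_to_dependencies(child, label, deps)
    return deps

for dep, head in tree_to_dependencies(ast.parse(source)):
    print(f"{dep:10s} -> {head}")
```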
International Conference on Program Comprehension | 2009
Jens Nilsson; Welf Löwe; Johan Hall; Joakim Nivre
We present a novel approach to extracting structural information from source code using state-of-the-art parser technologies for natural languages. The parser technology is robust in the sense that it is guaranteed to produce some output, so that even incomplete or incorrect source code receives some kind of analysis. This comes at the expense of possibly assigning a partially incorrect analysis to error-free input. However, an evaluation on Java, Python and C/C++ source code shows that few errors are committed, i.e., accuracy is close to 100%. The error analysis indicates that the majority of the remaining errors are harmless.
International Conference on Natural Language Processing | 2008
Jens Nilsson; Joakim Nivre
This study presents new language- and treebank-independent graph transformations that improve accuracy in data-driven dependency parsing. We show that individual generic graph transformations can increase accuracy across treebanks, and even more so when they are combined using established parser combination techniques. The combination experiments also indicate that the presumed best strategy, combining the highest-scoring parsers, is not necessarily the best approach.
Empirical Methods in Natural Language Processing | 2007
Joakim Nivre; Johan Hall; Sandra Kübler; Ryan T. McDonald; Jens Nilsson; Sebastian Riedel; Deniz Yuret