John Sylak-Glassman
Johns Hopkins University
Publications
Featured research published by John Sylak-Glassman.
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology | 2016
Ryan Cotterell; Christo Kirov; John Sylak-Glassman; David Yarowsky; Jason Eisner; Mans Hulden
The 2016 SIGMORPHON Shared Task was devoted to the problem of morphological reinflection. It introduced morphological datasets for 10 languages with diverse typological characteristics. The shared task drew submissions from 9 teams representing 11 institutions, reflecting a variety of approaches to supervised learning of reinflection. For the simplest task, inflection generation from lemmas, the best system averaged 95.56% exact-match accuracy across all languages, ranging from Maltese (88.99%) to Hungarian (99.30%). With the relatively large training datasets provided, recurrent neural network architectures consistently performed best; in fact, there was a significant margin between neural and non-neural approaches. The best neural approach, averaged over all tasks and languages, outperformed the best non-neural one by 13.76% absolute; on individual tasks and languages the gap in accuracy sometimes exceeded 60%. Overall, the results show a strong state of the art, and serve as encouragement for future shared tasks that explore morphological analysis and generation with varying degrees of supervision.
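The task setup and metric described in the abstract can be sketched as follows. This is an illustrative sketch only: the data format and examples are hypothetical, not from the actual shared task release, and the feature tags are simplified stand-ins.

```python
def exact_match_accuracy(predictions, gold):
    """Fraction of predicted inflected forms that match the gold forms exactly."""
    assert len(predictions) == len(gold)
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Task 1 input: a lemma plus a morphological feature bundle;
# expected output: the inflected surface form (examples are invented).
examples = [
    (("run", "V;PST"), "ran"),
    (("cat", "N;PL"), "cats"),
]
gold = [form for _, form in examples]
preds = ["ran", "cat"]  # second prediction is wrong
print(exact_match_accuracy(preds, gold))  # 0.5
```

Exact-match accuracy is a strict metric: a form that differs by a single character counts as fully wrong, which is part of why per-language scores (e.g. Maltese vs. Hungarian) can diverge so widely.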
International Joint Conference on Natural Language Processing | 2015
John Sylak-Glassman; Christo Kirov; David Yarowsky; Roger Que
This paper presents a universal morphological feature schema that represents the finest distinctions in meaning that are expressed by overt, affixal inflectional morphology across languages. This schema is used to universalize data extracted from Wiktionary via a robust multidimensional table parsing algorithm and feature mapping algorithms, yielding 883,965 instantiated paradigms in 352 languages. These data are shown to be effective for training morphological analyzers, yielding significant accuracy gains when applied to Durrett and DeNero’s (2013) paradigm learning framework.
Systems and Frameworks for Computational Morphology | 2015
John Sylak-Glassman; Christo Kirov; Matt Post; Roger Que; David Yarowsky
Semantically detailed and typologically-informed morphological analysis that is broadly applicable cross-linguistically has the potential to improve many NLP applications, including machine translation, n-gram language models, information extraction, and co-reference resolution. In this paper, we present a universal morphological feature schema, which is a set of features that represent the finest distinctions in meaning that are expressed by inflectional morphology across languages. We first present the schema’s guiding theoretical principles, construction methodology, and contents. We then present a method of measuring cross-linguistic variability in the semantic distinctions conveyed by inflectional morphology along the multiple dimensions spanned by the schema. This method relies on representing inflected wordforms from many languages in our universal feature space, and then testing for agreement across multiple aligned translations of pivot words in a parallel corpus (the Bible). The results of this method are used to assess the effectiveness of cross-linguistic projection of a multilingual consensus of these fine-grained morphological features, both within and across language families. We find high cross-linguistic agreement for a diverse range of semantic dimensions expressed by inflectional morphology.
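The agreement-testing method described above can be sketched in miniature: represent each aligned translation of a pivot word as a bundle of universal features, then measure how often the translations share a value along one schema dimension. The feature tags and the majority-vote formulation here are illustrative assumptions, not the paper's exact procedure.

```python
from collections import Counter

# One hypothetical schema dimension (grammatical number) and its values.
NUMBER = {"SG", "PL", "DU"}

def dimension_agreement(bundles, dimension):
    """Fraction of aligned wordforms sharing the majority value on a dimension.

    bundles: feature sets, one per aligned translation of the pivot word.
    dimension: the set of possible feature values for one schema dimension.
    """
    # Project each bundle onto the dimension, then take the majority share.
    values = [frozenset(b & dimension) for b in bundles]
    _, count = Counter(values).most_common(1)[0]
    return count / len(values)

# Three aligned translations of the same pivot word, as feature bundles:
aligned = [{"N", "PL"}, {"N", "PL"}, {"N", "SG"}]
print(dimension_agreement(aligned, NUMBER))  # 2/3
```

High agreement along a dimension suggests that the distinction projects reliably across languages; low agreement flags dimensions where languages diverge in what their inflectional morphology encodes.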
arXiv: Computation and Language | 2017
Ryan Cotterell; Christo Kirov; John Sylak-Glassman; Géraldine Walther; Ekaterina Vylomova; Patrick Xia; Manaal Faruqui; Sandra Kübler; David Yarowsky; Jason Eisner; Mans Hulden
The CoNLL-SIGMORPHON 2017 shared task on supervised morphological generation required systems to be trained and tested in each of 52 typologically diverse languages. In sub-task 1, submitted systems were asked to predict a specific inflected form of a given lemma. In sub-task 2, systems were given a lemma and some of its specific inflected forms, and asked to complete the inflectional paradigm by predicting all of the remaining inflected forms. Both sub-tasks included high-, medium-, and low-resource conditions. Sub-task 1 received 24 system submissions, while sub-task 2 received 3 system submissions. Following the success of neural sequence-to-sequence models in the SIGMORPHON 2016 shared task, all but one of the submissions included a neural component. The results show that high performance can be achieved with small training datasets, so long as models have appropriate inductive bias or make use of additional unlabeled or synthetic data. However, different biasing and data augmentation strategies resulted in disjoint sets of inflected forms being predicted correctly, suggesting that there is room for future improvement.
Language Resources and Evaluation | 2016
Christo Kirov; John Sylak-Glassman; Roger Que; David Yarowsky
Conference of the European Chapter of the Association for Computational Linguistics | 2017
Ryan Cotterell; John Sylak-Glassman; Christo Kirov
Proceedings of the Annual Meetings on Phonology | 2014
John Sylak-Glassman
Language Resources and Evaluation | 2018
Christo Kirov; Ryan Cotterell; John Sylak-Glassman; Géraldine Walther; Ekaterina Vylomova; Patrick Xia; Manaal Faruqui; Sebastian J. Mielke; Arya McCarthy; Sandra Kübler; David Yarowsky; Jason Eisner; Mans Hulden
Conference of the European Chapter of the Association for Computational Linguistics | 2017
Christo Kirov; John Sylak-Glassman; Rebecca Knowles; Ryan Cotterell; Matt Post
Archive | 2014
John Sylak-Glassman