Özlem Çetinoğlu
University of Stuttgart
Publication
Featured research published by Özlem Çetinoğlu.
MWE '04 Proceedings of the Workshop on Multiword Expressions: Integrating Processing | 2004
Kemal Oflazer; Özlem Çetinoğlu; Bilge Say
This paper describes a multi-word expression processor for preprocessing Turkish text for various language engineering applications. In addition to the fairly standard set of lexicalized collocations and multi-word expressions such as named-entities, Turkish uses a quite wide range of semi-lexicalized and non-lexicalized collocations. After an overview of relevant aspects of Turkish, we present a description of the multi-word expressions we handle. We then summarize the computational setting in which we employ a series of components for tokenization, morphological analysis, and multi-word expression extraction. We finally present results from runs over a large corpus and a small gold-standard corpus.
meeting of the association for computational linguistics | 2006
Özlem Çetinoğlu; Kemal Oflazer
This paper investigates the use of sublexical units as a solution to handling the complex morphology with productive derivational processes, in the development of a lexical functional grammar for Turkish. Such sublexical units make it possible to expose the internal structure of words with multiple derivations to the grammar rules in a uniform manner. This in turn leads to more succinct and manageable rules. Further, the semantics of the derivations can also be systematically reflected in a compositional way by constructing PRED values on the fly. We illustrate how we use sublexical units for handling simple productive derivational morphology and more interesting cases such as causativization, etc., which change verb valency. Our priority is to handle several linguistic phenomena in order to observe the effects of our approach on both the c-structure and the f-structure representation, and grammar writing, leaving the coverage and evaluation issues aside for the moment.
workshop on computational approaches to code switching | 2016
Özlem Çetinoğlu; Sarah Schulz; Ngoc Thang Vu
This paper addresses challenges of Natural Language Processing (NLP) on non-canonical multilingual data in which two or more languages are mixed. It refers to code-switching which has become more popular in our daily life and therefore obtains an increasing amount of attention from the research community. We report our experience that covers not only core NLP tasks such as normalisation, language identification, language modelling, part-of-speech tagging and dependency parsing but also more downstream ones such as machine translation and automatic speech recognition. We highlight and discuss the key problems for each of the tasks with supporting examples from different language pairs and relevant previous work.
international symposium on computer and information sciences | 2004
Kemal Oflazer; Özlem Çetinoğlu; Orhan Bilgin; Bilge Say
This paper describes a preprocessor for Turkish text that involves various stages of lexical, morphological and multi-word construct processing for various language engineering applications. We present the architecture of the system with special emphasis on how various kinds of collocations and other similar multi-word constructs are handled and present an evaluation from a test corpus.
Archive | 2018
Özlem Çetinoğlu; Kemal Oflazer
In this chapter we present a large scale, deep grammar for Turkish based on the Lexical-Functional Grammar formalism. In dealing with the rich derivational morphology of Turkish, we follow an approach based on morphological units that are larger than a morpheme but smaller than a word, in encoding rules of the grammar in order to capture the linguistic phenomena in a more formal and accurate way. Our work covers phrases that are building blocks of a large scale grammar, and also focuses on linguistically—and implementation-wise—more interesting cases such as long distance dependencies and complex predicates.
linguistic annotation workshop | 2017
Özlem Çetinoğlu
We present a code-switching corpus of Turkish-German that is collected by recording conversations of bilinguals. The recordings are then transcribed in two layers following speech and orthography conventions, and annotated with sentence boundaries and intersentential, intrasentential, and intra-word switch points. The total amount of data is 5 hours of speech which corresponds to 3614 sentences. The corpus aims at serving as a resource for speech or text analysis, as well as a collection for linguistic inquiries.
international workshop/conference on parsing technologies | 2015
Agnieszka Faleńska; Anders Björkelund; Özlem Çetinoğlu; Wolfgang Seeker
Supertagging was recently proposed to provide syntactic features for statistical dependency parsing, contrary to its traditional use as a disambiguation step. We conduct a broad range of controlled experiments to compare this specific application of supertagging with another method for providing syntactic features, namely stacking. We find that in this context supertagging is a form of stacking. We furthermore show that (i) a fast parser and a sequence labeler are equally beneficial in supertagging, (ii) supertagging/stacking improve parsing also in a cross-domain setting, and (iii) there are small gains when combining supertagging and stacking, but only if both methods use different tools. The important consideration is therefore not the method but rather the diversity of the tools involved.
international joint conference on natural language processing | 2011
Jennifer Foster; Özlem Çetinoğlu; Joachim Wagner; Joseph Le Roux; Joakim Nivre; Deirdre Hogan; Josef van Genabith
Archive | 2004
Orhan Bilgin; Özlem Çetinoğlu; Kemal Oflazer
empirical methods in natural language processing | 2013
Anders Björkelund; Özlem Çetinoğlu; Richárd Farkas; Thomas Mueller; Wolfgang Seeker