Gülşen Eryiğit
Istanbul Technical University
Publications
Featured research published by Gülşen Eryiğit.
Natural Language Engineering | 2005
Joakim Nivre; Johan Hall; Jens Nilsson; Atanas Chanev; Gülşen Eryiğit; Sandra Kübler; Svetoslav Marinov; Erwin Marsi
Parsing unrestricted text is useful for many language technology applications but requires parsing methods that are both robust and efficient. MaltParser is a language-independent system for data-driven dependency parsing that can be used to induce a parser for a new language from a treebank sample in a simple yet flexible manner. Experimental evaluation confirms that MaltParser can achieve robust, efficient and accurate parsing for a wide range of languages without language-specific enhancements and with rather limited amounts of training data.
Computational Linguistics | 2008
Gülşen Eryiğit; Joakim Nivre; Kemal Oflazer
The suitability of different parsing methods for different languages is an important topic in syntactic parsing. Especially lesser-studied languages, typologically different from the languages for which methods have originally been developed, pose interesting challenges in this respect. This article presents an investigation of data-driven dependency parsing of Turkish, an agglutinative, free constituent order language that can be seen as the representative of a wider class of languages of similar type. Our investigations show that morphological structure plays an essential role in finding syntactic relations in such a language. In particular, we show that employing sublexical units called inflectional groups, rather than word forms, as the basic parsing units improves parsing accuracy. We test our claim on two different parsing methods, one based on a probabilistic model with beam search and the other based on discriminative classifiers and a deterministic parsing strategy, and show that the usefulness of sublexical units holds regardless of the parsing method. We examine the impact of morphological and lexical information in detail and show that, properly used, this kind of information can improve parsing accuracy substantially. Applying the techniques presented in this article, we achieve the highest reported accuracy for parsing the Turkish Treebank.
North American Chapter of the Association for Computational Linguistics | 2016
Maria Pontiki; Dimitris Galanis; Haris Papageorgiou; Ion Androutsopoulos; Suresh Manandhar; Mohammad Al-Smadi; Mahmoud Al-Ayyoub; Yanyan Zhao; Bing Qin; Orphée De Clercq; Veronique Hoste; Marianna Apidianaki; Xavier Tannier; Natalia V. Loukachevitch; Evgeniy Kotelnikov; Núria Bel; Salud María Jiménez-Zafra; Gülşen Eryiğit
This paper describes the SemEval 2016 shared task on Aspect Based Sentiment Analysis (ABSA), a continuation of the respective tasks of 2014 and 2015. In its third year, the task provided 19 training and 20 testing datasets for 8 languages and 7 domains, as well as a common evaluation procedure. Of these datasets, 25 were for sentence-level and 14 for text-level ABSA; the latter was introduced for the first time as a subtask in SemEval. The task attracted 245 submissions from 29 teams.
Lecture Notes in Computer Science | 2005
Aydın Karaman; Şima Uyar; Gülşen Eryiğit
There is a growing interest in applying evolutionary algorithms to dynamic environments. Different types of changes in the environment benefit from different types of mechanisms to handle the change. In this study, the mechanisms used in the literature are categorized into four groups. A new EA approach (MIA) is proposed; it benefits both from the EDA-like approach it employs for re-initializing populations after a change and from using different change-handling mechanisms together. Experiments are conducted using the 0/1 single knapsack problem to compare MIA with other algorithms and to explore its performance. Promising results are obtained which promote further study. Current research is being done to extend MIA to other problem domains.
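As a rough illustration of what an "EDA-like" re-initialization could look like, the sketch below estimates per-bit marginal frequencies from the current population and samples a fresh population from them. The abstract does not give MIA's actual procedure, so the univariate model, the function names and the `floor` parameter are all assumptions made for this example.

```python
import random

def estimate_marginals(pop):
    """Fraction of individuals carrying a 1 at each bit position."""
    n = len(pop)
    return [sum(ind[j] for ind in pop) / n for j in range(len(pop[0]))]

def reinitialize(pop, size, rng, floor=0.05):
    """Sample a new population from the estimated marginals.

    `floor` (an assumed parameter) keeps every probability inside
    [floor, 1 - floor] so that no bit value is lost entirely after a change.
    """
    probs = [min(1 - floor, max(floor, p)) for p in estimate_marginals(pop)]
    return [[1 if rng.random() < p else 0 for p in probs] for _ in range(size)]

if __name__ == "__main__":
    rng = random.Random(0)
    old_pop = [[1, 0, 1, 1], [1, 0, 0, 1], [1, 1, 0, 1]]
    for individual in reinitialize(old_pop, size=4, rng=rng):
        print(individual)
```

Sampling from clamped marginals keeps most of what the old population had learned while reintroducing some diversity, which is one common motivation for this kind of restart in dynamic environments.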
Conference of the European Chapter of the Association for Computational Linguistics | 2014
Gülşen Eryiğit
We present a natural language processing (NLP) platform, namely the “ITU Turkish NLP Web Service”, developed by the natural language processing group of Istanbul Technical University. The platform (available at tools.nlp.itu.edu.tr) operates as SaaS (Software as a Service) and provides researchers and students with state-of-the-art NLP tools at many layers: preprocessing, morphology, syntax and entity recognition. Users may communicate with the platform via three channels: (1) a user-friendly web interface, (2) file uploads and (3) the provided Web APIs, which can be called from their own code to construct higher-level applications.
International Conference on Application of Information and Communication Technologies | 2013
Gokhan Celikkaya; Dilara Torunoğlu; Gülşen Eryiğit
Named Entity Recognition (NER) is a well-studied area in natural language processing (NLP), and the results reported in the literature are generally very high (roughly above 95%) for most languages. Today, the focus of most practical natural language applications (e.g. web mining, sentiment analysis, machine translation) is real natural language data such as Web 2.0 or speech data. Nevertheless, the NER task is rarely investigated on this type of data, which differs severely from formal written text. In this paper, we present three new Turkish data sets from different domains within this focus area (namely Twitter, a Speech-to-Text Interface and a Hardware Forum), annotated specifically for NER, and report our first results on them. We believe the paper sheds light on the difficulty of these new domains for NER and on possible future work.
Genetic and Evolutionary Computation Conference | 2004
Sima Uyar; Sanem Sariel; Gülşen Eryiğit
In this study, a new mechanism that adapts the mutation rate for each locus on the chromosomes, based on feedback obtained from the current population, is proposed. Through tests using the one-max problem, it is shown that the proposed scheme improves convergence rate. Further tests are performed using the 4-Peaks and multiple knapsack test problems to compare the performance of the proposed approach with other similar parameter control approaches. A convergence control scheme that provides acceptable performance is chosen to maintain sufficient diversity in the population and implemented for all tested methods to provide fair comparisons. The effects of using a convergence control mechanism are not within the scope of this paper and will be explored in a future study. As a result of the tests, promising results which promote further experimentation are obtained.
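To make the idea of a per-locus mutation rate driven by population feedback more concrete, here is a minimal sketch on the one-max problem. The update rule in `adapt_rates`, the separate 0->1 and 1->0 rates, and every constant (`step`, `lo`, `hi`, population size) are illustrative assumptions; the abstract does not specify the paper's actual rule, and this is not a reproduction of it.

```python
import random

def one_max(bits):
    """One-max fitness: the number of 1 bits."""
    return sum(bits)

def adapt_rates(pop, fits, rate01, rate10, step=0.005, lo=0.001, hi=0.2):
    """Hypothetical feedback rule (an assumption, not the paper's).

    At each locus, if individuals carrying a 1 are fitter than the population
    average, make 0->1 flips more likely and 1->0 flips less likely;
    if they are less fit, do the opposite. Rates are clamped to [lo, hi].
    """
    mean_fit = sum(fits) / len(fits)
    for j in range(len(rate01)):
        ones = [f for ind, f in zip(pop, fits) if ind[j] == 1]
        if not ones:
            continue  # nobody carries a 1 here, so there is no feedback
        ones_fit = sum(ones) / len(ones)
        if ones_fit == mean_fit:
            continue  # no signal at this locus
        delta = step if ones_fit > mean_fit else -step
        rate01[j] = min(hi, max(lo, rate01[j] + delta))
        rate10[j] = min(hi, max(lo, rate10[j] - delta))

def mutate(ind, rate01, rate10, rng):
    """Flip each bit with the rate that matches its locus and current value."""
    return [1 - b if rng.random() < (rate10[j] if b else rate01[j]) else b
            for j, b in enumerate(ind)]

def run(n_bits=40, pop_size=30, generations=50, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    rate01 = [0.02] * n_bits
    rate10 = [0.02] * n_bits
    best = max(pop, key=one_max)
    for _ in range(generations):
        fits = [one_max(ind) for ind in pop]
        adapt_rates(pop, fits, rate01, rate10)
        children = []
        for _ in range(pop_size):
            # Binary tournament selection for each parent, then uniform crossover.
            p1 = max(rng.sample(pop, 2), key=one_max)
            p2 = max(rng.sample(pop, 2), key=one_max)
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            children.append(mutate(child, rate01, rate10, rng))
        pop = children
        best = max(pop + [best], key=one_max)  # remember the best seen so far
    return best, rate01, rate10

if __name__ == "__main__":
    best, rate01, rate10 = run()
    print("best fitness:", one_max(best))
```

Keeping one rate per locus (here, one per locus and flip direction) lets the search slow down mutation where the population already agrees on a good value while still exploring elsewhere.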
International Conference on the Computer Processing of Oriental Languages | 2006
Gülşen Eryiğit; Joakim Nivre; Kemal Oflazer
Typological diversity among the natural languages of the world poses interesting challenges for the models and algorithms used in syntactic parsing. In this paper, we apply a data-driven dependency parser to Turkish, a language characterized by rich morphology and flexible constituent order, and study the effect of employing varying amounts of morpholexical information on parsing performance. The investigations show that accuracy can be improved by using representations based on inflectional groups rather than word forms, confirming earlier studies. In addition, lexicalization and the use of rich morphological features are found to have a positive effect. By combining all these techniques, we obtain the highest reported accuracy for parsing the Turkish Treebank.
Genetic and Evolutionary Computation Conference | 2005
Sima Uyar; Gülşen Eryiğit
Knapsack problems are among the most common problems in the literature tackled with evolutionary algorithms (EAs). Their major advantage lies in the fact that they are relatively simple to implement while allowing generalizations for a wide range of real-world problems. The multi-dimensional knapsack problem (MKP), which belongs to the class of NP-complete combinatorial optimization problems, is one of the variations of the knapsack problem. The MKP has a wide range of real-world applications such as cargo loading, selecting projects to fund, budget management, cutting stock, etc. The MKP has been studied quite extensively in the EA community. Due to the constrained nature of the problem, constraint handling techniques gain great importance in the performance of the proposed EA approaches. In this study, the applicability of a generational EA that uses a penalty-based constraint handling technique and a gene locus based, asymmetric, adaptive mutation scheme is explored for the MKP. The effects of the parameters of the explored approach are determined through tests. Further experiments, using large MKP instances from commonly used benchmarks available through the Internet, are performed. Comparison tables are given for the performance of the explored approach and other well-performing EAs found in the literature for the MKP. Results show that performance improves greatly when compared with other penalty-based techniques, but the explored approach is still not the best performer among all. However, unlike the explored technique, the EAs using the other constraint handling techniques require a great amount of extra computational effort and need heuristic information specific to the optimization problem. Based on these observations, and the fact that the performance difference between the explored scheme and the better performers is not too high, research on improving the explored approach is still in progress.
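A penalty-based constraint handling technique for the MKP can be sketched as a fitness function that subtracts a cost for every unit of capacity overflow. The linear penalty form, the default penalty coefficient and the toy instance below are assumptions chosen for illustration; the abstract does not state which penalty function the study uses.

```python
def mkp_fitness(x, profits, weights, capacities, penalty=None):
    """Penalized fitness for a multi-dimensional knapsack candidate.

    x          : list of 0/1 decisions, one per item
    profits    : profit of each item
    weights    : weights[i][j] is the amount of resource i consumed by item j
    capacities : capacity of each resource
    penalty    : cost per unit of overflow (assumed linear penalty)
    """
    profit = sum(p for p, bit in zip(profits, x) if bit)
    violation = 0
    for row, cap in zip(weights, capacities):
        load = sum(w for w, bit in zip(row, x) if bit)
        violation += max(0, load - cap)
    if penalty is None:
        # Assumed default: one unit of overflow costs more than any single item.
        penalty = max(profits) + 1
    return profit - penalty * violation

if __name__ == "__main__":
    # Toy instance (made up for this example): 3 items, 2 resources.
    profits = [10, 7, 5]
    weights = [[3, 4, 2],
               [2, 1, 3]]
    capacities = [6, 4]
    print(mkp_fitness([0, 1, 1], profits, weights, capacities))  # feasible
    print(mkp_fitness([1, 0, 1], profits, weights, capacities))  # overflows resource 2
```

Because infeasible candidates are scored rather than repaired or discarded, this style of constraint handling needs no problem-specific heuristic, which matches the trade-off the abstract describes against other techniques.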
Computational Linguistics | 2017
Mathieu Constant; Gülşen Eryiğit; Johanna Monti; Lonneke van der Plas; Carlos Ramisch; Michael Rosner; Amalia Todirascu
Multiword expressions (MWEs) are a class of linguistic forms spanning conventional word boundaries that are both idiosyncratic and pervasive across different languages. The structure of linguistic processing that depends on the clear distinction between words and phrases has to be re-thought to accommodate MWEs. The issue of MWE handling is crucial for NLP applications, where it raises a number of challenges. The emergence of solutions in the absence of guiding principles motivates this survey, whose aim is not only to provide a focused review of MWE processing, but also to clarify the nature of interactions between MWE processing and downstream applications. We propose a conceptual framework within which challenges and research contributions can be positioned. It offers a shared understanding of what is meant by “MWE processing,” distinguishing the subtasks of MWE discovery and identification. It also elucidates the interactions between MWE processing and two use cases: parsing and machine translation. Many of the approaches in the literature can be differentiated according to how MWE processing is timed with respect to underlying use cases. We discuss how such orchestration choices affect the scope of MWE-aware systems. For each of the two MWE processing subtasks and for each of the two use cases, we conclude on open issues and research perspectives.