Savaş Yıldırım | Researchain

Publication

Featured research published by Savaş Yıldırım.

Computer Speech & Language | 2009

Learning-based pronoun resolution for Turkish with a comparative evaluation

Yılmaz Kılıçaslan; Edip Serdar Güner; Savaş Yıldırım

The aim of this paper is twofold. On the one hand, it explores several machine learning models for pronoun resolution in Turkish, a language that has not been sufficiently studied with respect to anaphora resolution and has rarely been subjected to machine learning experiments. On the other hand, it evaluates the classification performance of the learning models in order to gain insight into the question of how to match a model to the task at hand. In addition to the expected observation that each model should be tuned to an optimum level of expressive power so as to avoid underfitting and overfitting, the results also suggest that non-linear models properly tuned to avoid overfitting outperform linear ones when applied to the data used in our experiments.

international conference on computational linguistics | 2013

Extraction of part-whole relations from Turkish corpora

Tuğba Yıldız; Savaş Yıldırım; Banu Diri

In this work, we present a model for semi-automatically extracting part-whole relations from raw Turkish text. The model takes a list of manually prepared seeds to induce syntactic patterns and estimates their reliability. It then captures the variations of part-whole candidates from the corpus. To obtain precise meronymic relationships, the candidates are ranked and selected according to their reliability scores. We use and compare several metrics to evaluate the strength of association between a pattern and its matched pairs. We conclude with a discussion of the results and show that the model presented here gives promising results for Turkish text.

Pattern Analysis and Applications | 2016

Acquisition of Turkish meronym based on classification of patterns

Tuğba Yıldız; Banu Diri; Savaş Yıldırım

The identification of semantic relations from raw text is an important problem in Natural Language Processing. This paper presents a semi-automatic, pattern-based method for extracting part–whole relations. We used and adopted lexico-syntactic patterns to uncover meronymy relations in a Turkish corpus. We applied two different approaches to prepare the patterns: one is based on pre-defined patterns taken from the literature, and the other automatically produces patterns by means of a bootstrapping method. While pre-defined patterns are applied directly to the corpus, the other patterns must first be discovered from manually prepared, unambiguous seeds. Word pairs are then extracted by their occurrence in those patterns. In addition, we used statistical selection on global data obtained from the results of all patterns. This data forms a whole-by-part matrix to which several association metrics, such as information gain and T-score, are applied. We examined how these approaches improve system accuracy, especially with respect to the corpus-based approach and the distributional features of words. Finally, we conducted a variety of experiments with a comparative analysis and showed the advantages and disadvantages of the approaches, with promising results.
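
As an illustration of applying an association metric to a whole-by-part matrix, the following is a minimal sketch that computes the common corpus-linguistics T-score, (observed - expected) / sqrt(observed), for each cell. The nested-dictionary layout, the function name `t_scores`, and the toy counts are assumptions made for this example; the abstract does not specify the exact formulation used in the paper.

```python
import math

def t_scores(matrix):
    """Score each (whole, part) cell of a co-occurrence matrix.

    matrix: dict mapping whole -> dict mapping part -> count
    (a hypothetical layout chosen for this sketch).
    """
    total = sum(sum(row.values()) for row in matrix.values())
    whole_totals = {w: sum(row.values()) for w, row in matrix.items()}
    part_totals = {}
    for row in matrix.values():
        for part, count in row.items():
            part_totals[part] = part_totals.get(part, 0) + count

    scores = {}
    for whole, row in matrix.items():
        for part, observed in row.items():
            if observed == 0:
                continue  # the score is undefined for unseen pairs
            # Expected count if whole and part were independent.
            expected = whole_totals[whole] * part_totals[part] / total
            scores[(whole, part)] = (observed - expected) / math.sqrt(observed)
    return scores

# Toy counts, invented for illustration only.
toy = {
    "car": {"wheel": 8, "door": 2},
    "house": {"door": 6, "wheel": 0},
}
ranked = sorted(t_scores(toy).items(), key=lambda kv: kv[1], reverse=True)
```

Pairs that co-occur more often than chance would predict receive positive scores and rise to the top of `ranked`, which is the intuition behind using such metrics for candidate selection.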

international conference natural language processing | 2014

An Integrated Approach to Automatic Synonym Detection in Turkish Corpus

Tuğba Yıldız; Savaş Yıldırım; Banu Diri

In this study, we designed a model to determine synonymy. Our main assumption is that synonym pairs, by definition, show similar semantic and dependency relations. They share the same meronym/holonym and hypernym/hyponym relations. Unlike synonymy, hypernymy and meronymy relations can probably be acquired by applying lexico-syntactic patterns to a large corpus. Such acquisition might be used to ease the detection of synonymy. Likewise, we used particular dependency relations, such as the object or subject of a verb. Machine learning algorithms were applied to all of these acquired features. The first aim is to find out which dependency and semantic features are the most informative and contribute most to the model. The performance of each feature is evaluated individually with cross-validation. The model that combines all features shows promising results and successfully detects the synonymy relation. The main contribution of the study is the integration of both semantic and dependency relations from a distributional perspective. A second contribution is that it is the first major attempt at corpus-driven synonym identification for Turkish.

international symposium on innovations in intelligent systems and applications | 2012

Association rule based acquisition of hyponym and hypernym relation from a Turkish corpus

Tuğba Yıldız; Savaş Yıldırım

In this paper, we propose a method for the automatic acquisition of hypernym/hyponym relations from raw Turkish text. Once the model has extracted prospective hyponyms by using lexico-syntactic patterns, an Apriori algorithm is applied to eliminate faulty hyponyms and increase precision. We show that a model based on a particular lexico-syntactic pattern and association rules for the Turkish language can successfully retrieve many is-a relations with high precision.
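
The idea of pruning pattern-extracted candidates with association-rule statistics can be sketched as follows. This is not the Apriori algorithm itself and not necessarily the authors' procedure: it only applies the support and confidence measures that association rules rely on to single hyponym-to-hypernym rules. The function name `filter_candidates`, the thresholds, and the toy pairs are all assumptions for this example.

```python
from collections import Counter

def filter_candidates(pairs, min_support=2, min_confidence=0.5):
    """Keep (hyponym, hypernym) candidates that pass both thresholds.

    pairs: list of (hyponym, hypernym) tuples, one per pattern match.
    Support is the raw match count of the pair; confidence is the share
    of the hyponym's matches that point to this hypernym.
    """
    pair_counts = Counter(pairs)
    hyponym_counts = Counter(hyponym for hyponym, _ in pairs)
    kept = []
    for (hyponym, hypernym), count in pair_counts.items():
        confidence = count / hyponym_counts[hyponym]
        if count >= min_support and confidence >= min_confidence:
            kept.append((hyponym, hypernym))
    return sorted(kept)

# Toy pattern matches, invented for illustration only.
matches = (
    [("elma", "meyve")] * 3   # "apple" is-a "fruit", seen three times
    + [("elma", "masa")]      # a spurious match, seen once
    + [("kedi", "hayvan")] * 2
)
print(filter_candidates(matches))
```

Raising `min_support` or `min_confidence` trades recall for precision, which matches the abstract's stated goal of eliminating faulty hyponyms.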

international conference on computational linguistics | 2012

Corpus-Driven hyponym acquisition for Turkish language

Savaş Yıldırım; Tuğba Yıldız

In this study, we propose a method for the acquisition of hyponymy relations for the Turkish language. This integrated method relies on both lexico-syntactic patterns and semantic similarity. Once the model has extracted the items using patterns, it applies similarity-based elimination of the incorrect ones in order to increase precision. We show that the algorithm, based on a particular lexico-syntactic pattern for the Turkish language, can retrieve many hyponymy relations, and we also demonstrate that elimination based on semantic similarity gives promising results. We discuss how we measure the similarity between concepts. The objective is to obtain more relevant and more precise results. The experiments show that this approach gives successful results with high precision.

language and technology conference | 2009

Pronoun Resolution in Turkish Using Decision Tree and Rule-Based Learning Algorithms

Savaş Yıldırım; Yılmaz Kılıçaslan; Tuğba Yıldız

This paper reports on the results of pronoun resolution experiments performed by applying a decision tree and a rule-based algorithm to an annotated Turkish text. The text has been compiled in a semi-automatic way, mostly from various popular children's stories. A knowledge-lean learning model has been devised using only the nine most commonly employed features. An evaluation and comparison of the performance achieved with the two algorithms is offered in terms of recall, precision and F-measure.

International Journal of Computational Intelligence Systems | 2018

Learning Turkish Hypernymy Using Word Embeddings

Savaş Yıldırım; Tuğba Yıldız

Recently, Neural Network Language Models have been effectively applied to many types of Natural Language Processing (NLP) tasks. One popular type of task is the discovery of semantic and syntactic regularities that support researchers in building a lexicon. Word embedding representations are notably good at discovering such linguistic regularities. We argue that two supervised learning approaches based on word embeddings can be successfully applied to the hypernym problem, namely, utilizing embedding offsets between word pairs and learning a semantic projection to link the words. The offset-based model classifies offsets as hypernym or not. The semantic projection approach trains a semantic transformation matrix that ideally maps a hyponym to its hypernym. A semantic projection model can learn a projection matrix provided that there is a sufficient number of training word pairs. However, we argue that such models tend to learn an is-a-particular-hypernym relation rather than to generalize the is-a relation. The embeddings are trained by applying both the Continuous Bag-of-Words and the Skip-Gram training models to a huge corpus of Turkish text. The main contribution of the study is the development of a novel and efficient architecture that is well suited to applying word embedding approaches to the Turkish language domain. We report that both the projection and the offset classification models give promising and novel results for the Turkish language.
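
The offset-based idea can be illustrated with a minimal sketch: the feature for a word pair is the difference between the two embeddings, and a classifier decides whether that offset looks like a hypernymy offset. The abstract does not name the classifier, so a nearest-centroid rule is swapped in here purely as a stand-in. The function names and the two-dimensional toy vectors are assumptions; real embeddings would come from a trained CBOW or Skip-Gram model.

```python
def offset(hyponym_vec, hypernym_vec):
    """Embedding offset: hypernym vector minus hyponym vector."""
    return [h - l for l, h in zip(hyponym_vec, hypernym_vec)]

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(positive_pairs, negative_pairs):
    """Summarise each class by the centroid of its offsets."""
    pos = centroid([offset(a, b) for a, b in positive_pairs])
    neg = centroid([offset(a, b) for a, b in negative_pairs])
    return pos, neg

def is_hypernym(pair, model):
    """Label a pair by whichever class centroid its offset is closer to."""
    pos, neg = model
    off = offset(*pair)
    return squared_distance(off, pos) < squared_distance(off, neg)

# Toy 2-d "embeddings", invented for illustration only.
positives = [([0.0, 0.0], [1.0, 0.0]), ([2.0, 1.0], [3.0, 1.0])]
negatives = [([0.0, 0.0], [0.0, 1.0]), ([1.0, 1.0], [1.0, 2.0])]
model = train(positives, negatives)
print(is_hypernym(([5.0, 5.0], [6.0, 5.0]), model))
```

Because only the offset is used, the classifier ignores where the pair sits in the embedding space, which is one way to encourage generalising the is-a relation rather than memorising particular hypernyms.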

international conference on computational linguistics | 2014

A Knowledge-Poor Approach to Turkish Text Categorization

Savaş Yıldırım

Document categorization is a way of determining a category for a given document. Supervised methods mostly rely on training data and rich linguistic resources that are either language-specific or generic. This study proposes a knowledge-poor approach to text categorization that does not use any sets of rules or language-specific resources such as a part-of-speech tagger or shallow parser. Knowledge-poor here refers to the lack of a reasonable amount of background knowledge. The proposed system architecture takes data as-is and simply separates tokens by space. Documents represented in vector space models are used as training data for many machine learning algorithms. We empirically examined and compared several factors, from similarity metrics to learning algorithms, in a variety of experimental setups. Although researchers believe that some particular classifiers or metrics are better than others for text categorization, recent studies show that the ranking of the models depends on the class, the experimental setup and the domain as well. The study features extensive evaluation and comparison across a variety of experiments. We evaluate models and similarity metrics for Turkish, an agglutinative language, within a knowledge-poor framework. The output of the study should be beneficial for other studies.
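
A minimal sketch of such a knowledge-poor pipeline is given below: tokens are obtained by whitespace splitting alone, documents become term-frequency vectors, and a new document receives the label of its most similar training document under cosine similarity. The choice of a 1-nearest-neighbour classifier, the function names, and the toy documents are assumptions for this example; the study compares several classifiers and similarity metrics that are not reproduced here.

```python
import math
from collections import Counter

def vectorize(text):
    """Term-frequency vector; tokens are separated by whitespace only."""
    return Counter(text.split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def classify(text, training):
    """Return the label of the most similar training document.

    training: list of (label, text) pairs.
    """
    query = vectorize(text)
    best_label, best_score = None, -1.0
    for label, document in training:
        score = cosine(query, vectorize(document))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training set, invented for illustration only.
training = [
    ("sports", "takım maç gol attı"),
    ("economy", "borsa dolar faiz arttı"),
]
print(classify("maç gol", training))
```

No stemming, lowercasing, or morphological analysis is applied, so inflected forms of the same Turkish stem count as different tokens; that is the trade-off a knowledge-poor setup accepts.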

language and technology conference | 2013

A Study on Turkish Meronym Extraction Using a Variety of Lexico-Syntactic Patterns

Tuğba Yıldız; Savaş Yıldırım; Banu Diri

In this paper, we applied lexico-syntactic patterns to uncover meronymy relations in a huge raw Turkish text. Once the system has taken a huge raw corpus and extracted the matches for a given pattern, it proposes a list of whole-part pairs based on their co-occurrence frequencies. For this purpose, we exploited and compared a list of pattern clusters. The clusters examined fall into three types: general patterns, dictionary-based patterns, and bootstrapped patterns. We evaluated how these patterns improve system performance, especially with respect to the corpus-based approach and the distributional features of words. Finally, we discuss all the experiments with a comparative analysis and show the advantages and disadvantages of the approaches, with promising results.
