Eşref Adalı
Istanbul Technical University
Publication
Featured research published by Eşref Adalı.
international conference natural language processing | 2006
A. Cüneyd Tantuğ; Eşref Adalı; Kemal Oflazer
This paper describes the implementation of a two-level morphological analyzer for the Turkmen language. Like all Turkic languages, Turkmen is an agglutinative language with productive inflectional and derivational suffixes. In this work, we implemented a finite-state two-level morphological analyzer for Turkmen using the Xerox Finite State Tools.
meeting of the association for computational linguistics | 2007
Ahmet Cüneyd Tantuğ; Eşref Adalı; Kemal Oflazer
We present an approach to machine translation (MT) between Turkic languages and report results from an implementation of an MT system from Turkmen to Turkish. Our approach relies on ambiguous lexical and morphological transfer, augmented with target-side rule-based repairs and rescoring with statistical language models.
International Journal of Computational Intelligence Systems | 2010
Sevinc Ilhan; Nevcihan Duru; Eşref Adalı
The K-means algorithm is quite sensitive to the initially selected cluster centers and can produce different clusterings depending on this initialization. In this study, a new method based on the Fuzzy ART algorithm, called Improved Fuzzy ART (IFART), is used to determine the initial cluster centers. IFART yields better-quality clusters than Fuzzy ART while matching it in clustering speed and in its ability to cluster large-scale data. With the proposed method, the clustering operation is completed in fewer steps, is more stable because the initialization points are fixed, and has a smaller error margin than conventional K-means.
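The sensitivity to initial centers described in this abstract can be illustrated with a minimal sketch. It is not the IFART method: it is plain Lloyd-style K-means on hypothetical one-dimensional data, run from two hand-picked sets of initial centers. The function names `kmeans` and `sse`, the data, and the seeds are all assumptions made for illustration.

```python
def kmeans(points, centers, max_iter=100):
    """Plain K-means on 1-D points, starting from the given centers.

    Returns (final_centers, iterations_used).
    """
    centers = list(centers)
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            return centers, iteration
        centers = new_centers
    return centers, max_iter


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min((p - c) ** 2 for c in centers) for p in points)


# Hypothetical data: three well-separated groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 20.0, 21.0, 22.0]

good_centers, good_iters = kmeans(points, [2.0, 11.0, 21.0])
poor_centers, poor_iters = kmeans(points, [1.0, 2.0, 3.0])

print(good_iters, sse(points, good_centers))
print(poor_iters, sse(points, poor_centers))
```

On this toy data the run seeded inside a single group needs more iterations and stops at a clustering with a larger error, which is the behavior a better initialization step aims to avoid.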
international conference on intelligent engineering systems | 2012
Bahar Ilgen; Eşref Adalı; A. Cüneyd Tantuğ
Word Sense Disambiguation (WSD) is the task of choosing the most appropriate sense of a word that has multiple senses in a given context. Collocational features acquired from the words neighboring the ambiguous word are one of the important knowledge sources in this area. This paper explores effective sets of collocational features in Turkish in order to obtain better Turkish WSD systems. A lexical sample dataset of highly polysemous nouns and verbs was prepared as the initial step of the work. Several supervised learning algorithms were tested on this data with different feature sets to select the best-performing features for both nouns and verbs in Turkish. We also investigated the impact of several collocational features of polysemous words and evaluated the performance of several supervised machine learning algorithms.
international symposium on computer and information sciences | 2013
Bahar Ilgen; Eşref Adalı; A. Cüneyd Tantuğ
In this paper, the effect of different windowing schemes on word sense disambiguation accuracy is presented. The Turkish Lexical Sample Dataset was used in the experiments. We took the samples of ambiguous verbs and nouns from the dataset and used bag-of-words properties as context information. The experiments were repeated for different window sizes with several machine learning algorithms. We follow a 2/3 splitting strategy (2/3 for training, 1/3 for testing) and determine the most frequently used words in the training part. After removing stop words, we repeated the experiments using the 100, 75, 50 and 25 most frequent content words of the training data. Our findings show that using the 75 most frequent words as features improves accuracy for Turkish verbs. Similar results were obtained for Turkish nouns when using the 100 most frequent words of the training set. Considering this information, selected algorithms were tested on varying window sizes {30, 15, 10 and 5}. Our findings show that the Naive Bayes and Functional Tree methods yielded better accuracy, and that the window size ±5 gives the best average results for both the noun and the verb groups. The best results of the two groups are 65.8 and 56 percentage points above the most frequent sense baseline of the verb and noun groups, respectively.
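The feature pipeline this abstract outlines (a ±n token window around the ambiguous word, stop-word removal, and a vocabulary limited to the k most frequent content words) can be sketched roughly as below. This is an illustrative sketch, not the authors' implementation; the helper names, the English toy tokens, and the stop-word list are hypothetical.

```python
from collections import Counter


def window_tokens(tokens, target_index, size):
    """Return up to `size` tokens on each side of the target word."""
    left = tokens[max(0, target_index - size):target_index]
    right = tokens[target_index + 1:target_index + 1 + size]
    return left + right


def top_content_words(training_windows, stop_words, k):
    """Pick the k most frequent non-stop words seen in the training windows."""
    counts = Counter(
        w for window in training_windows for w in window if w not in stop_words
    )
    return [w for w, _ in counts.most_common(k)]


def bag_of_words(window, vocabulary):
    """Count how often each vocabulary word occurs in the window."""
    counts = Counter(window)
    return [counts[w] for w in vocabulary]


# Hypothetical example: the ambiguous word "bank" sits at index 6.
tokens = "the old man sat on the bank of the river and watched the water".split()
context = window_tokens(tokens, 6, 2)

# Hypothetical training windows and stop words.
stop_words = {"the", "of"}
training_windows = [
    ["river", "water", "the"],
    ["river", "money", "the"],
    ["river", "water", "of"],
]
vocabulary = top_content_words(training_windows, stop_words, 2)
features = bag_of_words(["the", "river", "river", "boat"], vocabulary)
print(context, vocabulary, features)
```

The resulting count vectors are the kind of input a classifier such as Naive Bayes would be trained on; the window size and k are the two parameters the experiments vary.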
international symposium on computer and information sciences | 2006
A. Cüneyd Tantuğ; Eşref Adalı; Kemal Oflazer
This paper presents a statistical lexical ambiguity resolution method for direct transfer machine translation models in which the target language is Turkish. Since direct transfer MT models do not have full syntactic information, most lexical ambiguity resolution methods are not very helpful. Our disambiguation model is based on statistical language models. We have investigated the performance of several statistical language model types and parameters in lexical ambiguity resolution for our direct transfer MT system.
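One way to picture language-model-based lexical disambiguation is the sketch below: each source word maps to one or more candidate target words, and a bigram model trained on target-language text picks the most probable combination. The abstract does not specify the model type or smoothing, so the add-one smoothed bigram model, the function names, and the English stand-in corpus are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product


def train_bigram(sentences):
    """Collect context and bigram counts from tokenized sentences."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        toks = ["<s>"] + sentence + ["</s>"]
        contexts.update(toks[:-1])           # every token that has a successor
        bigrams.update(zip(toks, toks[1:]))  # adjacent token pairs
    vocab = {t for s in sentences for t in s} | {"</s>"}
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Add-one smoothed bigram log-probability of a tokenized sentence."""
    contexts, bigrams, vocab_size = model
    toks = ["<s>"] + sentence + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(toks, toks[1:])
    )


def best_translation(candidates_per_word, model):
    """Score every combination of candidates and return the most probable."""
    combos = (list(c) for c in product(*candidates_per_word))
    return max(combos, key=lambda s: log_prob(s, model))


# Hypothetical target-language corpus (English used as a stand-in).
corpus = [
    ["he", "washed", "his", "face"],
    ["she", "washed", "her", "face"],
    ["one", "hundred", "people", "came"],
]
model = train_bigram(corpus)

# The last source word is ambiguous between two target words.
lattice = [["he"], ["washed"], ["his"], ["face", "hundred"]]
print(best_translation(lattice, model))
```

Enumerating every combination is only workable for short sentences; a realistic system would search the candidate lattice with a Viterbi-style decoder instead.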
international symposium on innovations in intelligent systems and applications | 2012
Bahar Ilgen; Eşref Adalı; A. Cüneyd Tantuğ
Word Sense Disambiguation (WSD) has become an even more important research area in recent years with the widespread use of Natural Language Processing (NLP) applications. The WSD task has two variants: the “Lexical Sample” and “All Words” approaches. The Lexical Sample approach disambiguates the occurrences of a small, previously selected sample of target words, while in the latter all the words in a piece of text are disambiguated. In the scope of this work, a Lexical Sample Dataset for Turkish has been prepared. As a first step, highly ambiguous words in Turkish were selected. Text samples were then collected for the chosen words, and five taggers annotated the word senses. This paper summarizes the step-by-step process of building a Lexical Sample Dataset for Turkish and presents the results of some experiments on it.
international conference on computational linguistics | 2014
Gözde Gül İşgüder; Eşref Adalı
Morphological units carry a vast amount of semantic information in languages with rich inflectional and derivational morphology. In this paper we show how the morphosemantic information available for morphologically rich languages can be used to reduce manual effort in creating semantic resources like PropBank and VerbNet, and to increase the performance of word sense disambiguation, semantic role labeling and related tasks. We test the consistency of these features in a pilot study for Turkish and show that 1) case markers are related to semantic roles and 2) morphemes that change the valency of the verb follow a predictable pattern.
ieee international conference on computer science and information technology | 2009
Murat Orhun; A. Cüneyd Tantuğ; Eşref Adalı; A. Coskun Sonmez
This paper describes the differences between Uyghur (spoken in Sin Kiang, China) and Turkish grammar at the sentence level. There is little research on natural language processing for Turkic languages other than Turkish. Uyghur is one of the old and rich languages of the Turkic language family. Even though both languages belong to the same family, there are some important differences between them. For these reasons, it is not possible to implement a machine translation system between Uyghur and Turkish that simply works word by word. All of the words in a sentence must be analyzed at the morphological level, and translation rules must be defined, in order to avoid losing the meaning of the original sentence. We hope this paper contributes to further studies of Uyghur in machine translation and natural language processing.
2017 International Conference on Computer Science and Engineering (UBMK) | 2017
İlknur Dönmez; Eşref Adalı
In this paper, a novel method for an Information Retrieval based factoid Question Answering system is proposed. A factoid question has exactly one correct answer, and the answer is mostly a named entity such as a person, date, or location. Rule-based methods for question classification, query formulation, and answer processing are explored, based on our coarse-grained semantic representation for Turkish sentences. The “HazırCevap” Question Answering application, which is intended to support the education of high-school students, is used to evaluate the proposed method. Tested on a set of questions from the HazırCevap dataset, the proposed Question Answering system scored 7.6% for Top5 accuracy, 12.6% for Top10 accuracy and 7.4% for Top20 accuracy, which is at least 7% higher than the previous state-of-the-art method.
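The Top5, Top10 and Top20 figures are top-k accuracies: a question counts as answered correctly if the gold answer appears among the first k ranked candidates. A minimal sketch of that metric follows, assuming exact string matching between candidates and the gold answer. The function name and the sample questions are hypothetical and are not taken from the HazırCevap dataset.

```python
def top_k_accuracy(ranked_answers, gold_answers, k):
    """Fraction of questions whose gold answer is in the first k candidates."""
    hits = sum(
        1
        for ranked, gold in zip(ranked_answers, gold_answers)
        if gold in ranked[:k]
    )
    return hits / len(gold_answers)


# Hypothetical ranked candidate lists for three questions.
ranked_answers = [
    ["1923", "1920", "1938"],
    ["Ankara", "Istanbul", "Izmir"],
    ["Van", "Tuz", "Beysehir"],
]
gold_answers = ["1920", "Ankara", "Egirdir"]

print(top_k_accuracy(ranked_answers, gold_answers, 1))
print(top_k_accuracy(ranked_answers, gold_answers, 2))
```

Because the first k candidates always include the first k-1, top-k accuracy computed on a fixed question set can only stay the same or rise as k increases.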