Senem Kumova Metin
İzmir University of Economics
Publication
Featured research published by Senem Kumova Metin.
Computers & Electrical Engineering | 2015
Ilker Korkmaz; Senem Kumova Metin; Alper Gurek; Caner Gur; Cagri Gurakin; Mustafa Akdeniz
A cloud based and Android supported scalable home automation system is proposed. The mechanism relies on the distributed Google Cloud Platform. Google Cloud Messaging supports the communication infrastructure in the system. The prototype is evaluated based on a list of criteria for an adequate system. In this paper, an Android based home automation system that allows multiple users to control the appliances by an Android application or through a web site is presented. The system has three hardware components: a local device to transfer signals to home appliances, a web server to store customer records and support services to the other components, and a mobile smart device running an Android application. Distributed cloud platforms and Google services are used to support messaging between the components. The prototype implementation of the proposed system is evaluated based on the criteria considered after the requirement analysis for an adequate home automation system. The paper presents the outcomes of a survey carried out regarding the properties of home automation systems, and also the evaluation results of the experimental tests conducted with volunteers on the running prototype.
2013 High Capacity Optical Networks and Emerging/Enabling Technologies | 2013
Alper Gurek; Caner Gur; Cagri Gurakin; Mustafa Akdeniz; Senem Kumova Metin; Ilker Korkmaz
In recent years, the number of network enabled digital devices at homes has been increasing fast. With the rapid expansion of the Internet, the owners have been requesting remote control and monitoring of these in-home appliances. This leads to networking these appliances to form a kind of home automation system. In this paper, an Android based home automation system that allows multiple users to control the appliances by an Android application or through a Web site is presented. The system has three hardware components: a local device to transfer signals to home appliances, a Web server to store customer records and support services to the other components, and a mobile smart device running an Android application. Distributed cloud platforms and services of Google are used to support messaging between the components. The prototype implementation of the proposed system is evaluated based on the criteria considered after the requirement analysis for an adequate home automation system.
international conference natural language processing | 2010
Senem Kumova Metin; Bahar Karaoglan
A collocation is a combination of words that appear together more often than by chance. Since collocations are blocks of meaning, they play an important role in natural language processing applications (word sense disambiguation, part of speech tagging, machine translation, etc.). In this study, a corpus of Turkish is subjected to the following statistical techniques: frequency of occurrence, mutual information and hypothesis tests. We have utilized both the stemmed and surface forms of the corpus to explore the effect of stemming in collocation extraction. The techniques are evaluated by recall and precision measures. The chi-square hypothesis test and mutual information methods have produced better results compared to other methods on the Turkish corpus. In addition, we have found that a stemmed corpus facilitates discrimination between successful and unsuccessful collocation extraction methods.
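The statistical techniques named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name bigram_scores, the restriction to adjacent word pairs, and the toy English sentence are assumptions made only for illustration. It computes frequency of occurrence, point-wise mutual information and the chi-square statistic from a 2x2 contingency table for each bigram.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Return {bigram: (frequency, pmi, chi_square)} for adjacent word pairs.

    Hypothetical helper for illustration; assumes a pre-tokenized text.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = sum(bigrams.values())  # total number of bigram positions
    # How often each word occurs as the first / second element of a bigram.
    left = Counter()
    right = Counter()
    for (w1, w2), count in bigrams.items():
        left[w1] += count
        right[w2] += count

    scores = {}
    for (w1, w2), o11 in bigrams.items():
        # Observed 2x2 contingency table for the pair (w1, w2).
        o12 = left[w1] - o11        # w1 followed by something else
        o21 = right[w2] - o11       # something else followed by w2
        o22 = n - o11 - o12 - o21   # neither w1 first nor w2 second
        # Point-wise mutual information in bits.
        pmi = math.log2((o11 * n) / (left[w1] * right[w2]))
        # Pearson chi-square for a 2x2 table (no continuity correction).
        denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
        chi2 = n * (o11 * o22 - o12 * o21) ** 2 / denom if denom else 0.0
        scores[(w1, w2)] = (o11, pmi, chi2)
    return scores


if __name__ == "__main__":
    text = "new york is big and new york is far".split()
    for pair, (freq, pmi, chi2) in sorted(bigram_scores(text).items()):
        print(pair, freq, round(pmi, 3), round(chi2, 3))
```

Ranking the candidate pairs by any one of the three scores yields the kind of ranked candidate list that recall and precision can then be measured against.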
Journal of Quantitative Linguistics | 2011
Senem Kumova Metin; Bahar Karaoglan
In all natural languages, some words collocate with other words to create multi-worded blocks of meaning, the collocations. Since identification of collocations is vital for information retrieval, language learning, psycholinguistics, authorship determination and translation, collocation extraction is an important issue in natural language processing. In this paper we present a method which is designed to improve current statistical methods that generate ranked lists of collocation candidates. Due to meaning integrity, any word in a collocation must suggest or at least imply the subsequent words composing the collocation. As a result, we may state that the words in a random text differ in the tendency to facilitate the prediction of the next word. If a word helps the prediction then it tends to collocate, otherwise it does not. In this paper, an attempt has been made to extract collocations by measuring collocation tendency of words and word combinations. The method used is to filter out free word pairs (the words that do not facilitate the prediction of the next word or those in which meaning integrity has not been completed yet) in the lists of candidate pairs. Collocation tendency method is tested on a base data set extracted by some statistical collocation extraction techniques (frequency of occurrence, point-wise mutual information, the t-test, chi-square techniques) and is evaluated by precision and recall measures. We have found that collocation tendency method brings a remarkable improvement on frequency of occurrence and the t-test techniques.
Expert Systems With Applications | 2018
Senem Kumova Metin
In multiword expression (MWE) recognition, there exist many studies where different learning methods are employed to decide whether a given word combination is a multiword expression. The recognition methods commonly utilize a number of features that are extracted from a data source, frequently from the given text. Though the recognition methods and the features are well studied, we believe that to achieve the best possible performance with a learning method, different subsets of features should also be considered and the best performing subset must be selected. In this paper, we propose a procedure that covers the performance comparison of well-known feature selection methods to obtain the best feature subset in MWE recognition. The evaluation tests are performed on a Turkish MWE data set and the performance is measured by precision, recall and F1 values. The highest F1 value (0.731) is obtained by the C4.5 classifier employing either a wrapper or a filtering method in feature selection. In the corresponding setting(s), the performance is observed to increase by 1.11% compared to the setting where all features are employed in classification. Based on the experimental results, it may be stated that feature selection improves the performance of MWE recognition by eliminating the noisy/non-effective features. Moreover, the proposed feature selection method contributes to the overall MWE recognition system by reducing the measurement and storage requirements due to the lower number of features in classification, providing a faster and more cost-effective learning model.
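For reference, the three measures used in this evaluation can be computed from binary MWE / non-MWE labels as in the sketch below. The function name precision_recall_f1 and the toy labels are assumptions for illustration, not taken from the paper.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall and F1 for binary labels (1 = MWE, 0 = non-MWE).

    Hypothetical helper for illustration; gold and predicted are equal-length lists.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    gold = [1, 1, 1, 0, 0]
    predicted = [1, 1, 0, 1, 0]
    print(precision_recall_f1(gold, predicted))
```

Comparing feature subsets then amounts to training the same classifier on each subset and keeping the one with the highest F1 on held-out data.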
signal processing and communications applications conference | 2017
Senem Kumova Metin; Mehmet Taze; Hande Aka Uymaz; Erdem Okur
Detection of multiword expressions is an important pre-task in several research topics such as natural language understanding, automatic text summarization, and machine translation in the area of natural language processing. In this study, detection of multiword expressions in Turkish texts is treated as a classification problem. Six types of linguistic features are defined for solving this problem in Turkish texts. The classification tests are performed by 10 different classifiers utilizing the prepared data set. The performance of the classifiers is measured for different sizes of random train-test sets by running the tests 10 times. The test results showed that linguistic features can be used in the identification of multiword expressions. It is also observed that the SMO and J48 algorithms reached the highest classification performances based on different evaluation metrics.
signal processing and communications applications conference | 2013
Senem Kumova Metin; Tarik Kisla; Bahar Karaoglan
Natural language processing can be seen as a signal processing problem when the characters, syllables, words and punctuation marks in a text are considered as signals. In this article, we present a novel approach that detects text similarity in Turkish, based on the similarities of the lists of retrieved documents when the texts are given as queries to web search engines. The similarities between the URLs contained in the items of the returned lists are measured using statistical methods like Euclidean, city-block, Chebyshev, cosine, correlation, Spearman and Hamming distances. For the experiments, a corpus of 150 news items is developed by gathering news on 50 different topics from 3 Turkish newspapers published during a certain time slot. News items on the same topic published in different newspapers are considered as similar texts. Statistical methods are applied on the formed news × terms matrix, and for each news item the similar news items are ranked from the most similar to the least similar. If at least one of the top two is the same as the ones marked manually as similar, it is counted as a success. Experimental results show that cosine and correlation distances give the best performance with 84% precision.
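The ranking and success criterion described in this abstract can be sketched as follows, using cosine distance only. This is an illustrative sketch under stated assumptions: the names cosine_distance, rank_similar and top2_success, the tiny matrix, and the gold mapping are all hypothetical and do not come from the paper.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an all-zero row as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def rank_similar(matrix, i):
    """Return the indices of all other rows, ordered from most to least similar to row i."""
    others = [j for j in range(len(matrix)) if j != i]
    return sorted(others, key=lambda j: cosine_distance(matrix[i], matrix[j]))


def top2_success(matrix, gold):
    """Fraction of rows whose two nearest rows include at least one gold-similar row.

    gold maps a row index to the set of row indices marked manually as similar.
    """
    hits = sum(
        1 for i, similar in gold.items()
        if set(rank_similar(matrix, i)[:2]) & similar
    )
    return hits / len(gold)


if __name__ == "__main__":
    # Toy rows standing in for news items; columns stand in for terms.
    matrix = [
        [1, 1, 0, 0],
        [1, 1, 1, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 1],
    ]
    gold = {0: {1}, 2: {3}}
    print(rank_similar(matrix, 2), top2_success(matrix, gold))
```

Swapping cosine_distance for another distance function (Euclidean, city-block, and so on) leaves the ranking and scoring code unchanged, which is what makes a side-by-side comparison of distance measures straightforward.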
intelligent human computer interaction | 2017
Senem Kumova Metin
Multiword expressions (MWEs) are units in language where multiple words unite without an obvious/known reason. Since MWEs occupy a prominent amount of space in both written and spoken language materials, identification of MWEs is accepted to be an important task in natural language processing.
2017 2nd International Conference on Computer and Communication Systems (ICCCS) | 2017
Senem Kumova Metin; Mehmet Taze
In this paper, we propose a procedure employing natural language processing methods to build a gold standard multiword expression data set and present our Turkish MWE data set of 3946 positive and 4230 negative candidates that is built following the proposed procedure. The proposed procedure covers three main tasks. The first task is collecting a variety of MWE data resources in order to extract MWE candidates. We suggest the use of corpora together with idiom and term dictionaries. The second task in building an MWE data set is extracting different types of MWE candidates from the resources. Here, we suggest the aggregation of four methods. Firstly, statistical methods are applied to extract MWE candidates that have high occurrence frequencies. Secondly, linguistic properties such as part of speech patterns are considered to select MWE candidates. Thirdly, the candidates that mimic the properties of idioms or are already true idioms are chosen. Lastly, the term-like candidates with domain specific properties are extracted. The final task in building a gold standard MWE data set is the labeling. In this task, the candidates are labeled either as MWE or non-MWE by multiple judges.
conference on intelligent text processing and computational linguistics | 2016
Bahar Karaoglan; Tarik Kisla; Senem Kumova Metin
Because developing a corpus requires a long time and lots of human effort, it is desirable to make it as resourceful as possible: rich in coverage, flexible, multipurpose and expandable. Here we describe the steps we took in the development of a Turkish paraphrase corpus, the factors we considered, the problems we faced and how we dealt with them. Currently our corpus contains nearly 4000 sentences with a ratio of 60% paraphrase and 40% non-paraphrase sentence pairs. The sentence pairs are annotated on a 5-point scale: paraphrase, encapsulating, encapsulated, non-paraphrase and opposite. The corpus is formulated in a database structure integrated with a Turkish dictionary. The sources we have used so far are news texts from the Bilcon 2005 corpus, a set of professionally translated sentence pairs from the MSRP corpus, multiple Turkish translations from different languages that are included in the Tatoeba corpus, and user generated paraphrases.