Gulden Uchyigit | Researchain

Publication

Featured research published by Gulden Uchyigit.

International Journal of Pattern Recognition and Artificial Intelligence | 2007

A New Feature Selection Method for Text Classification

Gulden Uchyigit; Keith L. Clark

Text classification is the problem of classifying a set of documents into a pre-defined set of classes. A major difficulty in text classification is the high dimensionality of the feature space. Only a small subset of the words are feature words that can be used to determine a document's class, while the rest add noise, which can make the results unreliable and significantly increase computation time. A common approach to this problem is feature selection, where the number of words in the feature space is significantly reduced. In this paper we present the experiments of a comparative study of feature selection methods used for text classification. Ten feature selection methods were evaluated in this study, including the new feature selection method, called the GU metric. The other feature selection methods evaluated in this study are: Chi-Squared (χ2) statistic, NGL coefficient, GSS coefficient, Mutual Information, Information Gain, Odds Ratio, Term Frequency, Fisher Criterion, and BSS/WSS coefficient. The experimental evaluations show that the GU metric obtained the best F1 and F2 scores. The experiments were performed on the 20 Newsgroups data sets with the Naive Bayesian Probabilistic Classifier.
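
As a rough illustration of two quantities named in this abstract, the sketch below computes the Chi-Squared statistic for one term and one class from a 2x2 table of document counts, and the F-beta score (F1 for beta = 1, F2 for beta = 2). The function and parameter names are assumptions made for this example. The abstract does not define the GU metric, so it is not reproduced here.

```python
def chi_squared(a, b, c, d):
    """Chi-squared statistic for one term and one class.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no signal.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def f_beta(precision, recall, beta=1.0):
    """F-beta score: beta=1 gives F1, beta=2 weights recall more heavily."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def top_k_terms(term_counts, k):
    """Rank terms by chi-squared score and keep the k highest.

    term_counts maps each term to its (a, b, c, d) tuple (hypothetical layout).
    """
    ranked = sorted(term_counts, key=lambda t: chi_squared(*term_counts[t]), reverse=True)
    return ranked[:k]
```

A term that appears in every document of the class and in no other document receives the highest score, while a term spread evenly across classes scores zero and would be dropped first.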

Fuzzy Systems and Knowledge Discovery | 2012

Experimental evaluation of feature selection methods for text classification

Gulden Uchyigit

In this paper we present the experiments of a comparative study of feature selection methods used for text classification. Ten feature selection methods were evaluated in this study, including a new feature selection method, called the GU metric. The other feature selection methods evaluated in this study are: Chi-Squared (χ2) statistic, NGL coefficient, GSS coefficient, Mutual Information, Information Gain, Odds Ratio, Term Frequency, Fisher Criterion, BSS/WSS coefficient. The experimental evaluations show that the GU metric obtained the best F1 and F2 scores. The experiments were performed on the 20 Newsgroups data sets with the Naive Probabilistic Classifier.

Archive | 2009

Semantically Enhanced Web Personalization

Gulden Uchyigit

The amount of information available on the World Wide Web is growing at an unprecedented rate, making it very difficult for users to find interesting information. This situation is likely to worsen in the future unless end users have tools available to assist them.

Intelligent Data Engineering and Automated Learning | 2004

Hierarchical Agglomerative Clustering for Agent-Based Dynamic Collaborative Filtering

Gulden Uchyigit; Keith L. Clark

Collaborative Filtering systems suggest items to a user because they are highly rated by other users with similar tastes. Although these systems are achieving great success in web-based applications, the tremendous growth in the number of people using these applications requires performing many recommendations per second for millions of users. Technologies are needed that can rapidly produce high-quality recommendations for large communities of users.

Research Challenges in Information Science | 2017

A research paper recommender system using a Dynamic Normalized Tree of Concepts model for user modelling

Modhi Al Alshaikh; Gulden Uchyigit; Roger Evans

The enormous growth of information on the Internet makes finding information challenging and time consuming. Recommender systems provide a solution to this problem by automatically capturing user interests and recommending related information the user may also find interesting. In this paper, we present a novel recommender system for the research paper domain using a Dynamic Normalized Tree of Concepts (DNTC) model. Our system improves existing vector and tree of concepts models so that they can adapt to a complex ontology and a large number of papers. The proposed system uses the 2012 version of the ACM Computing Classification System (CCS) ontology. This ontology has a much deeper structure than previous versions, which makes it challenging for previous ontology-based approaches to recommender systems. We performed offline evaluations using papers provided by the ACM digital library for classifier training, and papers provided by the CiteSeerX digital library for measuring the performance of the proposed DNTC model. Our evaluation results show that the novel DNTC model significantly outperforms the other two models: the non-normalized tree of concepts and the vector of concepts models. Further, our DNTC model provides high average precision and reliable results when used in a context in which the user has multiple interests and reads a large quantity of papers over time.

Recent Advances in Natural Language Processing | 2017

A Calibration Method for the Evaluation of Sentiment Analysis

F. Sharmila Satthar; Roger Evans; Gulden Uchyigit

Sentiment analysis is the computational task of extracting sentiment from a text document – for example whether it expresses a positive, negative or neutral opinion. Various approaches have been introduced in recent years, using a range of different techniques to extract sentiment information from a document. Measuring these methods against a gold standard dataset is a useful way to evaluate such systems. However, different sentiment analysis techniques represent sentiment values in different ways, such as discrete categorical classes or continuous numerical sentiment scores. This creates a challenge for evaluating and comparing such systems; in particular assessing numerical scores against datasets that use fixed classes is difficult, because the numerical outputs have to be mapped onto the ordered classes. This paper proposes a novel calibration technique that uses precision vs. recall curves to set class thresholds to optimize a continuous sentiment analyser’s performance against a discrete gold standard dataset. In experiments mapping a continuous score onto a three-class classification of movie reviews, we show that calibration results in a substantial increase in f-score when compared to a non-calibrated mapping.
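
The general idea of mapping a continuous sentiment score onto three ordered classes with two thresholds can be sketched as follows. This is only an illustration under stated assumptions: the abstract says the thresholds are set using precision vs. recall curves, whereas this sketch swaps in a plain exhaustive search over candidate threshold pairs that maximizes macro-averaged F1. The label names, function names, and candidate values are all hypothetical.

```python
from itertools import combinations

LABELS = ("negative", "neutral", "positive")


def to_class(score, low, high):
    """Map a continuous score onto three ordered classes using two thresholds."""
    if score < low:
        return "negative"
    if score > high:
        return "positive"
    return "neutral"


def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F1 scores."""
    total = 0.0
    for label in LABELS:
        tp = sum(1 for g, p in zip(gold, predicted) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, predicted) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, predicted) if g == label and p != label)
        denom = 2 * tp + fp + fn
        total += 2 * tp / denom if denom else 0.0
    return total / len(LABELS)


def calibrate(scores, gold, candidates):
    """Return (best_f1, low, high) over all ordered pairs of candidate thresholds.

    Assumption: exhaustive grid search, not the curve-based method of the paper.
    Ties keep the first pair found.
    """
    best = None
    for low, high in combinations(sorted(candidates), 2):
        predicted = [to_class(s, low, high) for s in scores]
        f1 = macro_f1(gold, predicted)
        if best is None or f1 > best[0]:
            best = (f1, low, high)
    return best
```

In use, `calibrate` would be run on a held-out portion of the gold standard, and the selected thresholds would then be applied to unseen scores with `to_class`.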

International Conference on Knowledge Discovery and Information Retrieval | 2017

Predicting Future Interests in a Research Paper Recommender System using a Community Centric Tree of Concepts Model

Modhi Al Alshaikh; Gulden Uchyigit; Roger Evans

Our goal in this paper is to predict a user’s future interests in the research paper domain. Content-based recommender systems can recommend a set of papers that relate to a user’s current interests. However, they may not be able to predict a user’s future interests. Collaborative filtering approaches may predict a user’s future interests for movies, music or e-commerce domains. However, existing collaborative filtering approaches are not appropriate for the research paper domain, because they depend on large numbers of user ratings which are not available in the research paper domain. In this paper, we present a novel collaborative filtering method that does not depend on user ratings. Our novel method computes the similarity between users according to user profiles which are represented using the dynamic normalized tree of concepts model using the 2012 ACM Computing Classification System (CCS) ontology. Further, a community-centric tree of concepts is generated and used to make recommendations. Offline evaluations are performed using the BibSonomy dataset. Our model is compared with two baselines. The results show that our model significantly outperforms the two baselines and avoids the problem of sparsity.

International Conference on Knowledge Discovery and Information Retrieval | 2017

A Novel Short-term and Long-term User Modelling Technique for a Research Paper Recommender System

Modhi Al Alshaikh; Gulden Uchyigit; Roger Evans

Modelling users’ interests accurately is an important aspect of recommender systems. However, this is a challenge as users’ behaviour can vary in different domains. For example, users’ reading behaviour of research papers follows a different pattern to users’ reading of online news articles. In the case of research papers, our analysis of users’ reading behaviour shows that there are breaks in reading whereas the reading of news articles is assumed to be more continuous. In this paper, we present a novel user modelling method for representing users’ short-term and long-term interests in recommending research papers. The short-term interests are modelled using a personalised dynamic sliding window which adapts its size according to the ratio of concepts per paper read by the user, rather than relying on purely time-based methods. Our long-term model is based on selecting papers that represent a user’s longer-term interests to build their profile. Existing methods for modelling users’ short-term and long-term interests do not adequately take into consideration the erratic reading behaviours over time that are exhibited in the research paper domain. We conducted evaluations of our short-term and long-term models and compared them with the performance of three existing methods. The evaluation results show that our models significantly outperform the existing short-term and

International Conference on Digital Health | 2017

Using Machine Learning for Automatic Identification of Evidence-Based Health Information on the Web

Majed Al-Jefri; Roger Evans; Pietro Ghezzi; Gulden Uchyigit

Automatic assessment of the quality of online health information is needed, especially given the massive growth of online content. In this paper, we present an approach to assessing the quality of health webpages based on their content rather than on purely technical features, by applying machine learning techniques to the automatic identification of evidence-based health information. Several machine learning approaches were applied to learn classifiers using different combinations of features. Three datasets were used in this study for three different diseases, namely shingles, flu and migraine. The results obtained using the classifiers were promising in terms of precision and recall, especially for diseases with few different pathogenic mechanisms.

International Conference on Enterprise Information Systems | 2003