Ivan Koychev | Researchain

Publication

Featured research published by Ivan Koychev.

intelligent user interfaces | 2000

Learning to recommend from positive evidence

Ingo Schwab; Wolfgang Pohl; Ivan Koychev

In recent years, many systems and approaches for recommending information, products or other objects have been developed. These systems often use machine learning methods that need training input to acquire a user interest profile. Such methods typically need positive and negative evidence of the user's interests. To obtain both kinds of evidence, many systems make users rate relevant objects explicitly. Others merely observe the user's behavior, which fairly obviously yields positive evidence; in order to be able to apply the standard learning methods, these systems mostly use heuristics that also attempt to find negative evidence in observed behavior. In this paper, we present several approaches to learning interest profiles from positive evidence only, as it is contained in observed user behavior. Thus, both the problem of interrupting the user for ratings and the problem of somewhat artificially determining negative evidence are avoided. The learning approaches were developed and tested in the context of the Web-based ELFI information system, which is in real use by more than 1000 people. We give a brief sketch of ELFI and describe the experiments we made based on ELFI usage logs to evaluate the different proposed methods.

Lecture Notes in Computer Science | 2004

Feature Selection and Generalisation for Retrieval of Textual Cases

Ivan Koychev; Stewart Massie

Textual CBR systems solve problems by reusing experiences that are in textual form. Knowledge-rich comparison of textual cases remains an important challenge for these systems. However, mapping text data into a structured case representation requires a significant knowledge engineering effort. In this paper we look at automated acquisition of the case indexing vocabulary as a two-step process involving feature selection followed by feature generalisation. Boosted decision stumps are employed as a means to select features that are predictive and relatively orthogonal. Association rule induction is employed to capture feature co-occurrence patterns. Generalised features are constructed by applying these rules. Essentially, rules preserve implicit semantic relationships between features, and applying them has the desired effect of bringing together cases that would have otherwise been overlooked during case retrieval. Experiments with four textual data sets show significant improvement in retrieval accuracy whenever generalised features are used. The results further suggest that boosted decision stumps with generalised features are a promising combination.

european conference on information retrieval | 2004

Within-Document Retrieval: A User-Centred Evaluation of Relevance Profiling

David J. Harper; Ivan Koychev; Yixing Sun; Iain Pirie

We present a user-centred, task-oriented, comparative evaluation of two within-document retrieval tools. ProfileSkim computes a relevance profile for a document with respect to a query, and presents the profile as an interactive bar graph. FindSkim provides similar functionality to the web browser “Find” command. A novel simulated work task was devised, where participants are asked to identify (index) relevant pages of an electronic book, given topics from the existing book index. The original book index provides the ground truth, against which the indexing results of the participants can be compared. We confirmed a major hypothesis, namely that ProfileSkim proved significantly more efficient than FindSkim, as measured by time for task. The study indicates that ProfileSkim was at least as effective as FindSkim in identifying relevant pages, as measured by traditional information retrieval measures, and there is some evidence that ProfileSkim is a precision-enhancing tool. Based on qualitative data from questionnaires, we also provide strong evidence to support our conjecture that the participants would be more satisfied when using ProfileSkim than FindSkim. The experimental study confirmed the potential of relevance profiling for improving within-document retrieval. Relevance profiling should prove highly beneficial for users trying to identify relevant information within long documents.
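
The idea of a relevance profile can be illustrated with a minimal sketch. The abstract does not say how ProfileSkim scores a passage, so the scoring below (the fraction of query terms in a sliding window of tokens) is an assumption made only for illustration, and the function name `relevance_profile` and its `window` parameter are hypothetical.

```python
def relevance_profile(text, query, window=5):
    """Return one score per window position across the document.

    Assumption: the score is the fraction of tokens in the window
    that match a query term. This is a stand-in, not ProfileSkim's
    actual scoring function.
    """
    tokens = text.lower().split()
    terms = set(query.lower().split())
    # 1 where a token is a query term, 0 elsewhere
    hits = [1 if token in terms else 0 for token in tokens]

    profile = []
    # Slide the window one token at a time; always emit at least one score
    for start in range(max(len(tokens) - window + 1, 1)):
        span = hits[start:start + window]
        profile.append(sum(span) / len(span) if span else 0.0)
    return profile
```

Each value in the returned list could be drawn as one bar of a bar graph, with peaks marking the parts of the document most likely to be relevant to the query.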

International Conference on Innovative Techniques and Applications of Artificial Intelligence | 2005

Tracking Drifting Concepts by Time Window Optimisation

Ivan Koychev; Robert Lothian

This paper addresses the task of learning concept descriptions from streams of data. As new data are obtained, the concept description has to be updated regularly to include the new data. In this case we can face the problem that the concept changes over time. Hence the old data become irrelevant to the current concept and have to be removed from the training dataset. This problem is known in the area of machine learning as concept drift. We develop a mechanism that tracks changing concepts using an adaptive time window. The method uses a significance test to detect concept drift and then optimizes the size of the time window, aiming to maximise the classification accuracy on recent data. The method presented is general in nature and can be used with any learning algorithm. The method is tested with three standard learning algorithms (kNN, ID3 and NBC). Three datasets have been used in these experiments. The experimental results provide evidence that the suggested forgetting mechanism is able to significantly improve predictive accuracy on changing concepts.
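
The window-size optimisation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the significance test for drift detection is omitted, the candidate window sizes are supplied by the caller, and `majority_learner` is a toy stand-in for any learning algorithm. All function names are hypothetical.

```python
from collections import Counter


def majority_learner(train):
    """Toy stand-in learner: always predicts the most frequent training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def accuracy(model, data):
    """Fraction of (x, y) pairs that the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


def best_window(stream, candidate_sizes, test_size, learner=majority_learner):
    """Pick the training-window size that maximises accuracy on recent data.

    The newest `test_size` examples are held out as the "recent" batch.
    Each candidate window takes the most recent `size` examples that
    precede that batch as training data.
    """
    history, recent = stream[:-test_size], stream[-test_size:]
    scored = []
    for size in candidate_sizes:
        if size > len(history):
            continue  # not enough past data for this window
        model = learner(history[-size:])
        scored.append((accuracy(model, recent), size))
    if not scored:
        raise ValueError("no candidate window fits the available history")
    # Highest accuracy wins; ties go to the larger window
    return max(scored)[1]
```

When the concept has changed recently, a short window that covers only post-change examples tends to score higher on the held-out batch than a long window that still includes outdated data.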

european conference on information retrieval | 2003

Query-based document skimming: a user-centred evaluation of relevance profiling

David J. Harper; Ivan Koychev; Yixing Sun

We present a user-centred, task-oriented, comparative evaluation of two query-based document skimming tools. ProfileSkim bases within-document retrieval on computing a relevance profile for a document and query; FindSkim provides similar functionality to the web browser Find command. A novel simulated work task was devised, where experiment participants are asked to identify (index) relevant pages of an electronic book, given subjects from the existing book index. This subject index provides the ground truth, against which the indexing results can be compared. Our major hypothesis was confirmed, namely that ProfileSkim proved significantly more efficient than FindSkim, as measured by time for task. Moreover, indexing task effectiveness, measured by typical IR measures, demonstrated that ProfileSkim was better than FindSkim in identifying relevant pages, although not significantly so. The experiments confirm the potential of relevance profiling to improve query-based document skimming, which should prove highly beneficial for users trying to identify relevant information within long documents.

artificial intelligence methodology systems applications | 2016

In Search of Credible News

Momchil Hardalov; Ivan Koychev; Preslav Nakov

We study the problem of finding fake online news. This is an important problem, as news of questionable credibility has recently been proliferating in social media at an alarming scale. As this is an understudied problem, especially for languages other than English, we first collect and release to the research community three new balanced credible vs. fake news datasets derived from four online sources. We then propose a language-independent approach for automatically distinguishing credible from fake news, based on a rich feature set. In particular, we use linguistic (n-gram), credibility-related (capitalization, punctuation, pronoun use, sentiment polarity), and semantic (embeddings and DBPedia data) features. Our experiments on three different test sets show that our model can distinguish credible from fake news with very high accuracy.

european conference on machine learning | 2005

A propositional approach to textual case indexing

Robert Lothian; Sutanu Chakraborti; Ivan Koychev

Problem solving with experiences that are recorded in text form requires a mapping from text to structured cases, so that case comparison can provide informed feedback for reasoning. One of the challenges is to acquire an indexing vocabulary to describe cases. We explore the use of machine learning and statistical techniques to automate aspects of this acquisition task. A propositional semantic indexing tool, Psi, which forms its indexing vocabulary from new features extracted as logical combinations of existing keywords, is presented. We propose that such logical combinations correspond more closely to natural concepts and are more transparent than linear combinations. Experiments show Psi-derived case representations to have superior retrieval performance to the original keyword-based representations. Psi also has comparable performance to Latent Semantic Indexing, a popular dimensionality reduction technique for text, which unlike Psi generates linear combinations of the original features.

north american chapter of the association for computational linguistics | 2016

PMI-cool at SemEval-2016 Task 3: Experiments with PMI and Goodness Polarity Lexicons for Community Question Answering.

Daniel Balchev; Yasen Kiprov; Ivan Koychev; Preslav Nakov

We describe our submission to SemEval-2016 Task 3 on Community Question Answering. We participated in subtask A, which asks to rerank the comments from the thread for a given forum question from good to bad. Our approach focuses on the generation and use of goodness polarity lexicons, similarly to the sentiment polarity lexicons, which are very popular in sentiment analysis. In particular, we use a combination of bootstrapping and pointwise mutual information to estimate the strength of association between a word (from a large unannotated set of question-answer threads) and the class of good/bad comments. We then use various features based on these lexicons to train a regression model, whose predictions we use to induce the final comment ranking. While our system was not very strong as it lacked important features, our lexicons contributed to the strong performance of another top-performing system.
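
A goodness polarity score based on pointwise mutual information can be sketched as below. This is a simplified illustration, not the submitted system: it assumes comments already labelled "good" or "bad", skips the bootstrapping over unannotated threads, and adds additive smoothing that the abstract does not mention. The function name `pmi_lexicon` is hypothetical. Since PMI(w, c) = log2(P(w | c) / P(w)), the difference PMI(w, good) - PMI(w, bad) reduces to log2(P(w | good) / P(w | bad)).

```python
import math
from collections import Counter


def pmi_lexicon(labelled_comments, smoothing=1.0):
    """Map each word to PMI(word, good) - PMI(word, bad).

    `labelled_comments` is an iterable of (text, label) pairs where
    label is "good" or "bad". Positive scores mark words associated
    with good comments, negative scores with bad ones.
    """
    counts = {"good": Counter(), "bad": Counter()}
    for text, label in labelled_comments:
        counts[label].update(text.lower().split())

    vocab = set(counts["good"]) | set(counts["bad"])
    total_good = sum(counts["good"].values())
    total_bad = sum(counts["bad"].values())

    lexicon = {}
    for word in vocab:
        # Additive smoothing (an assumption) keeps unseen words finite
        p_good = (counts["good"][word] + smoothing) / (total_good + smoothing * len(vocab))
        p_bad = (counts["bad"][word] + smoothing) / (total_bad + smoothing * len(vocab))
        lexicon[word] = math.log2(p_good / p_bad)
    return lexicon
```

Scores from such a lexicon (for example, their sum or maximum over the words of a comment) could then serve as features for a ranking or regression model.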

web intelligence, mining and semantics | 2012

Computationally effective algorithm for information extraction and online review mining

Boris Kraychev; Ivan Koychev

The World Wide Web provides continuous sources of information with similar semantic structure, such as news feeds, user reviews and user comments on various topics. These sources are essential for the goal of online opinion mining. The paper proposes a computationally efficient algorithm for structured information extraction from web pages. The algorithm relies on a combination of analysis of structured data and natural language processing of text content. It maps HTML pages containing news, reviews or user comments to a custom-designed, RSS-feed-like structure. Such information usually includes the textual opinions and factual information such as publication date, product price, author name and influence. Due to the real-time nature of the data sources, the computational complexity of such a solution should be linear or close to linear. The computational complexity of the proposed algorithm is linear. In comparison, similar previously published approaches have complexity no smaller than O(n^2). Further, we conduct experiments with real-world data that achieve extraction accuracy of 84% to 92%, which is comparable to recent results in this field. Finally, the paper discusses the results of the experiment and shares experience gained that can be useful for applying the algorithm in other domains.

information technology based higher education and training | 2011

Emerging models and e-infrastructures for teacher education

Krassen Stefanov; Roumen Nikolov; Pavel Boytchev; Eliza Stefanova; Atanas Georgiev; Ivan Koychev; Nikolina Nikolova; Alexander Grigorov

The paper presents a digital repository of metadata resources for teacher education, as well as a portal for the community of practice built around the repository. Both the repository and the community are developed in the frame of the European project Share.TEC. Some approaches for endowing digital libraries with adaptability capabilities in order to scaffold and enhance end user experience are examined. The paper provides a general overview of techniques and methods commonly adopted for achieving adaptability. It also discusses how these can be implemented. Finally, it illustrates specific examples and guidelines drawn from the practical experience that the authors are currently gaining in the Share.TEC European project. In this context, adaptability is key for managing and responding to considerable diversity in user requirements.