Richard Khoury | Researchain

Publication

Featured research published by Richard Khoury.

Information Systems Frontiers | 2013

Web 2.0 Recommendation service by multi-collaborative filtering trust network algorithm

Chen Wei; Richard Khoury; Simon Fong

Recommendation Services (RS) are an essential part of online marketing campaigns. They make it possible to automatically suggest advertisements and promotions that fit the interests of individual users. Social networking websites, and the Web 2.0 in general, offer a collaborative online platform where users socialize, interact and discuss topics of interest with each other. These websites have created an abundance of information about users and their interests. The computational challenge however is to analyze and filter this information in order to generate useful recommendations for each user. Collaborative Filtering (CF) is a recommendation service technique that collects information from a user’s preferences and from trusted peer users in order to infer a new targeted suggestion. CF and its variants have been studied extensively in the literature on online recommending, marketing and advertising systems. However, most of the work done was based on Web 1.0, where all the information necessary for the computations is assumed to always be completely available. By contrast, in the distributed environment of Web 2.0, such as in current social networks, the required information may be either incomplete or scattered over different sources. In this paper, we propose the Multi-Collaborative Filtering Trust Network algorithm, an improved version of the CF algorithm designed to work on the Web 2.0 platform. Our simulation experiments show that the new algorithm yields a clear improvement in prediction accuracy compared to the original CF algorithm.
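As a rough illustration of the baseline idea the abstract describes (inferring a suggestion from a user's own preferences and those of peer users), here is a minimal sketch of classic user-based collaborative filtering. It is not the Multi-Collaborative Filtering Trust Network algorithm from the paper. The function names, the use of cosine similarity as a stand-in for peer trust, and the sample ratings are all assumptions made for this example.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, item):
    """Similarity-weighted average of peers' ratings for `item`.

    Returns None when no peer with a positive weight has rated the item.
    Assumes ratings are positive numbers, so weights are non-negative.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for peer, peer_ratings in ratings.items():
        if peer == user or item not in peer_ratings:
            continue
        weight = cosine_similarity(ratings[user], peer_ratings)
        weighted_sum += weight * peer_ratings[item]
        weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else None


# Hypothetical ratings (user -> item -> score), invented for illustration.
sample_ratings = {
    "alice": {"a": 5, "b": 3},
    "bob": {"a": 5, "b": 3, "c": 4},
    "carol": {"a": 1, "b": 5, "c": 2},
}

prediction = predict_rating(sample_ratings, "alice", "c")
```

In this toy data, bob's ratings match alice's more closely than carol's do, so the prediction for item "c" is pulled toward bob's score. The sketch assumes every rating is available in one place, which is the Web 1.0 assumption the abstract says does not hold on Web 2.0 platforms.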

Soft Computing | 2016

Hierarchical classification in text mining for sentiment analysis of online news

Jinyan Li; Simon Fong; Yan Zhuang; Richard Khoury

Sentiment analysis in text mining is a challenging task. Sentiment is subtly reflected by the tone and affective content of a writer’s words. Conventional text mining techniques, which are based on keyword frequencies, usually fall short of accurately detecting such subjective information implied in the text. In this paper, we evaluate several popular classification algorithms, along with three filtering schemes. The filtering schemes progressively shrink the original dataset with respect to the contextual polarity and frequent terms of a document. We call this approach “hierarchical classification”. The effects of the approach under different combinations of classification algorithms and filtering schemes are discussed over three sets of controversial online news articles where binary and multi-class classifications are applied. We also test this hierarchical classification model with two methods and compare the two.

International Journal of Advanced Media and Communication | 2008

Keyword extraction rules based on a part-of-speech hierarchy

Richard Khoury; Fakhreddine Karray; Mohamed S. Kamel

In this paper, we present an original rule-learning algorithm for symbolic Natural Language Processing (NLP), designed to learn the rules of extraction of keywords marked in its training sentences. What sets our methodology apart from other recent developments in the field of NLP is the implementation of a hierarchy of parts-of-speech at the very core of the algorithm. This makes the rules dependent only on sentence structure rather than on context and domain-specific information. The theoretical development and the experimental results support the conclusion that this improved methodology can be used to obtain an in-depth analysis of the text without being limited to a single domain of application. Consequently, it has the advantage of outperforming both traditional statistical and symbolic NLP methodologies.

IEEE Transactions on Fuzzy Systems | 2007

Semantic Understanding of General Linguistic Items by Means of Fuzzy Set Theory

Richard Khoury; Fakhri Karray; Yu Sun; Mohamed S. Kamel; Otman A. Basir

Modern statistical techniques used in the field of natural language processing are limited in their applications by the fact that they lose most of the semantic information contained in text documents. Fuzzy techniques have been proposed as a way to correct this problem through the modelling of the relationships between words while accommodating the ambiguities of natural languages. However, these techniques are currently either restricted to modelling the effects of simple words or are specialized in a single domain. In this paper, we propose a novel statistical-fuzzy methodology to represent the actions described in a variety of text documents by modelling the relationships between subject-verb-object triplets. The research will focus in the first place on the technique used to accurately extract the triplets from the text, on the necessary equations to compute the statistics of the subject-verb and verb-object pairs, on the formulas needed to interpolate the fuzzy membership functions from these statistics, and on those needed to defuzzify the membership value of unseen triplets. Taken together, these sets of equations constitute a comprehensive system that allows the quantification and evaluation of the meaning of text documents, while being general enough to be applied to any domain. In the second phase, this paper will proceed to experimentally demonstrate the validity of our new methodology by applying it to the implementation of a fuzzy classifier conceived especially for this research. This classifier is trained using a section of the Brown Corpus, and its efficiency is tested with a corpus of 20 unseen documents drawn from three different domains. The positive results obtained from these experimental tests confirm the soundness of our new approach and show that it is a promising avenue of research.

2013 International Symposium on Computational and Business Intelligence | 2013

Sentiment Analysis of Online News Using MALLET

Simon Fong; Yan Zhuang; Jinyan Li; Richard Khoury

The challenge of sentiment analysis consists in automatically determining whether a text is positive or negative in tone. Part of the difficulty in this task stems from the fact that the same words or sentences can have very different sentimental meaning given their context. In our work, we further focus on news articles, which tend to use a more neutral vocabulary, as opposed to the emotionally charged vocabulary of opinion pieces such as editorials, reviews, and blogs. In this paper, we use MALLET (Machine Learning for Language Toolkit) to implement and train several algorithms for sentiment analysis, and run experiments to compare and contrast them.

Autonomous and Intelligent Systems | 2011

Exploring Wikipedia's category graph for query classification

Milad Alemzadeh; Richard Khoury; Fakhri Karray

Wikipedia's category graph is a network of 400,000 interconnected category labels, and can be a powerful resource for many classification tasks. However, its size and lack of order can make it difficult to navigate. In this paper, we present a new algorithm to efficiently explore this graph and discover accurate classification labels. We implement our algorithm as the core of a query classification system and demonstrate its reliability using the KDD CUP 2005 competition as a benchmark.

International Journal of Intelligent Information and Database Systems | 2011

Query classification using Wikipedia

Richard Khoury

Identifying the intended topic that underlies a user's query can benefit a large range of applications, from search engines to question-answering systems. However, query classification remains a difficult challenge due to the variety of queries a user can ask, the wide range of topics users can ask about, and the limited amount of information that can be mined from the query. In this paper, we develop a new query classification system that accounts for these three challenges. Our system relies on the freely-available online encyclopedia Wikipedia as a natural-language knowledge base, and exploits Wikipedia's structure to infer the correct classification of any given query. We present two variants of this query classification system in this paper, and demonstrate their reliability compared to each other and to the literature benchmarks using the query sets from the KDD CUP 2005 and TREC 2007 competitions.

International Conference Natural Language Processing | 2005

Semantic context classification by means of fuzzy set theory

Yu Sun; Richard Khoury; Fakhri Karray; Otman A. Basir

Comprehension of semantic meaning is at the heart of modern natural language processing (NLP). Currently, research in statistical NLP has focused primarily on the statistical representation of lexical combinational occurrences. Due to the limitations of current computer technology, however, representing a lexical combination is restricted to a finite length. As such, we focus attention on obtaining approximate but simpler and satisfactory solutions through soft computing techniques; in particular, fuzzy set theory. It is difficult, however, to apply conventional fuzzy membership functions to general linguistic items. As such, we specifically propose a method of constructing membership functions for linguistic items based on the level of semantic patterns. For testing purposes, the proposed methodology is applied to text classification and the accompanying experimental results are compared with the output provided by a probabilistic-based approach.

International Conference on Computational Collective Intelligence | 2015

Text Classification Using Novel “Anti-Bayesian” Techniques

B. John Oommen; Richard Khoury; Aron Schmidt

This paper presents a non-traditional “Anti-Bayesian” solution for the traditional Text Classification (TC) problem. Historically, all the recorded TC schemes work using the fundamental paradigm that once the statistical features are inferred from the syntactic/semantic indicators, the classifiers themselves are the well-established statistical ones. In this paper, we shall demonstrate that by virtue of the skewed distributions of the features, one could advantageously work with information latent in certain “non-central” quantiles (i.e., those distant from the mean) of the distributions. We, indeed, demonstrate that such classifiers exist and are attainable, and show that the design and implementation of such schemes work with the recently-introduced paradigm of Quantile Statistics (QS)-based classifiers. These classifiers, referred to as Classification by Moments of Quantile Statistics (CMQS), are essentially “Anti”-Bayesian in their modus operandi. To achieve our goal, in this paper we demonstrate the power and potential of CMQS to describe the very high-dimensional TC-related vector spaces in terms of a limited number of “outlier-based” statistics. Thereafter, the pattern recognition (PR) task in classification invokes the CMQS classifier for the underlying multi-class problem by using a linear number of pair-wise CMQS-based classifiers. Through rigorous testing on the standard 20-Newsgroups corpus, we show that CMQS-based TC attains accuracy that is comparable to the best-reported classifiers. We also propose the potential of fusing the results of a CMQS-based method with those obtained from a traditional scheme.

Autonomous and Intelligent Systems | 2011

Question type classification using a part-of-speech hierarchy

Richard Khoury

Question type (or answer type) classification is the task of determining the correct type of the answer expected for a given query. This is often done by defining or discovering syntactic patterns that represent the structure of typical queries of each type, and classifying a given query according to which pattern it satisfies. In this paper, we combine the idea of using informer spans as patterns with our own part-of-speech hierarchy in order to propose both a new approach to pattern-based question type classification and a new way of discovering the informers to be used as patterns. We show experimentally that using our part-of-speech hierarchy greatly improves type classification results, and allows our system to learn valid new informers.
