Masoud Makrehchi

University of Ontario Institute of Technology

Publication

Featured research published by Masoud Makrehchi.

Lecture Notes in Computer Science | 2003

Generation of fuzzy membership function using information theory measures and genetic algorithm

Masoud Makrehchi; Otman A. Basir; Mohamed S. Kamel

One of the most challenging issues in fuzzy systems design is generating suitable membership functions for fuzzy variables. This paper proposes a paradigm of applying an information theoretic model to generate fuzzy membership functions. After modeling fuzzy membership functions as fuzzy partitions, a genetic-algorithm-based optimization technique is presented to find suboptimal fuzzy partitions. To generate fuzzy membership functions from the fuzzy partitions, a heuristic criterion is also defined. Extensive numerical results and an evaluation procedure are provided to demonstrate the effectiveness of the proposed paradigm.

Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2013

Stock Prediction Using Event-Based Sentiment Analysis

Masoud Makrehchi; Sameena Shah; Wenhui Liao

We propose a novel approach to label social media text using significant stock market events (big losses or gains). Since stock events are easily quantifiable using returns from indices or individual stocks, they provide meaningful and automated labels. We extract significant stock movements and collect appropriate pre-event, post-event, and contemporaneous text from social media sources (for example, tweets from Twitter). Subsequently, we assign the respective label (positive or negative) to each tweet. We train a model on this collected set and make predictions for the labels of future tweets. We aggregate the net sentiment per day (amongst other metrics) and show that it holds significant predictive power for subsequent stock market movement. We create successful trading strategies based on this system and find significant returns over other baseline methods.
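
The labeling step can be illustrated with a minimal sketch. It assumes a fixed return threshold and same-day (contemporaneous) tweets only; the function names, the 2% threshold, and the toy data are hypothetical and are not taken from the paper.

```python
from datetime import date


def label_events(daily_returns, threshold=0.02):
    """Mark days whose return magnitude reaches the (assumed) threshold."""
    labels = {}
    for day, ret in daily_returns.items():
        if ret >= threshold:
            labels[day] = "positive"
        elif ret <= -threshold:
            labels[day] = "negative"
    return labels


def label_tweets(tweets, event_labels):
    """Give each tweet the label of its day; tweets on non-event days are dropped."""
    return [(text, event_labels[day]) for day, text in tweets if day in event_labels]


# Toy, made-up values used only to exercise the functions.
daily_returns = {
    date(2013, 1, 2): 0.025,
    date(2013, 1, 3): -0.004,
    date(2013, 1, 4): -0.031,
}
tweets = [
    (date(2013, 1, 2), "great open today"),
    (date(2013, 1, 3), "quiet session"),
    (date(2013, 1, 4), "selling everything"),
]
labeled = label_tweets(tweets, label_events(daily_returns))
```

The labeled pairs could then serve as training data for any text classifier.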

European Conference on Information Retrieval | 2008

Automatic extraction of domain-specific stopwords from labeled documents

Masoud Makrehchi; Mohamed S. Kamel

Automatic extraction of a domain-specific stopword list from a large labeled corpus is discussed. Most studies remove stopwords using a standard stopword list together with high and low document frequency thresholds. In this paper, a new approach for stopword extraction, based on the notion of backward filter-level performance and a sparsity measure of the training data, is proposed. First, we discuss the motivation for updating existing lists or building new ones. Second, based on the proposed backward filter-level performance, we examine the effectiveness of high document frequency filtering for stopword reduction. Finally, a new method for building general and domain-specific stopwords is proposed. The method assumes that a set of candidate stopwords must have minimum information content and prediction capacity, which can be estimated by classifier performance. The proposed approach is extensively compared with other methods including inverse document frequency and information gain. According to the comparative study, the proposed approach offers more promising results, which guarantee minimum information loss by filtering out most stopwords.
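
For orientation, the high document frequency filtering that the abstract examines as a baseline might look like the sketch below. This is not the proposed backward filter-level method; the function names, the whitespace tokenizer, and the `min_ratio` parameter are assumptions made for illustration.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(doc.lower().split()))
    return df


def high_df_candidates(docs, min_ratio=0.8):
    """Return terms that occur in at least min_ratio of the documents."""
    df = document_frequency(docs)
    cutoff = min_ratio * len(docs)
    return sorted(term for term, count in df.items() if count >= cutoff)


# Toy corpus, invented for this example.
docs = [
    "the market fell",
    "the fund rose",
    "the market rose",
    "a fund of the market",
]
general = high_df_candidates(docs)
looser = high_df_candidates(docs, min_ratio=0.75)
```

Lowering the ratio pulls in terms that are frequent only in this domain, which is one reason a fixed frequency cutoff can be a blunt instrument.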

Web Intelligence | 2007

Automatic Taxonomy Extraction Using Google and Term Dependency

Masoud Makrehchi; Mohamed S. Kamel

An automatic taxonomy extraction algorithm is proposed. Given a set of terms or terminology related to a subject domain, the proposed approach uses Google page counts to estimate the dependency links between the terms. A taxonomic link is an asymmetric relation between two concepts. Because mutual information and normalized Google distance are symmetric, neither can be employed to extract these directed links. Using a new measure, the information theoretic inclusion index, a term dependency matrix representing the pair-wise dependencies is obtained. Next, using a proposed algorithm, the dependency matrix is converted into an adjacency matrix representing the taxonomy tree. In order to evaluate the performance of the proposed approach, it is applied to several domains for taxonomy extraction.
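
The idea of an asymmetric dependency estimated from page counts can be sketched as follows. The abstract does not define the inclusion index, so this example substitutes a plain conditional probability P(parent | child) and a simple "best more-general term" rule; both are stand-ins, and the page counts are invented.

```python
def inclusion(parent, child, hits, joint_hits):
    """Estimate P(parent | child): share of pages with `child` that also mention `parent`."""
    return joint_hits[frozenset((parent, child))] / hits[child]


def pick_parents(terms, hits, joint_hits):
    """For each term, choose the most strongly including, more general term (or None)."""
    parents = {}
    for child in terms:
        best, best_score = None, 0.0
        for cand in terms:
            if cand == child:
                continue
            forward = inclusion(cand, child, hits, joint_hits)
            backward = inclusion(child, cand, hits, joint_hits)
            # Keep the link only in the direction where inclusion is stronger.
            if forward > backward and forward > best_score:
                best, best_score = cand, forward
        parents[child] = best
    return parents


# Hypothetical page counts, not real search-engine results.
hits = {"animal": 1000, "dog": 200, "poodle": 20}
joint_hits = {
    frozenset(("animal", "dog")): 150,
    frozenset(("animal", "poodle")): 12,
    frozenset(("dog", "poodle")): 18,
}
parents = pick_parents(list(hits), hits, joint_hits)
```

The resulting child-to-parent map is equivalent to an adjacency representation of a tree rooted at the term with no parent.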

Web Intelligence | 2006

Learning Social Networks from Web Documents Using Support Vector Classifiers

Masoud Makrehchi; Mohamed S. Kamel

Automatic generation of a social network requires extracting pair-wise relations between individuals. In this research, learning a social network from incomplete relationship data is proposed. It is assumed that only a small subset of relations between the individuals is known. With this assumption, social network extraction is translated into a text classification problem. The relation between two individuals is modeled by merging their document vectors, and the given relations are used as labels of the training data. With this transformation, a text classifier such as SVM is used for learning the unknown relations. We show that there is a link between the intrinsic sparsity of social networks and class distribution imbalance of the training data. In order to re-balance the unbalanced training data, a minority class down-sampling strategy is employed. The proposed framework is applied to a real FOAF (friend of a friend) database and evaluated by the macro-averaged F-measure.

Conference on Recommender Systems | 2011

Social link recommendation by learning hidden topics

Masoud Makrehchi

In this paper, a new approach to predicting the structure of a social network without any prior knowledge of the social links is proposed. In the absence of links among nodes, we assume there are other information resources associated with the nodes, which are called node profiles. The usual task of link prediction and recommendation from text data is to learn similarities between the nodes and then translate pair-wise similarities into social links. In other words, the process is to convert a similarity matrix into an adjacency matrix. In this paper, an alternative approach is proposed. First, hidden topics of node profiles are learned using Latent Dirichlet Allocation. Then, by mapping node-topic and topic-topic relations, a new structure called a semi-bipartite graph is generated, which is slightly different from a regular bipartite graph. Finally, by applying topological metrics such as Katz and short path scores to the new structure, we are able to rank and recommend relevant links to each node. The proposed technique is applied to several co-authorship networks. While most link prediction methods are low precision solutions, the proposed method performs effectively and offers high precision.
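
As a rough illustration of the Katz metric mentioned above, the sketch below computes a truncated Katz score, the sum over walk lengths l of beta^l times the number of walks of length l between two nodes. It runs on a plain adjacency matrix rather than the semi-bipartite graph of the paper, and `beta`, `max_len`, and the toy graph are assumed values.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def katz_scores(adj, beta=0.1, max_len=3):
    """Truncated Katz: sum of beta**l * (number of walks of length l), l = 1..max_len."""
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # holds adj**l, starting at l = 1
    for length in range(1, max_len + 1):
        weight = beta ** length
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        power = mat_mul(power, adj)
    return scores


# Toy path graph 0 - 1 - 2 - 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = katz_scores(adj)
```

Node 2 scores higher than node 3 as a candidate link for node 0, because shorter walks are damped less.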

International Conference on Social Computing | 2014

Winning by Following the Winners: Mining the Behaviour of Stock Market Experts in Social Media

Wenhui Liao; Sameena Shah; Masoud Makrehchi

We propose a novel yet simple method for creating a stock market trading strategy by following successful stock market experts in social media. The problem of “how and where to invest” is translated into “who to follow in my investment”. In other words, the search for a stock market investment strategy is converted into a stock market expert search. Fortunately, many stock market experts are active in social media and openly express their opinions about the market. By analyzing their behavior, mining their opinions and suggested actions on Twitter, and simulating their recommendations, we are able to score each expert based on his/her performance. Using this scoring system, the experts with the most successful trades are recommended. The main objective of this research is to identify traders who have historically outperformed the market, and to aggregate the opinions of such traders to recommend trades.

Journal of Intelligent and Fuzzy Systems | 2011

An information theoretic approach to generating fuzzy hypercubes for if-then classifiers

Masoud Makrehchi; Mohamed S. Kamel

In this paper, a framework for automatic generation of fuzzy membership functions and fuzzy rules from training data is proposed. The main focus of this paper is designing fuzzy if-then classifiers; however, the proposed method can be employed in designing a wide range of fuzzy system applications. After the fuzzy membership functions are modeled by their supports, an optimization technique, based on a multi-objective real coded genetic algorithm with adaptive crossover and mutation probabilities, is implemented to find near optimal supports. By employing an interpretability constraint in parameter representation and encoding, we ensure that the generated fuzzy membership functions have a semantic meaning. The fitness function of the genetic algorithm, which estimates the quality of the generated membership functions, consists of two elements: (i) the Shannon entropy and mutual information measures, which measure the diversity of the data distribution in a hypercube; and (ii) the number of generated fuzzy rules, which measures the compactness of the fuzzy system. Finally, membership functions are tuned to yield optimal classifier hypercubes, which represent the predictivity and discriminating power of the classifier. Fuzzy rules of the classifier are derived from the optimal hypercubes. Using the proposed approach to designing fuzzy if-then classifiers, we are also able to evaluate the generated membership functions and compare the results with those of other techniques previously reported in the literature. Using the experimental results, we show that the proposed approach outperforms other techniques at low resolutions. This means that the proposed approach can achieve satisfactory results with lower complexity.
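
The entropy component of such a fitness function can be pictured with a small sketch. It assumes that diversity is measured as the Shannon entropy of the class labels of the training samples falling inside one hypercube; that reading, the function name, and the toy labels are assumptions, not details given in the abstract.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Labels of the samples that fall inside two hypothetical hypercubes.
mixed_cell = ["a", "a", "b", "b"]
pure_cell = ["a", "a", "a", "a"]
mixed_entropy = shannon_entropy(mixed_cell)
pure_entropy = shannon_entropy(pure_cell)
```

Under this reading, a hypercube with low entropy is dominated by one class and would map cleanly to a single if-then rule.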

International Journal of Pattern Recognition and Artificial Intelligence | 2011

IMPACT OF TERM DEPENDENCY AND CLASS IMBALANCE ON THE PERFORMANCE OF FEATURE RANKING METHODS

Masoud Makrehchi; Mohamed S. Kamel

Feature ranking is widely employed to deal with high dimensionality in text classification. The main advantage of feature ranking methods is their low cost and simple algorithms. However, they suffer from some drawbacks which cause low performance compared to wrapper approach feature selection methods. In this paper, three major drawbacks of feature ranking methods are discussed. First, we show that feature ranking methods are highly problem dependent. For designing an effective feature ranking method and appropriate ranking threshold, we need background knowledge including the data set characteristics as well as the classifier to be used. Second, the feature ranking methods are univariate functions, while the nature of text classification is multivariate. It means that in these methods, correlation between terms is ignored. Finally, they fail in multiple class problems with unbalanced class distribution because they pay more attention to the simpler and larger classes. In this paper, these drawbacks, especially the last two issues, are experimentally investigated using a set of extensive numerical experiments with several data sets and feature scoring measures.

International Conference on Digital Information Management | 2010

Query-relevant document representation for text clustering

Masoud Makrehchi

In text categorization, one well-known document representation is bag-of-words. Although it is simple and popular, it ignores semantics, underlying linguistic information, and word correlations. In this paper, a new representation for text data is proposed, which is called Bag-Of-Queries (BOQ). First, a taxonomy of the terms in the local vocabulary is extracted. The taxonomy is extracted by learning term dependencies using an information theoretic inclusion index. Next, the taxonomy is partitioned to generate a set of correlated terms, or bag of queries. Since every two partitions belong to different concepts, they are considered semantically orthogonal queries. This provides a new space of orthogonal features, which is necessary for efficient categorization. Finally, instead of using terms as features, we use them to build a set of queries. Documents are ranked in response to the queries using a similarity measure. The similarity indices are considered as new features in a vector space model representation. The proposed approach outperforms bag-of-words-based clustering. It also extracts new non-redundant features and at the same time reduces dimensionality.
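
The final step, turning query similarities into features, might be sketched as below. The abstract does not name the similarity measure, so cosine similarity over raw term counts is assumed here; the two queries are hand-written stand-ins for taxonomy partitions, and all names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def boq_features(document, queries):
    """Represent a document by its similarity to each query (one feature per query)."""
    doc_vec = Counter(document.lower().split())
    return [cosine(doc_vec, Counter(query)) for query in queries]


# Hand-written queries standing in for partitions of a learned taxonomy.
queries = [
    ["stock", "market", "trading"],
    ["fuzzy", "membership", "rule"],
]
features = boq_features("stock market stock", queries)
```

Each document is reduced to as many features as there are queries, which is where the dimensionality reduction comes from.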
