Morteza Zihayat | Researchain

Publication

Featured researches published by Morteza Zihayat.

European Conference on Machine Learning | 2012

Efficient bi-objective team formation in social networks

Mehdi Kargar; Aijun An; Morteza Zihayat

We tackle the problem of finding a team of experts from a social network to complete a project that requires a set of skills. The social network is modeled as a graph. A node in the graph represents an expert and has a weight representing the monetary cost for using the expert service. Two nodes in the graph can be connected and the weight on the edge represents the communication cost between the two corresponding experts. Given a project, our objective is to find a team of experts that covers all the required skills and also minimizes the communication cost as well as the personnel cost of the project. To minimize both of the objectives, we define a new combined cost function which is based on the linear combination of the objectives (i.e. communication and personnel costs). We show that the problem of minimizing the combined cost function is an NP-hard problem. Thus, one approximation algorithm is proposed to solve the problem. The proposed approximation algorithm is bounded and the approximation ratio of the algorithm is proved in the paper. Three heuristic algorithms based on different intuitions are also proposed for solving the problem. Extensive experiments on real datasets demonstrate the effectiveness and scalability of the proposed algorithms.
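The combined cost described in this abstract can be illustrated with a minimal sketch. The trade-off weight `lam`, the function name `combined_cost`, and the choice to measure communication cost as the sum of pairwise edge weights within the team are assumptions made for illustration; the abstract only states that the two costs are combined linearly.

```python
from itertools import combinations


def combined_cost(team, node_cost, edge_cost, lam=0.5):
    """Linear combination of communication and personnel cost for a team.

    lam is a hypothetical trade-off weight in [0, 1]. Communication cost is
    assumed here to be the sum of pairwise edge weights within the team.
    """
    personnel = sum(node_cost[expert] for expert in team)
    communication = sum(
        edge_cost[frozenset(pair)] for pair in combinations(sorted(team), 2)
    )
    return lam * communication + (1 - lam) * personnel


# Toy expert network: node weights are personnel costs,
# edge weights are communication costs.
node_cost = {"ann": 4.0, "bob": 2.0, "cy": 3.0}
edge_cost = {
    frozenset({"ann", "bob"}): 1.0,
    frozenset({"ann", "cy"}): 5.0,
    frozenset({"bob", "cy"}): 2.0,
}

if __name__ == "__main__":
    print(combined_cost({"ann", "bob"}, node_cost, edge_cost, lam=0.5))
```

Setting `lam` closer to 1 favors teams that communicate cheaply, while values closer to 0 favor teams with low personnel cost.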

Information Sciences | 2014

Mining top-k high utility patterns over data streams

Morteza Zihayat; Aijun An

Online high utility itemset mining over data streams has been studied recently. However, the existing methods are not designed for producing top-k patterns. Since there could be a large number of high utility patterns, finding only top-k patterns is more attractive than producing all the patterns whose utility is above a threshold. A challenge with finding top-k high utility itemsets over data streams is that it is not easy for users to determine a proper minimum utility threshold in order for the method to work efficiently. In this paper, we propose a new method (named T-HUDS) for finding top-k high utility patterns over sliding windows of a data stream. The method is based on a compressed tree structure, called HUDS-tree, that can be used to efficiently find potential top-k high utility itemsets over sliding windows. T-HUDS uses a new utility estimation model to more effectively prune the search space. We also propose several strategies for initializing and dynamically adjusting the minimum utility threshold. We prove that no top-k high utility itemset is missed by the proposed method. Our experimental results on real and synthetic datasets show that our strategies and new utility estimation model work very effectively and that T-HUDS outperforms two state-of-the-art high utility itemset algorithms substantially in terms of execution time and memory storage.
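The idea of raising the minimum utility threshold while searching for top-k patterns can be sketched on a single window as follows. This is not T-HUDS and does not use the HUDS-tree: it enumerates itemsets exhaustively, and the data layout (each transaction as a dict from item to utility) and all function names are assumptions chosen for illustration.

```python
import heapq
from itertools import combinations


def itemset_utility(itemset, window):
    """Sum the utility of `itemset` over transactions containing all its items."""
    total = 0
    for tx in window:  # tx maps item -> utility of that item in the transaction
        if all(item in tx for item in itemset):
            total += sum(tx[item] for item in itemset)
    return total


def top_k_itemsets(window, k):
    """Return the k highest-utility itemsets and the final threshold."""
    items = sorted({item for tx in window for item in tx})
    heap = []       # min-heap of (utility, itemset); the root is the k-th best
    threshold = 0   # a real miner would use this value to prune candidates
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, window)
            if utility == 0:
                continue  # itemset never occurs in this window
            if len(heap) < k:
                heapq.heappush(heap, (utility, itemset))
            elif utility > heap[0][0]:
                heapq.heapreplace(heap, (utility, itemset))
            if len(heap) == k:
                threshold = heap[0][0]  # threshold rises as better sets appear
    return sorted(heap, reverse=True), threshold


window = [{"a": 5, "b": 2}, {"a": 3, "c": 4}, {"b": 1, "c": 6}]

if __name__ == "__main__":
    print(top_k_itemsets(window, k=2))
```

The user supplies only `k`; the threshold is derived from the k-th best utility seen so far, which is the motivation the abstract gives for top-k mining.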

Proceedings of the ASE BigData & SocialInformatics 2015 on | 2015

Mining High Utility Sequential Patterns from Evolving Data Streams

Morteza Zihayat; Cheng-Wei Wu; Aijun An; Vincent S. Tseng

In this paper, we define the problem of mining high utility sequential patterns (HUSPs) over high-velocity streaming data and propose an efficient algorithm for mining HUSPs over a data stream. The main challenges we tackle include how to maintain a compact summary of the data stream to reflect the evolution of sequence utilities over time and how to overcome the problem of combinatorial explosion of a search space. We propose a compact data structure named HUSP-Tree to maintain the essential information for mining HUSPs in an online fashion. An efficient and single-pass algorithm named HUSP-Stream is proposed to generate HUSPs from HUSP-Tree. HUSP-Stream uses a new utility estimation model to more effectively prune the search space. Experimental results on real and synthetic datasets show that our algorithm serves as an efficient solution to the new problem of mining high utility sequential patterns over data streams.

Intelligent Data Analysis | 2017

Efficiently mining high utility sequential patterns in static and streaming data

Morteza Zihayat; Cheng-Wei Wu; Aijun An; Vincent S. Tseng; Chien Lin

High utility sequential pattern (HUSP) mining has emerged as a novel topic in data mining. Although some preliminary works have been conducted on this topic, they incur the problem of producing a large search space for high utility sequential patterns. In addition, they mainly focus on mining HUSPs in static databases and do not take streaming data into account, where unbounded data come continuously and often at a high speed. To efficiently deal with both problems, we propose a novel framework for mining high utility sequential patterns over static and streaming databases. In this regard, two efficient data structures named ItemUtilLists (Item Utility Lists) and HUSP-Tree (High Utility Sequential Pattern Tree) are proposed to maintain essential information for mining HUSPs in both offline and online fashions. In addition, a novel utility model called Sequence-Suffix Utility is proposed for effectively pruning the search space in HUSP mining. We propose an algorithm named HUSP-Miner (High Utility Sequential Pattern Miner) to find HUSPs in static databases efficiently. Then, a one-pass algorithm named HUSP-Stream (High Utility Sequential Pattern mining over Data Streams) is proposed to incrementally update ItemUtilLists and HUSP-Tree online and find HUSPs over data streams. To the best of our knowledge, HUSP-Stream is the first method to find HUSPs over data streams. Experimental results on both real and synthetic datasets show that HUSP-Miner outperforms the compared algorithms substantially in terms of execution time, memory usage and number of generated candidates. The experiments also demonstrate the impressive performance of HUSP-Stream in updating the data structures and discovering HUSPs over data streams.

Machine Learning | 2017

Memory-adaptive high utility sequential pattern mining over data streams

Morteza Zihayat; Yan Chen; Aijun An

High utility sequential pattern (HUSP) mining has emerged as an important topic in data mining. A number of studies have been conducted on mining HUSPs, but they are mainly intended for non-streaming data and thus do not take data stream characteristics into consideration. Streaming data are fast changing, continuously generated, and unbounded in quantity. Such data can easily exhaust computer resources (e.g., memory) unless proper resource-aware mining is performed. In this study, we explore the fundamental problem of how limited memory can be best utilized to produce high quality HUSPs over a data stream. We design an approximation algorithm, called MAHUSP, that employs memory adaptive mechanisms to use a bounded portion of memory, in order to efficiently discover HUSPs over data streams. An efficient tree structure, called MAS-Tree, is proposed to store potential HUSPs over a data stream. MAHUSP guarantees that all HUSPs are discovered in certain circumstances. Our experimental study shows that our algorithm can not only discover HUSPs over data streams efficiently, but also adapt to memory allocation with limited sacrifices in the quality of discovered HUSPs. Furthermore, in order to show the effectiveness and efficiency of MAHUSP in real-life applications, we apply our proposed algorithm to a web clickstream dataset obtained from a Canadian news portal to showcase users’ reading behavior, and to a real biosequence database to identify disease-related gene regulation sequential patterns. The results show that MAHUSP effectively discovers useful and meaningful patterns in both cases.

Proceedings of the 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) on | 2014

Two-Phase Pareto Set Discovery for Team Formation in Social Networks

Morteza Zihayat; Mehdi Kargar; Aijun An

In this paper, we study the problem of finding teams of experts from an expert network while optimizing three objectives. Given a project, the objective is to find teams of experts that cover all the required skills and also optimize the communication cost as well as the personnel cost and the expertise level of the team members. The expert network is modeled as a graph, where nodes represent experts and edges between nodes specify the communication costs between the experts. We are interested in finding a Pareto front of teams that not only cover the required skills but are also not dominated by other feasible teams with respect to the three criteria. Since the problem is NP-hard, we propose algorithms to use with a two-phase method to find an approximation of the Pareto front for the three-criteria team formation problem. In the first phase, an initial population composed of an approximation of the supported efficient teams is generated. Then, a Pareto local search method is applied to each solution of the initial population to find other members of the Pareto front. The proposed method is evaluated on the DBLP data set. The results indicate its superior performance compared with other methods in terms of running time and the quality of answers.
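The notion of a team being "not dominated" under three criteria can be made concrete with a small sketch. It filters a given list of candidate teams exhaustively and is not the two-phase method from the paper; the tuple layout `(communication_cost, personnel_cost, expertise)` and the function names are assumptions for illustration, with the two costs minimized and expertise maximized.

```python
def dominates(a, b):
    """True if team a is no worse than b on every criterion and better on one.

    Each team is (communication_cost, personnel_cost, expertise); the two
    costs are minimized and expertise is maximized.
    """
    no_worse = a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2]
    better = a[0] < b[0] or a[1] < b[1] or a[2] > b[2]
    return no_worse and better


def pareto_front(teams):
    """Keep only the teams that no other candidate team dominates."""
    return [t for t in teams if not any(dominates(o, t) for o in teams)]


# Hypothetical candidate teams that all cover the required skills.
teams = [(3, 10, 7), (4, 12, 6), (2, 15, 7), (3, 10, 9)]

if __name__ == "__main__":
    print(pareto_front(teams))
```

In this toy example `(3, 10, 9)` dominates the first two candidates, while `(2, 15, 7)` survives because it has the lowest communication cost.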

BMC Systems Biology | 2017

Mining significant high utility gene regulation sequential patterns

Morteza Zihayat; Heidar Davoudi; Aijun An

Background: Mining frequent gene regulation sequential patterns in time course microarray datasets is an important mining task in bioinformatics. Although finding such patterns is of paramount importance for studying a disease, most existing work does not consider gene-disease association during gene regulation sequential pattern discovery. Moreover, existing methods consider the presence or absence of genes during the mining process rather than taking the degrees of gene expression into account. Consequently, such techniques discover too many patterns, which may not give biologists the information they need to investigate the relationships between the disease and the underlying reasons hidden in gene regulation sequences.

Results: We propose a utility model that considers both the gene-disease association score and the degrees of expression levels under a biological investigation. We propose an efficient method, called Top-HUGS, for discovering significant high utility gene regulation sequential patterns from a time-course microarray dataset.

Conclusions: In this study, the proposed methods were evaluated on a publicly available time course microarray dataset. The experimental results show higher accuracies compared to the baseline methods. Our proposed methods found several new gene regulation sequential patterns that were useful for biologists and provided further insights into the mechanisms underpinning biological processes. To make the proposed method easier to use, a web interface to our system was developed using Java. To the best of our knowledge, this is the first demonstration of significant high utility gene regulation sequential pattern discovery.

International Conference on Big Data | 2016

Distributed and parallel high utility sequential pattern mining

Morteza Zihayat; Zane Zhenhua Hu; Aijun An; Yonggang Hu

The problem of mining high utility sequential patterns (HUSPs) has been studied recently. Existing solutions are mostly memory-based and assume that the data can fit into the main memory of a computer. However, with the advent of big data, such an assumption no longer holds. Hence, existing algorithms are not applicable to big data environments, where data are often distributed and too large to be dealt with by a single machine. In this paper, we propose a new framework for mining HUSPs in big data. A distributed and parallel algorithm called BigHUSP is proposed to discover HUSPs efficiently. At its heart, BigHUSP uses multiple MapReduce-like steps to process data in parallel. We also propose a number of pruning strategies to minimize the search space in a distributed environment, and thus decrease computational and communication costs, while still maintaining correctness. Our experiments with real-life and large synthetic datasets validate the effectiveness of BigHUSP for mining HUSPs from large sequence datasets.

Bioinformatics and Biomedicine | 2016

Top-k utility-based gene regulation sequential pattern discovery

Morteza Zihayat; Heidar Davoudi; Aijun An

Sequential pattern mining has been used in bioinformatics to discover frequent gene regulation sequential patterns based on time course microarray datasets. While mining frequent sequences is important in biological studies for disease treatment, to date, most approaches do not consider the importance of the genes with respect to the disease being studied when identifying gene regulation sequential patterns. In addition, they focus on the more general up/down effects of genes in a microarray dataset and do not take into account the various degrees of expression during the mining process. As a result, the current techniques return too many sequences, which may not be informative enough for biologists to explore relationships between the disease and underlying causes encoded in gene regulation sequences. In this paper, we propose a utility model that considers both the importance of genes with respect to a disease and their degrees of expression levels under a biological investigation. Then, we design a new method, called TU-SEQ, for identifying top-k high utility gene regulation sequential patterns from a time-course microarray dataset. The evaluation results show that our approach can effectively and efficiently discover key patterns representing meaningful gene regulation sequential patterns in a time course microarray dataset.

Knowledge Discovery and Data Mining | 2018

Adaptive Paywall Mechanism for Digital News Media

Heidar Davoudi; Aijun An; Morteza Zihayat; Gordon Edall

Many online news agencies utilize the paywall mechanism to increase reader subscriptions. This method offers a non-subscribed reader a fixed number of free articles in a period of time (e.g., a month), and then directs the user to the subscription page for further reading. We argue that there is no direct relationship between the number of paywalls presented to readers and the number of subscriptions, and that this artificial barrier, if not used well, may disengage potential subscribers and thus may not well serve its purpose of increasing revenue. Moreover, the current paywall mechanism neither considers the user browsing history nor the potential articles which the user may visit in the future. Thus, it treats all readers equally and does not consider the potential of a reader in becoming a subscriber. In this paper, we propose an adaptive paywall mechanism to balance the benefit of showing an article against that of displaying the paywall (i.e., terminating the session). We first define the notion of cost and utility that are used to define an objective function for optimal paywall decision making. Then, we model the problem as a stochastic sequential decision process. Finally, we propose an efficient policy function for paywall decision making. The experimental results on a real dataset from a major newspaper in Canada show that the proposed model outperforms the traditional paywall mechanism as well as the other baselines.