Seyed-Mehdi-Reza Beheshti
University of New South Wales
Publications
Featured research published by Seyed-Mehdi-Reza Beheshti.
Business Process Management | 2011
Seyed-Mehdi-Reza Beheshti; Boualem Benatallah; Hamid Reza Motahari-Nezhad; Sherif Sakr
The execution of a business process (BP) in today's enterprises may involve a workflow and multiple IT systems and services. Often, no complete, up-to-date documentation of the model or of the correlation information of process events exists. Understanding the execution of a BP in terms of its scope and details is challenging, especially as it is subjective: it depends on the perspective of the person looking at the BP execution. We present a framework, simple abstractions, and a language for the explorative querying and understanding of BP execution from various user perspectives. We propose a query language for analyzing event logs of process-related systems based on two concepts, folders and paths, which enable an analyst to group related events in the logs or find paths among events. Folders and paths can be stored for use in follow-on analysis. We have implemented the proposed techniques and the language, FPSPARQL, by extending the SPARQL graph query language. We present evaluation results on the performance and the quality of the results using a number of process event logs.
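The folder and path concepts can be illustrated with standard SPARQL 1.1 property paths, which FPSPARQL builds on. Below is a minimal sketch using rdflib; the ex: vocabulary, the tiny event log, and the query are invented for illustration and do not show FPSPARQL's actual folder/path syntax.

```python
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")
g = Graph()
# A tiny process event log: e1 -> e2 -> e3, all belonging to one case.
for a, b in [("e1", "e2"), ("e2", "e3")]:
    g.add((EX[a], EX.followedBy, EX[b]))
for e in ["e1", "e2", "e3"]:
    g.add((EX[e], RDF.type, EX.Event))
    g.add((EX[e], EX.caseId, Literal("case-42")))

# "Folder": group the events of one case. "Path": events reachable from e1
# via one or more followedBy hops (a SPARQL 1.1 property path).
q = """
PREFIX ex: <http://example.org/>
SELECT ?e WHERE {
  ex:e1 ex:followedBy+ ?e .
  ?e ex:caseId "case-42" .
}
"""
for row in g.query(q):
    print(row.e)  # prints ex:e2 and ex:e3
```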
Very Large Data Bases | 2015
Mohammad Hammoud; Dania Abed Rabbou; Reza Nouri; Seyed-Mehdi-Reza Beheshti; Sherif Sakr
The Resource Description Framework (RDF) and the SPARQL query language are gaining wide popularity and acceptance. In this paper, we present DREAM, a distributed and adaptive RDF system. As opposed to existing RDF systems, DREAM avoids partitioning RDF datasets and partitions only SPARQL queries. By not partitioning datasets, DREAM offers a general paradigm for different types of pattern matching queries and entirely averts intermediate data shuffling (only auxiliary data are shuffled). In addition, by partitioning queries, DREAM provides an adaptive scheme which automatically runs queries on various numbers of machines depending on their complexity. In essence, DREAM combines the advantages of state-of-the-art centralized and distributed RDF systems, whereby data communication is avoided and cluster resources are aggregated, while precluding their disadvantages, namely limited system resources and hindering communication overhead. DREAM achieves all its goals by employing a novel graph-based, rule-oriented query planner and a new cost model. We implemented DREAM and conducted comprehensive experiments on a private cluster and on the Amazon EC2 platform. Results show that DREAM can significantly outperform three related popular RDF systems.
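As a rough illustration of the adaptive idea described above, one can estimate a query's complexity from its join structure and pick the number of machines accordingly. The complexity measure and the machine-count rule below are invented stand-ins; DREAM's actual graph-based, rule-oriented planner and cost model are not reproduced here.

```python
from collections import Counter

def count_join_vertices(triple_patterns):
    """Count variables that occur in more than one position, i.e. joins."""
    occurrences = Counter()
    for pattern in triple_patterns:
        for term in pattern:
            if term.startswith("?"):
                occurrences[term] += 1
    return sum(1 for n in occurrences.values() if n > 1)

def choose_machines(triple_patterns, max_machines=8):
    # Simple stand-in for a cost model: one machine per join vertex.
    return max(1, min(max_machines, count_join_vertices(triple_patterns)))

query = [("?person", "worksAt", "?org"),
         ("?person", "knows", "?friend"),
         ("?friend", "worksAt", "?org")]
print(choose_machines(query))  # 3 join vertices -> run on 3 machines
```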
Cluster Computing | 2015
Omar Batarfi; Radwa El Shawi; Ayman G. Fayoumi; Reza Nouri; Seyed-Mehdi-Reza Beheshti; Ahmed Barnawi; Sherif Sakr
A graph is a fundamental data structure that captures relationships between data entities. In practice, graphs are widely used for modeling complex data in application domains such as social networks, protein networks, transportation networks, bibliographic networks, knowledge bases, and many more. Graphs with millions and billions of nodes and edges have become very common. Graph analytics is an important big data discovery technique; with the increasing abundance of large graphs, designing scalable systems for processing and analyzing large-scale graphs has become one of the most timely problems facing the big data research community. Scalable processing of big graphs is a challenging task due to their size and the inherently irregular structure of graph computations, and recent years have seen an unprecedented interest in building big graph processing systems that attempt to tackle these challenges. In this article, we provide a comprehensive survey of the state of the art in large-scale graph processing platforms. In addition, we present an extensive experimental study of five popular systems in this domain, namely GraphChi, Apache Giraph, GPS, GraphLab, and GraphX. In particular, we report and analyze the performance characteristics of these systems using five common graph processing algorithms and seven large graph datasets. Finally, we identify a set of current open research challenges and discuss promising directions for future research in the domain of large-scale graph processing.
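Several of the surveyed systems (Giraph, GPS, GraphLab) follow the Pregel-style vertex-centric model. As a point of reference, here is a minimal single-machine sketch of that model for PageRank; it is not code from any of the five systems, and is only meant to show the kind of computation these platforms distribute.

```python
from collections import defaultdict

def pagerank(edges, num_iters=20, d=0.85):
    """Vertex-centric PageRank: in each 'superstep', every vertex sends
    rank/out-degree to its neighbours, then updates its own rank."""
    out = defaultdict(list)
    nodes = set()
    for s, t in edges:
        out[s].append(t)
        nodes.update((s, t))
    rank = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(num_iters):
        incoming = defaultdict(float)
        for v, targets in out.items():
            for t in targets:
                incoming[t] += rank[v] / len(targets)
        rank = {v: (1 - d) / len(nodes) + d * incoming[v] for v in nodes}
    return rank

print(pagerank([("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]))
```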
Distributed and Parallel Databases | 2016
Seyed-Mehdi-Reza Beheshti; Boualem Benatallah; Hamid Reza Motahari-Nezhad
In today’s knowledge-, service-, and cloud-based economy, businesses accumulate massive amounts of data from a variety of sources. Understanding a business may require considerable analytics over large, hybrid collections of heterogeneous and partially unstructured data captured about process execution. This data, usually modeled as graphs, increasingly shows all the typical properties of big data: wide physical distribution, diversity of formats, non-standard data models, and independently managed, heterogeneous semantics. We use the term big process graph to refer to such large hybrid collections of heterogeneous and partially unstructured process-related execution data. Online analytical processing (OLAP) of big process graphs is challenging, as extending existing OLAP techniques to the analysis of graphs is not straightforward. Moreover, process data analysis methods should be capable of processing and querying large amounts of data effectively and efficiently, and therefore have to scale well with the underlying infrastructure. While traditional analytics solutions (relational databases, data warehouses, and OLAP) do a great job of collecting data and providing answers to known questions, key business insights remain hidden in the interactions among objects: for example, it is hard to discover concept hierarchies for entities based on both the data objects and their interactions in process graphs. In this paper, we introduce a framework and a set of methods to support scalable graph-based OLAP analytics over process execution data. The goal is to facilitate analytics over big process graphs by summarizing the process graph and providing multiple views at different levels of granularity. To achieve this goal, we present a model for process OLAP (P-OLAP) and define OLAP-specific abstractions in the process context, such as process cubes, dimensions, and cells. We present a MapReduce-based graph processing engine to support big data analytics over process graphs. We have implemented the P-OLAP framework and integrated it into our existing process data analytics platform, ProcessAtlas, which introduces a scalable architecture for querying, exploring, and analyzing large process data. We report on experiments performed on both synthetic and real-world datasets that show the viability and efficiency of the approach.
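To give a concrete flavor of graph summarization at different granularities, the toy sketch below rolls an interaction graph up along one invented dimension, the actor's department, and counts edges between the summarized nodes in a map/reduce style. The schema and dimension function are made up for illustration; P-OLAP's actual cube model and MapReduce engine are richer than this.

```python
from collections import Counter

# Interaction edges in a tiny process graph, plus one invented dimension.
edges = [("alice", "bob"), ("alice", "carol"), ("bob", "dave"), ("carol", "dave")]
department = {"alice": "sales", "bob": "sales", "carol": "legal", "dave": "legal"}

def map_phase(edges):
    # Roll each endpoint up from actor to department (a coarser cell).
    for src, dst in edges:
        yield ((department[src], department[dst]), 1)

def reduce_phase(pairs):
    # Count edges between the summarized nodes.
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts

print(reduce_phase(map_phase(edges)))
# Counter({('sales', 'legal'): 2, ('sales', 'sales'): 1, ('legal', 'legal'): 1})
```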
Asia-Pacific Web Conference | 2013
Mohammad Allahbakhsh; Aleksandar Ignjatovic; Boualem Benatallah; Seyed-Mehdi-Reza Beheshti; Elisa Bertino; Norman Y. Foo
Online rating systems are subject to unfair evaluations. Users may try, individually or collaboratively, to promote or demote a product. Collaborative unfair rating, i.e., collusion, is more damaging than individual unfair rating. Detecting massive collusive attacks as well as honest-looking intelligent attacks is still a real challenge for collusion detection systems. In this paper, we study the impact of collusion in online rating systems and assess their susceptibility to collusion attacks. The proposed model uses a frequent itemset mining technique to detect candidate collusion groups and sub-groups. Several indicators are then used to identify collusion groups and to estimate how damaging such colluding groups might be. The model has been implemented, and we present the results of an experimental evaluation of our methodology.
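The candidate-detection step can be sketched with a single level of Apriori-style frequent itemset mining over "who rated what" data. The dataset, support threshold, and group size below are invented, and the paper's follow-on indicators (damage estimation, etc.) are not reproduced.

```python
from collections import Counter
from itertools import combinations

ratings = {  # product -> set of users who rated it
    "p1": {"u1", "u2", "u3"},
    "p2": {"u1", "u2", "u3"},
    "p3": {"u1", "u2", "u3", "u4"},
    "p4": {"u4", "u5"},
}
MIN_SUPPORT = 3  # group must co-rate at least 3 products
GROUP_SIZE = 3   # look for groups of 3 users (one Apriori level)

support = Counter()
for users in ratings.values():
    for group in combinations(sorted(users), GROUP_SIZE):
        support[group] += 1

candidates = [g for g, s in support.items() if s >= MIN_SUPPORT]
print(candidates)  # [('u1', 'u2', 'u3')] -- a candidate collusion group
```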
Web Information Systems Engineering | 2012
Seyed-Mehdi-Reza Beheshti; Boualem Benatallah; Hamid Reza Motahari-Nezhad; Mohammad Allahbakhsh
Graphs are essential modeling and analytical objects for representing information networks. Existing approaches to on-line analytical processing (OLAP) on graphs took a first step by supporting multi-level and multi-dimensional queries on graphs, but they do not provide a semantics-driven framework and a language to support n-dimensional computations, which are frequent in OLAP environments. The major challenge is how to extend decision support to multidimensional networks, considering both data objects and the relationships among them. Moreover, one of the critical deficiencies of graph query languages, e.g., SPARQL, is the lack of support for n-dimensional computations. In this paper, we propose a graph data model, GOLAP, for online analytical processing on graphs. This data model extends decision support to multidimensional networks, considering both data objects and the relationships among them, and we extend SPARQL to support n-dimensional computations. The approaches presented in this paper have been implemented on top of FPSPARQL, the Folder-Path enabled extension of SPARQL, and experimentally validated on synthetic and real-world datasets.
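For context on what stock SPARQL does and does not offer here: standard SPARQL 1.1 already supports simple grouped aggregation, as in the rdflib sketch below over an invented ex: vocabulary. GOLAP's n-dimensional operators go beyond this baseline and are not reproduced here.

```python
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/")
g = Graph()
# Invented fact table: sale -> (year, country, amount).
data = [("s1", "2011", "AU", 3), ("s2", "2011", "US", 5), ("s3", "2012", "AU", 2)]
for sale, year, country, amount in data:
    g.add((EX[sale], EX.year, Literal(year)))
    g.add((EX[sale], EX.country, Literal(country)))
    g.add((EX[sale], EX.amount, Literal(amount)))

# A two-dimensional group-by: the flavor of OLAP that plain SPARQL covers.
q = """
PREFIX ex: <http://example.org/>
SELECT ?year ?country (SUM(?a) AS ?total) WHERE {
  ?s ex:year ?year ; ex:country ?country ; ex:amount ?a .
} GROUP BY ?year ?country
"""
for year, country, total in g.query(q):
    print(year, country, total)
```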
Conference on Advanced Information Systems Engineering | 2013
Seyed-Mehdi-Reza Beheshti; Boualem Benatallah; Hamid Reza Motahari-Nezhad
Processes in case management applications are flexible, knowledge-intensive, and people-driven, and are often used as guides for workers in the processing of artifacts. An important aspect is the evolution of process artifacts over time as they are touched by different people in the context of a knowledge-intensive process. This highlights the need for tracking process artifacts in order to determine their history (artifact versioning) and their provenance (where they come from, and who touched them and did what to them). We present a framework, simple abstractions, and a language for analyzing cross-cutting aspects (in particular, versioning and provenance) of process artifacts. We introduce two concepts: timed-folders, which represent the evolution of artifacts over time, and activity-paths, which represent the process that led to an artifact. The introduced approaches have been implemented on top of FPSPARQL, the Folder-Path enabled extension of SPARQL, and experimentally validated on real-world datasets.
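The versioning and provenance questions above can be illustrated with a plain Python sketch. The event schema is invented; FPSPARQL's timed-folder and activity-path constructs are query-language features, not this class.

```python
from dataclasses import dataclass

@dataclass
class Touch:
    ts: int       # timestamp
    actor: str    # who touched the artifact
    action: str   # what they did
    version: str  # resulting artifact version

history = [  # a "timed folder": one artifact's touch events, kept together
    Touch(1, "alice", "create", "v1"),
    Touch(5, "bob", "edit", "v2"),
    Touch(9, "carol", "approve", "v2-approved"),
]

def version_at(history, t):
    """Versioning: the latest version no later than time t."""
    past = [e for e in history if e.ts <= t]
    return max(past, key=lambda e: e.ts).version if past else None

def activity_path(history):
    """Provenance: the ordered chain of (actor, action) that led here."""
    return [(e.actor, e.action) for e in sorted(history, key=lambda e: e.ts)]

print(version_at(history, 6))  # v2
print(activity_path(history))  # [('alice', 'create'), ('bob', 'edit'), ('carol', 'approve')]
```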
Computing | 2017
Seyed-Mehdi-Reza Beheshti; Boualem Benatallah; Srikumar Venugopal; Seung Hwan Ryu; Hamid Reza Motahari-Nezhad; Wei Wang
Information extraction (IE) is the task of automatically extracting structured information from unstructured or semi-structured machine-readable documents. Among IE tasks, extracting actionable intelligence from an ever-increasing amount of data depends critically upon cross-document coreference resolution (CDCR): the task of identifying entity mentions across information sources that refer to the same underlying entity. CDCR is the basis of knowledge acquisition and is at the heart of Web search, recommendation, and analytics. Real-time processing of CDCR is important and has various applications in discovering must-know information for clients in finance, the public sector, news, and crisis management. As CDCR is an emerging area of research and practice, the reported literature on its challenges and solutions is growing fast but is scattered, owing to the large problem space, the variety of applications, and datasets on the order of tera- to peta-bytes. To fill this gap, we provide a systematic review of the state of the art in challenges and solutions for the CDCR process. We identify a set of quality attributes that have been frequently reported in the context of CDCR processes, to be used as a guide for identifying important and outstanding issues for further investigation. Finally, we assess existing tools and techniques for CDCR subtasks and provide guidance on the selection of tools and algorithms.
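As a baseline illustration of the CDCR task itself (not of any system surveyed in the paper), the sketch below clusters entity mentions using a naive token-overlap similarity plus union-find. The mentions and threshold are invented, and the false merge the code produces shows exactly why real CDCR systems need richer features.

```python
def jaccard(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

mentions = ["Barack Obama", "President Obama", "B. Obama", "Michelle Obama"]
parent = list(range(len(mentions)))  # union-find over mention indices

def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path compression
        i = parent[i]
    return i

THRESHOLD = 0.3
for i in range(len(mentions)):
    for j in range(i + 1, len(mentions)):
        if jaccard(mentions[i], mentions[j]) >= THRESHOLD:
            parent[find(j)] = find(i)  # merge the two clusters

clusters = {}
for i, m in enumerate(mentions):
    clusters.setdefault(find(i), []).append(m)
print(list(clusters.values()))
# One cluster with all four mentions -- including "Michelle Obama", a false
# merge: token overlap alone cannot tell the two people apart.
```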
International World Wide Web Conferences | 2017
Seyed-Mehdi-Reza Beheshti; Alireza Tabebordbar; Boualem Benatallah; Reza Nouri
Big data analytics is firmly recognized as a strategic priority for modern enterprises. At the heart of big data analytics lies the data curation process, which consists of the tasks that transform raw data (unstructured, semi-structured, and structured data sources) into curated data, i.e., contextualized data and knowledge that is maintained and made available for use by end-users and applications. To achieve this, the data curation process may involve techniques and algorithms for extracting, classifying, linking, merging, enriching, sampling, and summarizing data and knowledge. To facilitate the data curation process and enhance the productivity of researchers and developers, we identify and implement a set of basic data curation APIs and make them available as services to assist researchers and developers in transforming their raw data into curated data. The curation APIs enable developers to easily add features to their data applications, such as: extracting keywords, parts of speech, and named entities such as Persons, Locations, Organizations, Companies, Products, Diseases, and Drugs; providing synonyms and stems for extracted information items, leveraging lexical knowledge bases for the English language such as WordNet; linking extracted entities to external knowledge bases such as Google Knowledge Graph and Wikidata; discovering similarity among extracted information items, such as calculating the similarity between strings and numbers; classifying, sorting, and categorizing data into various types, forms, or other distinct classes; and indexing structured and unstructured data. These services can be accessed via a REST API, and the data is returned as JSON that can be integrated into data applications. The curation APIs are available as an open-source project on GitHub.
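Invoking such a service would follow the usual REST/JSON pattern. In the sketch below, the endpoint URL, route, and response fields are hypothetical stand-ins, since the abstract names the API style but not concrete routes; only the general requests/JSON pattern is meant literally.

```python
import requests

BASE_URL = "http://localhost:8080"  # hypothetical deployment of the APIs

resp = requests.post(
    f"{BASE_URL}/extract/named-entities",  # hypothetical route
    json={"text": "Apple opened a new office in Sydney."},
    timeout=10,
)
resp.raise_for_status()
for entity in resp.json().get("entities", []):  # hypothetical response shape
    print(entity.get("text"), entity.get("type"))
```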
Computers & Security | 2014
Mohammad Allahbakhsh; Aleksandar Ignjatovic; Boualem Benatallah; Seyed-Mehdi-Reza Beheshti; Norman Y. Foo; Elisa Bertino
Social rating systems are subject to unfair evaluations. Users may try, individually or collaboratively, to promote or demote a product. Detecting unfair evaluations, mainly massive collusive attacks as well as honest-looking intelligent attacks, is still a real challenge for collusion detection systems. In this paper, we study the impact of unfair evaluations in online rating systems. First, we study individual unfair evaluations and their impact on the reputation of people as calculated by social rating systems. We then propose a method for detecting collaborative unfair evaluations, also known as collusion. The proposed model uses a frequent itemset mining technique to detect candidate collusion groups and sub-groups. We use several indicators to identify collusion groups and to estimate how destructive such colluding groups can be. The approaches presented in this paper have been implemented in prototype tools and experimentally validated on synthetic and real-world datasets.
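One simple, invented stand-in for the "how destructive is this group?" indicators (not one of the paper's actual formulas) is the gap a suspected group opens between its mean rating and everyone else's:

```python
def deviation_indicator(group_ratings, other_ratings):
    """Absolute gap between the group's mean rating and everyone else's."""
    mean = lambda xs: sum(xs) / len(xs)
    return abs(mean(group_ratings) - mean(other_ratings))

# Suspected group gives uniform 5-star ratings; other users average ~2.3.
group = [5, 5, 5, 5]
others = [2, 3, 2, 2, 3, 2]
print(deviation_indicator(group, others))  # ~2.67 -- a large, suspicious gap
```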