Mohammad Sadoghi | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Mohammad Sadoghi is active.

Explore More

Publication

Featured researches published by Mohammad Sadoghi.

very large data bases | 2012

Solving big data challenges for enterprise application performance management

Tilmann Rabl; Sergio Gómez-Villamor; Mohammad Sadoghi; Victor Muntés-Mulero; Hans-Arno Jacobsen; Serge Mankovskii

As the complexity of enterprise systems increases, the need for monitoring and analyzing such systems also grows. A number of companies have built sophisticated monitoring tools that go far beyond simple resource utilization reports. For example, based on instrumentation and specialized APIs, it is now possible to monitor single method invocations and trace individual transactions across geographically distributed systems. This high-level of detail enables more precise forms of analysis and prediction but comes at the price of high data rates (i.e., big data). To maximize the benefit of data monitoring, the data has to be stored for an extended period of time for ulterior analysis. This new wave of big data analytics imposes new challenges especially for the application performance monitoring systems. The monitoring data has to be stored in a system that can sustain the high data rates and at the same time enable an up-to-date view of the underlying infrastructure. With the advent of modern key-value stores, a variety of data storage systems have emerged that are built with a focus on scalability and high data rates as predominant in this monitoring use case. In this work, we present our experience and a comprehensive performance evaluation of six modern (open-source) data stores in the context of application performance monitoring as part of CA Technologies initiative. We evaluated these systems with data and workloads that can be found in application performance monitoring, as well as, on-line advertisement, power monitoring, and many other use cases. We present our insights not only as performance results but also as lessons learned and our experience relating to the setup and configuration complexity of these data stores in an industry setting.

international conference on management of data | 2007

Benchmarking declarative approximate selection predicates

Amit Chandel; Oktie Hassanzadeh; Nick Koudas; Mohammad Sadoghi; Divesh Srivastava

Declarative data quality has been an active research topic. The fundamental principle behind a declarative approach to data quality is the use of declarative statements to realize data quality primitives on top of any relational data source. A primary advantage of such an approach is the ease of use and integration with existing applications. Over the last few years several similarity predicates have been proposed for common quality primitives (approximate selections, joins, etc) and have been fully expressed using declarative SQL statements. In this paper we propose new similarity predicates along with their declarative realization, based on notions of probabilistic information retrieval. In particular we show how language models and hidden Markov models can be utilized as similarity predicates for data quality and present their full declarative instantiation. We also show how other scoring methods from information retrieval, can be utilized in a similar setting. We then present full declarative specifications of previously proposed similarity predicates in the literature, grouping them into classes according to their primary characteristics. Finally, we present a thorough performance and accuracy study comparing a large number of similarity predicates for data cleaning operations. We quantify both their runtime performance as well as their accuracy for several types of common quality problems encountered in operational databases.

very large data bases | 2010

Efficient event processing through reconfigurable hardware for algorithmic trading

Mohammad Sadoghi; Martin Labrecque; Harsh Singh; Warren Shum; Hans-Arno Jacobsen

In this demo, we present fpga-ToPSS (Toronto Publish/Subscribe System Family), an efficient event processing platform for high-frequency and low-latency algorithmic trading. Our event processing platform is built over reconfigurable hardware---FPGAs---to achieve line-rate processing. Furthermore, our event processing engine supports Boolean expression matching with an expressive predicate language that models complex financial strategies to autonomously buy and sell stocks based on real-time financial data.

international conference on management of data | 2011

BE-tree: an index structure to efficiently match boolean expressions over high-dimensional discrete space

Mohammad Sadoghi; Hans-Arno Jacobsen

BE-Tree is a novel dynamic tree data structure designed to efficiently index Boolean expressions over a high-dimensional discrete space. BE-Tree copes with both high-dimensionality and expressiveness of Boolean expressions by introducing a novel two-phase space-cutting technique that specifically utilizes the discrete and finite domain properties of the space. Furthermore, BE-Tree employs self-adjustment policies to dynamically adapt the tree as the workload changes. We conduct a comprehensive evaluation to demonstrate the superiority of BE-Tree in comparison with state-of-the-art index structures designed for matching Boolean expressions.

international conference on data engineering | 2012

Multi-query Stream Processing on FPGAs

Mohammad Sadoghi; Rija Javed; Naif Tarafdar; Harsh Singh; Rohan Palaniappan; Hans-Arno Jacobsen

We present an efficient multi-query event stream platform to support query processing over high-frequency event streams. Our platform is built over reconfigurable hardware -- FPGAs -- to achieve line-rate multi-query processing by exploiting unprecedented degrees of parallelism and potential for pipelining, only available through custom-built, application-specific and low-level logic design. Moreover, a multi-query event stream processing engine is at the core of a wide range of applications including real-time data analytics, algorithmic trading, targeted advertisement, and (complex) event processing.

data management on new hardware | 2011

Towards highly parallel event processing through reconfigurable hardware

Mohammad Sadoghi; Harsh Singh; Hans-Arno Jacobsen

We present fpga-ToPSS (Toronto Publish/Subscribe System), an efficient event processing platform to support high-frequency and low-latency event matching. fpga-ToPSS is built over reconfigurable hardware---FPGAs---to achieve line-rate processing by exploring various degrees of parallelism. Furthermore, each of our proposed FPGA-based designs is geared towards a unique application requirement, such as flexibility, adaptability, scalability, or pure performance, such that each solution is specifically optimized to attain a high level of parallelism. Therefore, each solution is formulated as a design trade-off between the degree of parallelism versus the desired application requirement. Moreover, our event processing engine supports Boolean expression matching with an expressive predicate language applicable to a wide range of applications including real-time data analysis, algorithmic trading, targeted advertisement, and (complex) event processing.

very large data bases | 2013

Flexible query processor on FPGAs

Mohammadreza Najafi; Mohammad Sadoghi; Hans-Arno Jacobsen

In this work, we demonstrate Flexible Query Processor (FQP), an online reconfigurable event stream query processor. FQP is an FPGA-based query processor that supports select, project and join queries over event streams at line rate. While processing incoming events, FQP can accept new query expressions, a key distinguishing characteristic from related approaches employing FPGAs for acceleration. Our solution aims to address performance limitations experienced with general purpose processors needing to operate at line rate and lack of on the fly reconfigurability with custom designed hardware solutions on FPGAs.

international conference on data engineering | 2012

Relevance Matters: Capitalizing on Less (Top-k Matching in Publish/Subscribe)

Mohammad Sadoghi; Hans-Arno Jacobsen

The efficient processing of large collections of Boolean expressions plays a central role in major data intensive applications ranging from user-centric processing and personalization to real-time data analysis. Emerging applications such as computational advertising and selective information dissemination demand determining and presenting to an end-user only the most relevant content that is both user-consumable and suitable for limited screen real estate of target devices. To retrieve the most relevant content, we present BE*-Tree, a novel indexing data structure designed for effective hierarchical top-k pattern matching, which as its by-product also reduces the operational cost of processing millions of patterns. To further reduce processing cost, BE*-Tree employs an adaptive and non-rigid space-cutting technique designed to efficiently index Boolean expressions over a high-dimensional continuous space. At the core of BE*-Tree lie two innovative ideas: (1) a bi-directional tree expansion build as a top-down (data and space clustering) and a bottom-up growths (space clustering), which together enable indexing only non-empty continuous sub-spaces, and (2) an overlap-free splitting strategy. Finally, the performance of BE*-Tree is proven through a comprehensive experimental comparison against state-of-the-art index structures for matching Boolean expressions.

very large data bases | 2013

Making updates disk-I/O friendly using SSDs

Mohammad Sadoghi; Kenneth A. Ross; Mustafa Canim; Bishwaranjan Bhattacharjee

Multiversion databases store both current and historical data. Rows are typically annotated with timestamps representing the period when the row is/was valid. We develop novel techniques for reducing index maintenance in multiversion databases, so that indexes can be used effectively for analytical queries over current data without being a heavy burden on transaction throughput. To achieve this end, we re-design persistent index data structures in the storage hierarchy to employ an extra level of indirection. The indirection level is stored on solid state disks that can support very fast random I/Os, so that traversing the extra level of indirection incurs a relatively small overhead. The extra level of indirection dramatically reduces the number of magnetic disk I/Os that are needed for index updates, and localizes maintenance to indexes on updated attributes. Further, we batch insertions within the indirection layer in order to reduce physical disk I/Os for indexing new records. By reducing the index maintenance overhead on transactions, we enable operational data stores to create more indexes to support queries. We have developed a prototype of our indirection proposal by extending the widely used Generalized Search Tree (GiST) open-source project, which is also employed in PostgreSQL. Our working implementation demonstrates that we can significantly reduce index maintenance and/or query processing cost, by a factor of 3. For insertions of new records, our novel batching technique can save up to 90% of the insertion time.

ACM Transactions on Database Systems | 2013

Analysis and optimization for boolean expression indexing

Mohammad Sadoghi; Hans-Arno Jacobsen

BE-Tree is a novel dynamic data structure designed to efficiently index Boolean expressions over a high-dimensional discrete space. BE Tree-copes with both high-dimensionality and expressiveness of Boolean expressions by introducing an effective two-phase space-cutting technique that specifically utilizes the discrete and finite domain properties of the space. Furthermore, BE-Tree employs self-adjustment policies to dynamically adapt the tree as the workload changes. Moreover, in BE-Tree, we develop two novel cache-conscious predicate evaluation techniques, namely, lazy and bitmap evaluations, that also exploit the underlying discrete and finite space to substantially reduce BE-Trees matching time by up to 75% BE-Tree is a general index structure for matching Boolean expression which has a wide range of applications including (complex) event processing, publish/subscribe matching, emerging applications in cospaces, profile matching for targeted web advertising, and approximate string matching. Finally, the superiority of BE-Tree is proven through a comprehensive evaluation with state-of-the-art index structures designed for matching Boolean expressions.

Explore More