Sagi Manole | Researchain

Publication

Featured research published by Sagi Manole.

Convention of Electrical and Electronics Engineers in Israel | 2010

Workload optimization of proteomics pattern matching using embedded accelerator

Sagi Manole; Amit Golander; Shlomo Weiss

Digitalization has brought tremendous momentum to healthcare research. Recognition of patterns in one protein that are similar to a functional site of another is crucial for identifying possible functions of newly discovered proteins, as well as for analyzing known proteins for previously undetermined activity. PROSITE is a comprehensive database that describes protein domains, families, and functional sites, along with the patterns used to identify them. In this paper, the workload is the task of locating patterns from the PROSITE database in proteins. We optimize the workload by using IBM's new Power Edge of Network processor (PowerEN) regular expression (RegX) hardware accelerator, which was built for deep-packet inspection at multiple 10 Gbps ports. Our preliminary results demonstrate a speedup of 240x relative to software pattern matching. Moreover, indications show that a speedup an order of magnitude higher is achievable.
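The software baseline mentioned in the abstract can be pictured as translating each PROSITE pattern into a regular expression and scanning protein sequences with it. The following is a minimal sketch under that assumption, not the authors' implementation: the function names `prosite_to_regex` and `find_matches` are hypothetical, and only the common parts of the PROSITE syntax are handled (`x` wildcards, `[...]` alternatives, `{...}` exclusions, `(n)` and `(n,m)` repetition, and `<` / `>` anchors outside brackets).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper for illustration; it covers only the common syntax.
    """
    pattern = pattern.rstrip(".")  # PROSITE patterns end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repetition count such as (3) or (2,4).
        repeat = ""
        if element.endswith(")"):
            element, _, count = element[:-1].partition("(")
            repeat = "{" + count + "}"
        if element == "x":  # any amino acid
            core = "."
        elif element.startswith("{"):  # any amino acid except these
            core = "[^" + element[1:-1] + "]"
        else:  # a single residue or a [ABC] alternative
            core = element
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_matches(pattern: str, sequence: str):
    """Return (start, matched_text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# N-glycosylation site pattern on a made-up sequence.
print(prosite_to_regex("N-{P}-[ST]-{P}."))          # N[^P][ST][^P]
print(find_matches("N-{P}-[ST]-{P}.", "AANGSAA"))   # [(2, 'NGSA')]
```

Note that `re.finditer` reports non-overlapping matches only, so a production scanner would need extra care for overlapping sites.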

ACM International Conference on Systems and Storage | 2017

Persistent memory over fabric (PMoF)

Amit Golander; Sagi Manole; Yigal Korman

Persistent Memory (PM) is an emerging family of technologies that are persistent, byte addressable, and respond at near-memory speeds. PM devices, also referred to as non-volatile DIMMs or NVDIMMs, connect to the low-latency CPU memory interconnect. PM-based solutions can achieve local persistency within a microsecond, which is two orders of magnitude faster than modern Flash solutions [1]. PM is the first storage medium that is faster than high-speed networks and faster than operating system thread scheduling time. Thus, current PM-based solutions are local. They do not comply with common enterprise practices, which require that data remain available even in the face of a given number of failures, such as an entire node crash.
This work focuses on replicating PM-resident data sets between nodes, which is vital for PM to become mainstream. We leverage RDMA-supporting network gear and the first application-agnostic PM-based file system that supports mirroring (Plexistor M1FS 3.0). The server on the left-hand side of Figure 1 is the application server running the benchmarks, and the server on the right-hand side runs PM-over-Fabric (PMoF) and owns the secondary copy of the data alongside the file system metadata required to mount the file system after a failure. The experimental setup uses commodity off-the-shelf hardware and a 100GbE (RoCE) network. We first explore a synthetic benchmark (FIO) and then a TPC-C-like benchmark (DBT-2) on top of a Postgres database. In both cases, we use working sets that fit in the PM tier, because we do not want tiering to mask the performance implications of mirroring.
Figure 2a shows the overall latency (as seen by the application) as a function of different stress levels. Three different access sizes were measured, in both single-threaded and multithreaded variants. Small accesses, including local persistency and asynchronous mirroring to the second node, were measured to complete within 1-2 microseconds under typical storage consumption. At very high loads, hardware resources become congested and latency soars. Figure 2b reveals results for similar benchmarks, with one important difference: write requests are synchronous (i.e., fopen with O_SYNC=1). These semantics mean that the file system must also guarantee that the data written and the metadata describing it have reached the PMoF node before acknowledging the write system call. Synchronous mirroring is nearly 2.5 µs slower than asynchronous mirroring for typical loads, mostly due to the round-trip delay. These results are an order of magnitude faster than modern block-based replication solutions.
Databases may support replication at the database layer as an alternative to maintaining data redundancy at the storage layer. Each approach has its advantages, but the rule of thumb for PostgreSQL, as presented at PGCon IL 2017, anticipates 50% lower transactions per second for having a secondary copy. Figures 3a and 3b reveal the negligible performance overhead that PMoF-based mirroring has on real-life applications. Compared to Postgres on a single-node deployment, transaction rate and response time were measured to be only 2.0 to 2.2% lower.
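To make the synchronous-write semantics in the abstract concrete, the sketch below opens a file with and without the POSIX `O_SYNC` flag and times a single write. It is an illustration on an ordinary local file system, assuming a POSIX platform; it does not involve PMoF, M1FS, or RDMA, and the helper name `timed_write` is hypothetical. Any timings it prints are not comparable to the figures reported above.

```python
import os
import tempfile
import time


def timed_write(path: str, payload: bytes, synchronous: bool):
    """Write payload once and return (bytes_written, elapsed_seconds).

    With synchronous=True the file is opened with O_SYNC, so os.write()
    returns only after the data has been handed to stable storage.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if synchronous:
        # O_SYNC is POSIX-only; fall back to 0 where it is unavailable
        # (assumption: the sketch should still run on such platforms).
        flags |= getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        written = os.write(fd, payload)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return written, elapsed


with tempfile.TemporaryDirectory() as workdir:
    block = b"\0" * 4096  # one small access
    for mode in (False, True):
        target = os.path.join(workdir, "sync.bin" if mode else "async.bin")
        n, seconds = timed_write(target, block, synchronous=mode)
        label = "synchronous" if mode else "buffered"
        print(f"{label:12s} {n} bytes in {seconds * 1e6:.1f} us")
```

On a mirrored file system, the synchronous case would additionally have to wait for the remote copy to be acknowledged, which is where the round-trip delay described above enters.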

ACM International Conference on Systems and Storage | 2013

Leveraging predefined Huffman dictionaries for high compression rate and ratio

Amit Golander; Shai Taharlev; Lior Glass; Giora Biran; Sagi Manole

The explosion of data, both in motion and at rest, along with the popularity of cloud computing, has resulted in the need for better in-line compression solutions. Current Huffman coding modes are optimal for a single metric: compression ratio (quality) or rate (performance). The ratio-focused mode, for example, further compresses the data by 15% as compared to the rate-focused mode, but takes 20% longer to execute. In this paper, we show how to balance the tradeoff between compression ratio and rate without modifying existing standards and legacy decompression implementations. We present two Huffman encoding heuristics that achieve close-to-optimal compression rate and ratio (within 2%). Our proposed heuristics are practical: the first is better suited for hardware implementations, while the second is better suited for software implementations. We believe that such in-line compression heuristics can help enable the continued advancement of cloud computing and storage.

IEEE Transactions on Computers | 2014

Protein Sequence Pattern Matching: Leveraging Application Specific Hardware Accelerators

Sagi Manole; Amit Golander; Shlomo Weiss

Digitalization has brought a tremendous momentum to health care research. Recognition of patterns in proteins is crucial for identifying possible functions of newly discovered proteins, as well as analysis of known proteins for previously undetermined activity. In this paper, the workload consists of locating patterns from the PROSITE database in protein sequences. We optimize the pattern search task by using a new breed of processors that merge network and server attributes. We leverage massive multithreading and regular-expression (RegX) hardware accelerators; the latter were designed and built for an entirely different application - high-bandwidth deep-packet inspection. Our multithreading optimization achieves 18x improvement, but by harnessing a RegX accelerator we were able to further demonstrate a significant 392x improvement relative to software pattern matching. Moreover, performance per area and power consumption are improved by multiple orders of magnitude as well.

Data Compression Conference | 2013

High Compression Rate and Ratio Using Predefined Huffman Dictionaries

Amit Golander; Shai I. Tahar; Lior Glass; Giora Biran; Sagi Manole

Current Huffman coding modes are optimal for a single metric: compression ratio (quality) or rate (performance). We recognize that real-life data can usually be classified into families of data types, and thus the Huffman dictionary can be reused instead of recalculated. In this paper, we show how to balance the trade-off between compression ratio and rate without modifying existing standards and legacy decompression implementations.
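The idea of reusing a Huffman dictionary can be illustrated with a small experiment: build code lengths once from a representative sample of a data family, then compare the encoded size of new data under that predefined dictionary against a dictionary computed from the data itself. This is a sketch of the general trade-off only, not the paper's heuristics. The helper names are hypothetical, the add-one smoothing is an assumption made so that every byte value receives a code, and the length limits that formats such as DEFLATE impose on codes are ignored.

```python
import heapq
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a frequency table."""
    if len(freqs) == 1:
        return {symbol: 1 for symbol in freqs}
    # Heap entries: (weight, tie_breaker, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    lengths = {symbol: 0 for symbol in freqs}
    tie_breaker = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for symbol in left + right:
            lengths[symbol] += 1
        heapq.heappush(heap, (w1 + w2, tie_breaker, left + right))
        tie_breaker += 1
    return lengths


def predefined_dictionary(sample: bytes):
    """Build code lengths once from a sample; every byte value gets a code."""
    counts = Counter(sample)
    return huffman_code_lengths({b: counts.get(b, 0) + 1 for b in range(256)})


def encoded_bits(data: bytes, lengths) -> int:
    """Size of data in bits when encoded with the given code lengths."""
    return sum(lengths[b] for b in data)


# Made-up sample and payload from the same "family" (English-like text).
sample = b"the quick brown fox jumps over the lazy dog " * 20
payload = b"a lazy dog naps while the quick fox jumps over it"

reused = encoded_bits(payload, predefined_dictionary(sample))
recomputed = encoded_bits(payload, huffman_code_lengths(Counter(payload)))
print(f"predefined dictionary: {reused} bits")
print(f"per-input dictionary:  {recomputed} bits")
```

The per-input dictionary never produces more payload bits than the predefined one, but it costs an extra pass over the data and the dictionary itself must be transmitted; the predefined dictionary skips both, which is the rate side of the trade-off.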

Archive | 2017