Mohamed Mohamedin
Virginia Tech
Publication
Featured research published by Mohamed Mohamedin.
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2016
Mohamed Mohamedin; Roberto Palmieri; Sebastiano Peluso; Binoy Ravindran
NUMA architectures pose the challenge of rethinking parallel applications because of the non-homogeneity introduced by their design, and their real benefits are limited by the characteristics of the particular workload. We call partitionable transactional workloads those workloads that may be able to exploit the distributed nature of NUMA, such as transactional workloads where data and accesses can be easily partitioned among the so-called NUMA zones. However, when those workloads require synchronization on shared data, the concurrency control for their transactions must also exploit the NUMA architecture. In this paper we therefore present a NUMA-aware concurrency control for transactional memory, designed to promote scalability in scenarios where the transactional workload is prone to scale and the underlying memory model is inherently non-uniform, as in NUMA architectures.
ACM Symposium on Parallel Algorithms and Architectures | 2015
Mohamed Mohamedin; Roberto Palmieri; Ahmed Hassan; Binoy Ravindran
The first release of hardware transactional memory (HTM) in commodity processors posed the question of how to efficiently handle its best-effort nature. In this paper we present Part-HTM, the first hybrid transactional memory protocol that solves the problem of transactions aborted due to the resource limitations (space/time) of current best-effort HTM. The basic idea of Part-HTM is to partition those transactions into multiple sub-transactions, which are likely to commit in hardware. Due to the eager nature of HTM, we designed a low-overhead software framework to preserve transaction correctness (with and without opacity).
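The partitioning idea in the abstract above can be illustrated with a minimal Python sketch. It is not the Part-HTM protocol: the function name `partition_transaction`, the `capacity` parameter, and the rule of cutting a transaction whenever it would touch more than `capacity` distinct addresses are all assumptions made for illustration, standing in for the space limit of a best-effort HTM. The software framework that preserves correctness across sub-transactions is not modeled.

```python
def partition_transaction(accesses, capacity):
    """Split a sequence of memory accesses into sub-transactions.

    Each sub-transaction touches at most `capacity` distinct addresses,
    a hypothetical stand-in for the space limit of a best-effort HTM.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    parts, current, footprint = [], [], set()
    for address in accesses:
        # Start a new sub-transaction when a new address would exceed the limit.
        if address not in footprint and len(footprint) == capacity:
            parts.append(current)
            current, footprint = [], set()
        current.append(address)
        footprint.add(address)
    if current:
        parts.append(current)
    return parts


# A transaction touching five distinct addresses, with room for two per part.
sub_transactions = partition_transaction([1, 2, 1, 3, 4, 4, 5], capacity=2)
```

Repeated accesses to an address already in the current footprint do not trigger a cut, so the sketch splits on footprint size rather than on operation count.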
International Conference on Coordination Models and Languages | 2013
Mohamed Mohamedin; Binoy Ravindran; Roberto Palmieri
We present ByteSTM, a virtual machine-level Java STM implementation that is built by extending the Jikes RVM. We modify Jikes RVM’s optimizing compiler to transparently support implicit transactions. Because it is implemented at the VM level, ByteSTM accesses memory directly, avoids Java garbage collection overhead by manually managing memory for transactional metadata, and provides pluggable support for implementing different STM algorithms in the VM. Our experimental studies reveal throughput improvement over other non-VM STMs by 6–70% on micro-benchmarks and by 7–60% on macro-benchmarks.
International Parallel and Distributed Processing Symposium | 2014
Mohamed Mohamedin; Roberto Palmieri; Binoy Ravindran
Multicore architectures are becoming increasingly prone to soft errors, i.e., transient faults caused by external physical phenomena such as electric noise and cosmic particle strikes. With increasing core counts, the soft-error rate is growing due to the increasing transistor density on chips. The impact of these errors on business-critical applications that are being deployed on multicore hardware can be significant. We present an active replication-based approach that fully masks such errors for transactional applications. We partition computational cores, fully replicate objects across partitions, and concurrently execute transactional requests on all partitions, thereby enabling completely local object accesses. Transactional requests are globally ordered and delivered across partitions using optimistic atomic broadcast. Hardware message passing, an important emerging trend in multicore architectures, is exploited to mitigate communication costs. We report preliminary results obtained with an implementation of our approach on 36-core Tilera TILE-Gx hardware with an on-chip scalable mesh network.
International Conference on Distributed Computing Systems | 2015
Mohamed Mohamedin; Roberto Palmieri; Binoy Ravindran
Multicore architectures are becoming increasingly prone to transient faults. In this paper we briefly present Shield, a middleware that provides transactional applications with resiliency to faults that can happen at any time during the execution of a processor but do not cause any hardware interruption. Shield is inspired by the state machine replication approach, where computational resources are partitioned, the shared state is fully replicated, and requests are executed by all partitions in the same order. Shield embeds a set of algorithmic and system innovations to limit the overhead with respect to non-fault-tolerant solutions. They include a fast total order layer that lets application threads and computational nodes cooperate to deliver requests quickly.
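The state machine replication approach mentioned in the abstract above (partitioned resources, fully replicated state, requests executed by all partitions in the same order) can be illustrated with a small Python sketch. It is not Shield's implementation: the `Replica` class, the `run_replicated` function, the bank-transfer requests, and the majority comparison of per-replica results are assumptions made for illustration. The total order layer is reduced to a plain list that every replica consumes in the same order.

```python
from collections import Counter


class Replica:
    """One partition holding a full copy of the shared state (hypothetical)."""

    def __init__(self, initial_balances):
        self.balances = dict(initial_balances)

    def apply(self, request):
        """Execute one transfer request deterministically and return its result."""
        src, dst, amount = request
        if self.balances[src] < amount:
            return ("abort", src, dst)
        self.balances[src] -= amount
        self.balances[dst] += amount
        return ("commit", self.balances[src], self.balances[dst])


def run_replicated(replicas, ordered_requests):
    """Deliver every request to all replicas in the same order.

    Returns the majority result for each request; a replica whose result
    differs from the majority is assumed to have suffered a transient fault.
    """
    outcomes = []
    for request in ordered_requests:
        results = [replica.apply(request) for replica in replicas]
        majority, votes = Counter(results).most_common(1)[0]
        if votes <= len(replicas) // 2:
            raise RuntimeError("no majority: too many replicas disagree")
        outcomes.append(majority)
    return outcomes


initial = {"a": 100, "b": 50}
replicas = [Replica(initial) for _ in range(3)]
# Simulate a transient fault: one replica's copy of the state is corrupted.
replicas[1].balances["a"] = 7
outcomes = run_replicated(replicas, [("a", "b", 30), ("b", "a", 20)])
```

Because every replica starts from the same state and applies the same deterministic requests in the same order, the two healthy replicas agree on each result, and the corrupted replica is outvoted.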
ACM Symposium on Parallel Algorithms and Architectures | 2015
Mohamed Mohamedin; Roberto Palmieri; Binoy Ravindran
This paper discusses the issues faced when designing contention management policies that involve best-effort hardware transactions. We also present Octonauts, a solution for scheduling HTM transactions without relying on on-the-fly information. Octonauts learns the objects accessed by a hardware transaction while it runs and uses them in case of conflict. It also proposes an innovative scheme for optimizing the communication between transactions running in hardware and software.
International Conference on Parallel Processing | 2018
Mohamed Mohamedin; Sebastiano Peluso; Masoomeh Javidi Kishi; Ahmed Hassan; Roberto Palmieri
In this paper we present Nemo, a NUMA-aware Transactional Memory (TM) design and implementation optimized for promoting scalability in applications running on top of NUMA architectures. Nemo deploys a hybrid design where conflicting threads alternate the usage of single timestamps and vector clocks to identify inconsistent executions depending upon the source of conflict. We assessed the performance of Nemo by using both synthetic and well-known OLTP transactional workloads. Our approach offers improvements over the six state-of-the-art competitors we implemented.
IEEE Transactions on Parallel and Distributed Systems | 2017
Mohamed Mohamedin; Roberto Palmieri; Ahmed Hassan; Binoy Ravindran
The first release of hardware transactional memory (HTM) in commodity processors posed the question of how to efficiently handle its best-effort nature. In this paper we present Part-HTM, a hybrid transactional memory protocol that solves the problem of transactions aborted due to the resource limitations (space/time) of current best-effort HTM. The basic idea of Part-HTM is to partition those transactions into multiple sub-transactions, which are likely to commit in hardware. Due to the eager nature of HTM, we designed a low-overhead software framework to preserve transaction correctness (with and without opacity) and isolation. Part-HTM is effective: our evaluation study confirms that its performance is the best in all tested cases, except for those where HTM cannot be outperformed. Even in such workloads, Part-HTM still performs better than all other software and hybrid competitors.
Network Computing and Applications | 2014
Mohamed Mohamedin; Roberto Palmieri; Binoy Ravindran
Multicore architectures are becoming increasingly prone to transient faults and data corruption. Relying on a multicore architecture is the common solution for increasing the performance and scalability of core applications, including transactional applications. In this paper we present SoftX, a low-invasive protocol for supporting the execution of transactional applications, relying on speculative processing and dedicated committer threads. Upon starting a transaction, SoftX forks a number of threads that run the same transaction independently. The commit phase is handled by dedicated threads to reduce synchronization overhead. We conduct an evaluation study showing the performance obtained with the implementation of SoftX on a 48-core AMD machine, running the List, Bank and TPC-C benchmarks. Results reveal better performance than classical replication-based fault-tolerant systems and limited overhead with respect to non-fault-tolerant protocols. We ported SoftX to a message-passing architecture, Tilera TILE-Gx. Hardware message passing is an important emerging trend in multicore architectures. Our experiments on Tilera show that SoftX is still more efficient than replication.
USENIX Conference on Hot Topics in Parallelism | 2012
Mohamed M. Saad; Mohamed Mohamedin; Binoy Ravindran