Abdullah Muzahid

University of Texas at San Antonio

Publication

Featured research published by Abdullah Muzahid.

international symposium on computer architecture | 2009

SigRace: signature-based data race detection

Abdullah Muzahid; Dario Suarez; Shanxiang Qi; Josep Torrellas

Detecting data races in parallel programs is important for both software development and production-run diagnosis. Recently, there have been several proposals for hardware-assisted data race detection. Such proposals typically modify the L1 cache and cache coherence protocol messages, and largely lose their capability when lines get displaced or invalidated from the cache. To eliminate these shortcomings, this paper proposes a novel, different approach to hardware-assisted data race detection. The approach, called SigRace, relies on hardware address signatures. As a processor runs, the addresses of the data that it accesses are automatically encoded in signatures. At certain times, the signatures are automatically passed to a hardware module that intersects them with those of other processors. If the intersection is not null, a data race may have occurred. This paper presents the architecture of SigRace, an implementation, and its software interface. With SigRace, caches and coherence protocol messages are unmodified. Moreover, cache lines can be displaced and invalidated with no effect. Our experiments show that SigRace is significantly more effective than a state-of-the-art conventional hardware-assisted race detector. SigRace finds on average 29% more static races and 107% more dynamic races. Moreover, if we inject data races, SigRace finds 150% more static races than the conventional scheme.
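
The signature mechanism described above can be illustrated with a minimal sketch. It is not the SigRace hardware design: the 256-bit width, the 64-byte line size, the modulo hash, and the names `AccessSignature` and `may_race` are all assumptions made for illustration. The split into read and write signatures is likewise an assumption, based on the fact that a data race needs at least one write.

```python
SIG_BITS = 256   # hypothetical signature width in bits
LINE_SHIFT = 6   # assumption: 64-byte cache lines, so hash on the line address


def sig_bit(addr):
    """Map an address to a single bit of the signature (toy hash)."""
    return 1 << ((addr >> LINE_SHIFT) % SIG_BITS)


class AccessSignature:
    """Per-processor summary of the addresses read and written."""

    def __init__(self):
        self.reads = 0
        self.writes = 0

    def record(self, addr, is_write):
        if is_write:
            self.writes |= sig_bit(addr)
        else:
            self.reads |= sig_bit(addr)


def may_race(a, b):
    """True if the signatures overlap and at least one side wrote.

    A non-empty intersection only means a race *may* have occurred,
    because different addresses can hash to the same bit.
    """
    return bool((a.writes & (b.reads | b.writes)) | (a.reads & b.writes))


p0 = AccessSignature()
p0.record(0x1000, is_write=True)

same_addr = AccessSignature()
same_addr.record(0x1000, is_write=False)

other_addr = AccessSignature()
other_addr.record(0x2000, is_write=False)

aliased_addr = AccessSignature()
aliased_addr.record(0x5000, is_write=False)  # hashes to the same bit as 0x1000

print(may_race(p0, same_addr))     # True: real overlap
print(may_race(p0, other_addr))    # False: disjoint signatures
print(may_race(p0, aliased_addr))  # True: false positive from aliasing
```

Because the summary lives outside the cache, nothing in this sketch depends on whether a line is still cached, which mirrors the abstract's point that displacements and invalidations have no effect.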

international symposium on microarchitecture | 2010

AtomTracker: A Comprehensive Approach to Atomic Region Inference and Violation Detection

Abdullah Muzahid; Norimasa Otsuki; Josep Torrellas

Atomicity violations are a particularly insidious type of concurrency bug. While there has been substantial work on automatic detection of atomicity violations, each existing technique has focused on a certain type of atomic region. To address this limitation, this paper presents AtomTracker, a comprehensive approach to atomic region inference and violation detection. AtomTracker is the first scheme to (1) automatically infer generic atomic regions (not limited by issues such as the number of variables accessed, the number of instructions included, or the type of code construct the region is embedded in) and (2) automatically detect violations of them at runtime with negligible execution overhead. AtomTracker provides novel algorithms to infer generic atomic regions and to detect atomicity violations of them. Moreover, we present a hardware implementation of the violation detection algorithm that leverages cache coherence state transitions in a multiprocessor. In our evaluation, we take eight atomicity violation bugs from real-world codes like Apache, MySQL, and Mozilla, and show that AtomTracker detects them all. In addition, AtomTracker automatically infers all of the atomic regions in a set of microbenchmarks accurately. Finally, we also show that the hardware implementation induces a negligible execution time overhead of 0.2–4.0% and, therefore, enables AtomTracker to find atomicity violations on-the-fly in production runs.

international symposium on microarchitecture | 2012

Vulcan: Hardware Support for Detecting Sequential Consistency Violations Dynamically

Abdullah Muzahid; Shanxiang Qi; Josep Torrellas

Past work has focused on detecting data races as proxies for Sequential Consistency (SC) violations. However, most data races do not violate SC. In addition, lock-free data structures and synchronization libraries sometimes explicitly employ data races but rely on SC semantics for correctness. Consequently, to uncover SC violations, we need to develop a more precise technique. This paper presents Vulcan, the first hardware scheme to precisely detect SC violations at runtime, in programs running on a relaxed-consistency machine. The scheme leverages cache coherence protocol transactions to dynamically detect cycles in memory access orders across threads. When one such cycle is about to occur, an exception is triggered. For the conditions considered in this paper and with enough hardware, Vulcan suffers neither false positives nor false negatives. In addition, Vulcan induces negligible execution overhead, requires no help from the software, and only takes as input the program executable. Experimental results show that Vulcan detects three new SC violation bugs in the Pthread and Crypt libraries, and in the fmm code from SPLASH-2. Moreover, Vulcan's negligible execution overhead makes it suitable for on-the-fly use.
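
To make the notion of a cycle in memory access orders concrete, here is a minimal software sketch. It is not Vulcan's hardware mechanism: it simply builds a graph from program-order edges and observed cross-thread dependence edges and checks it for a cycle. The store-buffering example, the node labels, and the function name `has_cycle` are assumptions chosen for illustration.

```python
def has_cycle(edges):
    """Detect a cycle in a directed graph given as (src, dst) pairs."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:          # back edge closes a cycle
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# Store-buffering pattern, with x and y initially 0:
#   T0: x = 1; r0 = y        T1: y = 1; r1 = x
program_order = [("T0:W x", "T0:R y"), ("T1:W y", "T1:R x")]

# Both reads return 0, so each read is ordered before the other thread's write.
both_read_zero = [("T0:R y", "T1:W y"), ("T1:R x", "T0:W x")]

# T0 runs to completion before T1: r0 == 0 and r1 == 1.
t0_then_t1 = [("T0:R y", "T1:W y"), ("T0:W x", "T1:R x")]

print(has_cycle(program_order + both_read_zero))  # True: not explainable under SC
print(has_cycle(program_order + t0_then_t1))      # False: a valid SC interleaving
```

The first outcome is the one a relaxed-consistency machine can produce when stores are buffered, which is why a cycle serves as the signal for an SC violation.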

high-performance computer architecture | 2014

Dynamically detecting and tolerating IF-Condition Data Races

Shanxiang Qi; Abdullah Muzahid; Wonsun Ahn; Josep Torrellas

An IF-Condition Invariance Violation (ICIV) occurs when, after a thread has computed the control expression of an IF statement and while it is executing the THEN or ELSE clauses, another thread updates variables in the IF's control expression. An ICIV can be easily detected, and is likely to be a sign of a concurrency bug in the code. Typically, the ICIV is caused by a data race, which we call IF-Condition Data Race (ICR). In this paper, we analyze the data races reported in the bug databases of popular software systems and show that ICRs occur relatively often. Then, we present two techniques to handle ICRs dynamically. They rely on simple code transformations and, in one case, additional hardware help. One of them (SW-IF) detects the races, while the other (HW-IF) detects and prevents them. We evaluate SW-IF and HW-IF using a variety of applications. We show that these new techniques are effective at finding new data race bugs and run with low overhead. Specifically, HW-IF finds 5 new (unreported) race bugs and SW-IF finds 3 of them. In addition, 8-threaded executions of SPLASH-2 codes show that, on average, SW-IF adds 2% execution overhead, while HW-IF adds less than 1%.
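
The ICIV definition can be illustrated with a small single-threaded simulation. This is not the paper's SW-IF transformation, whose details the abstract does not give; it only shows the invariant being checked. The helper `run_if_with_check` is hypothetical, and the sketch assumes the clause itself never writes the condition variables, so any change is attributed to another thread.

```python
def run_if_with_check(shared, cond_vars, cond, then_body, else_body=None):
    """Run an IF statement over `shared` and report an invariance violation.

    Returns True if any variable named in `cond_vars` changed between the
    evaluation of the condition and the end of the executed clause.
    """
    snapshot = {name: shared[name] for name in cond_vars}
    if cond(shared):
        then_body(shared)
    elif else_body is not None:
        else_body(shared)
    return any(shared[name] != snapshot[name] for name in cond_vars)


def is_set(state):
    return state["ptr"] is not None


def use_ptr(state):
    state["uses"] += 1


def use_ptr_with_interference(state):
    state["ptr"] = None   # stands in for a racing write by another thread
    state["uses"] += 1


quiet = {"ptr": "buffer", "uses": 0}
racy = {"ptr": "buffer", "uses": 0}

print(run_if_with_check(quiet, ["ptr"], is_set, use_ptr))                   # False
print(run_if_with_check(racy, ["ptr"], is_set, use_ptr_with_interference))  # True
```

The second call models the check-then-act pattern in which the condition stops holding while the THEN clause is still running.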

european conference on computer systems | 2017

SyncPerf: Categorizing, Detecting, and Diagnosing Synchronization Performance Bugs

Mohammad Mejbah ul Alam; Tongping Liu; Guangming Zeng; Abdullah Muzahid

Despite their obvious importance, performance issues related to synchronization primitives still lack adequate attention. No literature extensively investigates the categories, root causes, and fixing strategies of such performance issues. Existing work primarily focuses on one type of problem while ignoring other important categories. Moreover, it leaves the burden of identifying root causes to programmers. This paper first conducts an extensive study of categories, root causes, and fixing strategies of performance issues related to explicit synchronization primitives. Based on this study, we develop two tools to identify root causes of a range of performance issues. Compared with existing work, our proposal, SyncPerf, has three unique advantages. First, SyncPerf's detection is very lightweight, with 2.3% performance overhead on average. Second, SyncPerf integrates information based on callsites, lock variables, and types of threads. Such integration helps identify more latent problems. Last but not least, when multiple root causes generate the same behavior, SyncPerf provides a second analysis tool that collects detailed accesses inside critical sections and helps identify possible root causes. SyncPerf discovers many unknown but significant synchronization performance issues. Fixing them provides a performance gain anywhere from 2.5% to 42%. Low overhead, better coverage, and informative reports make SyncPerf an effective tool to find synchronization performance bugs in the production environment.

international symposium on software reliability engineering | 2016

Approximate Lock: Trading off Accuracy for Performance by Skipping Critical Sections

Riad Akram; Mohammad Mejbah ul Alam; Abdullah Muzahid

Approximate computing is gaining a lot of traction due to its potential for improving performance and consequently, energy efficiency. This project explores the potential for approximating locks. We start out with the observation that many applications can tolerate occasional skipping of computations done inside a critical section protected by a lock. This means that for certain critical sections, when the enclosed computation is occasionally skipped, the application suffers from quality degradation in the final outcome but it never crashes/deadlocks. To exploit this opportunity, we propose Approximate Lock (ALock). The thread executing ALock checks if a certain condition (e.g., high contention, long waiting time) is met and if so, the thread returns without acquiring the lock. We modify some selected critical sections using ALock so that those sections are skipped when ALock returns without acquiring the lock. We experimented with 14 programs from PARSEC, SPLASH2, and STAMP benchmarks. We found a total of 37 locks that can be transformed into ALock. ALock provides performance improvement for 10 applications, ranging from 1.8% to 164.4%, with at least 80% accuracy.
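
The skip-on-contention idea can be sketched with a bounded wait on an ordinary lock. This is not the paper's ALock implementation: the class `ApproximateLock`, its `max_wait_s` parameter, and the use of a waiting-time limit as the skip condition are assumptions for illustration.

```python
import threading


class ApproximateLock:
    """Lock whose acquire gives up after a bounded wait (hypothetical sketch)."""

    def __init__(self, max_wait_s=0.01):
        self._lock = threading.Lock()
        self.max_wait_s = max_wait_s

    def try_enter(self):
        """Return True if the lock was acquired, False if the caller should skip."""
        return self._lock.acquire(timeout=self.max_wait_s)

    def leave(self):
        self._lock.release()


alock = ApproximateLock(max_wait_s=0.01)
counter = [0]


def update():
    """Run the critical section only if the lock is obtained in time."""
    if not alock.try_enter():
        return False          # skipped: the result loses some accuracy
    try:
        counter[0] += 1
    finally:
        alock.leave()
    return True


# Simulate contention by holding the lock while update() is called.
alock.try_enter()
skipped_result = update()     # times out and skips the critical section
alock.leave()
normal_result = update()      # lock is free, so the update runs

print(skipped_result, normal_result, counter[0])
```

Only critical sections whose work is safe to drop are candidates for this treatment, which is why the abstract limits the transformation to selected locks.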

international symposium on software reliability engineering | 2016

Detecting, Exposing, and Classifying Sequential Consistency Violations

Mohammad Majharul Islam; Abdullah Muzahid

Sequential Consistency (SC) is the most intuitive memory model for parallel programs. However, modern architectures aggressively reorder and overlap memory accesses, causing SC violations. An SC violation is virtually always a bug. Most prior schemes either search the entire state space of a program or use a constraint solver to find SC violations. A promising recent scheme uses an active testing technique but fails to be effective for SC violations involving a larger number of threads and variables, and larger codebases. We propose Orion, the first active testing technique that can detect, expose, and classify arbitrary SC violations in any program. Orion works in two phases. In the first phase, it finds potential SC violation cycles by focusing on racing accesses. In the second phase, it exposes each SC violation cycle by enforcing the exact scheduling order. We present a detailed design of Orion in the paper. We tested different concurrent algorithms, bug kernels, SPLASH2, PARSEC applications, and an open source program, Apache. We experimented with TSO and PSO memory models. We detected and exposed 60 SC violations, of which 15 involve more than two processors and variables. Orion exposes SC violations quickly and with high probability. Compared to a state-of-the-art active testing technique, it has a much better SC violation detection ability.

international symposium on computer architecture | 2016

Production-run software failure diagnosis via adaptive communication tracking

Mohammad Mejbah ul Alam; Abdullah Muzahid

Software failure diagnosis techniques work either by sampling some events at production-run time or by using some bug detection algorithms. Some of the techniques require the failure to be reproduced multiple times. Those that do not are not adaptive enough when the execution platform, environment, or code changes. We propose ACT, a diagnosis technique for production-run failures that uses the machine intelligence of neural hardware. ACT learns some invariants (e.g., data communication invariants) on-the-fly using the neural hardware and records any potential violation of them. Since ACT can learn invariants on-the-fly, it can adapt to any change in execution setting or code. Since it records only the potentially violated invariants, the postprocessing phase can pinpoint the root cause fairly accurately without needing to observe the failure again. ACT works seamlessly for many sequential and concurrency bugs. The paper provides a detailed design and implementation of ACT in a typical multiprocessor system. It uses a three-stage pipeline for partially configurable one-hidden-layer neural networks. We have evaluated ACT on a variety of programs from popular benchmarks as well as open source programs. ACT diagnoses failures caused by 16 bugs from these programs with accurate ranking. Compared to existing learning and sampling based approaches, ACT has better diagnostic ability. For the default configuration, ACT has an average execution overhead of 8.2%.

ieee international symposium on workload characterization | 2017

Approximeter: Automatically finding and quantifying code sections for approximation

Riad Akram; Abdullah Muzahid

Approximate computing is gaining a lot of traction, especially for its potential to improve the power, performance, and scalability of a computing system. However, prior work relies heavily on a programmer to identify code sections where various approximation techniques can be applied. Such an approach is error-prone and cannot scale well beyond small applications. In this paper, we contribute a tool, called Approximeter, that automatically identifies and quantifies code sections where approximation can be used and to what extent. The tool works by first identifying potentially approximable functions and then injecting errors at appropriate locations. The tool runs Monte Carlo experiments to quantify the statistical relation between injected error and the corresponding output accuracy. The tool also provides a rough estimate of the potential performance gain from approximating a certain function. Finally, it ranks the approximable functions based on their error tolerance and performance gain.
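
The Monte Carlo step can be sketched as follows. This is not Approximeter's implementation: the error model (a uniform relative perturbation of a function's return value), the toy application, the accuracy metric, and the names `noisy`, `application`, and `mean_accuracy` are all assumptions for illustration.

```python
import random
import statistics


def exact_norm(vector):
    """Candidate function for approximation: Euclidean norm."""
    return sum(x * x for x in vector) ** 0.5


def noisy(func, rel_error, rng):
    """Wrap `func` so each result is perturbed by up to +/- rel_error."""
    def wrapper(*args):
        return func(*args) * (1.0 + rng.uniform(-rel_error, rel_error))
    return wrapper


def application(norm_fn, vectors):
    """Toy application whose output quality is measured: the mean norm."""
    return sum(norm_fn(v) for v in vectors) / len(vectors)


def mean_accuracy(rel_error, trials=200, seed=0):
    """Average output accuracy over repeated runs with injected error."""
    rng = random.Random(seed)
    vectors = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(50)]
    reference = application(exact_norm, vectors)
    accuracies = []
    for _ in range(trials):
        approx = application(noisy(exact_norm, rel_error, rng), vectors)
        accuracies.append(1.0 - abs(approx - reference) / abs(reference))
    return statistics.mean(accuracies)


for rel_error in (0.0, 0.05, 0.5):
    print(rel_error, mean_accuracy(rel_error))
```

Repeating the measurement for each candidate function and each error level yields the kind of error-versus-accuracy relation that a ranking by error tolerance would be based on.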

international conference on algorithms and architectures for parallel processing | 2016

Hardware-based sequential consistency violation detection made simpler

Mohammad Majharul Islam; Riad Akram; Abdullah Muzahid

Sequential Consistency (SC) is the most intuitive memory model for parallel programs. However, modern architectures aggressively reorder and overlap memory accesses, causing SC violations (SCVs). An SCV is practically always a bug. This paper proposes Dissector, a combined hardware-software approach to detect SCVs in a conventional TSO machine. Dissector hardware works by piggybacking information about pending stores on cache coherence messages. Later, it detects if any of those pending stores can cause an SCV cycle. Dissector keeps hardware modifications minimal and simple by sacrificing some degree of detection accuracy. Dissector recovers the loss in detection accuracy by using postprocessing software that filters out false positives and extracts detailed debugging information. Dissector hardware is lightweight, keeps the cache coherence protocol clean, does not generate any extra messages, and is unaffected by branch mispredictions. Moreover, due to the postprocessing phase, Dissector does not suffer from false positives. This paper presents a detailed design and implementation of Dissector in a conventional TSO machine. Our experiments with different concurrent algorithms, bug kernels, and SPLASH2 and PARSEC applications show that Dissector has a better SCV detection ability than a state-of-the-art hardware-based approach with much less hardware. Dissector hardware induces a negligible execution overhead of 0.02%. Moreover, with more processors, the overhead remains virtually the same.
