Dawood Tariq
SRI International
Publications
Featured research published by Dawood Tariq.
International Middleware Conference | 2012
Ashish Gehani; Dawood Tariq
SPADE is an open source software infrastructure for data provenance collection and management. The underlying data model used throughout the system is graph-based, consisting of vertices and directed edges that are modeled after the node and relationship types described in the Open Provenance Model. The system has been designed to decouple the collection, storage, and querying of provenance metadata. At its core is a novel provenance kernel that mediates between the producers and consumers of provenance information, and handles the persistent storage of records. It operates as a service, peering with remote instances to enable distributed provenance queries. The provenance kernel on each host handles the buffering, filtering, and multiplexing of incoming metadata from multiple sources, including the operating system, applications, and manual curation. Provenance elements can be located locally with queries that use wildcard, fuzzy, proximity, range, and Boolean operators. Ancestor and descendant queries are transparently propagated across hosts until a terminating expression is satisfied, while distributed path queries are accelerated with provenance sketches.
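The OPM-style data model described above can be illustrated with a minimal sketch: typed vertices connected by directed dependency edges, with an ancestor query that follows edges transitively. The class and method names here are invented for illustration and are not SPADE's actual API.

```python
# Hypothetical sketch of an OPM-style provenance graph: typed vertices,
# directed dependency edges, and a transitive ancestor query.
from collections import defaultdict

class ProvenanceGraph:
    def __init__(self):
        self.vertices = {}                 # vertex id -> annotations
        self.parents = defaultdict(set)    # child id -> parent ids

    def add_vertex(self, vid, **annotations):
        self.vertices[vid] = annotations

    def add_edge(self, child, parent):     # e.g. wasGeneratedBy, used
        self.parents[child].add(parent)

    def ancestors(self, vid):
        """All vertices reachable by following edges upstream from vid."""
        seen, stack = set(), [vid]
        while stack:
            for p in self.parents[stack.pop()] - seen:
                seen.add(p)
                stack.append(p)
        return seen

g = ProvenanceGraph()
g.add_vertex("bash", type="Process")
g.add_vertex("log.txt", type="Artifact")
g.add_vertex("report.pdf", type="Artifact")
g.add_edge("log.txt", "bash")         # log.txt wasGeneratedBy bash
g.add_edge("report.pdf", "log.txt")   # report.pdf derived from log.txt
print(sorted(g.ancestors("report.pdf")))  # ['bash', 'log.txt']
```

In the real system, an ancestor query like this is transparently propagated to peer kernels when the traversal crosses a host boundary.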
ACM Symposium on Applied Computing | 2011
Dawood Tariq; Basim Baig; Ashish Gehani; Salman Mahmood; Rashid Tahir; Azeem Aqil; Fareed Zaffar
Identifying when anomalous activity is correlated in a distributed system is useful for a range of applications from intrusion detection to tracking quality of service. The more specific the logs, the more precise the analysis they allow. However, collecting detailed logs from across a distributed system can deluge the network fabric. We present an architecture that allows fine-grained auditing on individual hosts, space-efficient representation of anomalous activity that can be centrally correlated, and tracing anomalies back to individual files and processes in the system. A key contribution is the design of an anomaly-provenance bridge that allows opaque digests of anomalies to be mapped back to their associated provenance.
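The core of the bridge can be sketched as a table that maps an opaque, space-efficient digest of an anomalous event back to the provenance record kept on the originating host. The digest scheme and record fields below are assumptions for illustration, not the paper's exact design.

```python
# Hedged sketch of an anomaly-provenance bridge: hosts ship compact
# digests for central correlation; each host can later map a digest
# back to the files and processes in its local provenance records.
import hashlib
import json

bridge = {}  # digest -> provenance record (kept on the originating host)

def report_anomaly(record):
    """Return an opaque digest of the anomaly for central correlation."""
    payload = json.dumps(record, sort_keys=True).encode()
    digest = hashlib.sha256(payload).hexdigest()
    bridge[digest] = record
    return digest

def resolve(digest):
    """Map a centrally correlated digest back to local provenance."""
    return bridge.get(digest)

d = report_anomaly({"process": "httpd", "file": "/etc/passwd", "op": "read"})
print(resolve(d)["process"])  # httpd
```

Only the fixed-size digests traverse the network, which keeps central correlation cheap while the detailed provenance stays on each host.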
Archive | 2013
Tanu Malik; Ashish Gehani; Dawood Tariq; Fareed Zaffar
Users can determine the precise origins of their data by collecting detailed provenance records. However, auditing at a finer grain produces large amounts of metadata. To efficiently manage the collected provenance, several provenance management systems, including SPADE, record provenance on the hosts where it is generated. Distributed provenance raises the issue of efficient reconstruction during the query phase. Recursively querying provenance metadata or computing its transitive closure is known to have limited scalability and cannot be used for large provenance graphs. We present matrix filters, which are novel data structures for representing graph information, and demonstrate their utility for improving query efficiency with experiments on provenance metadata gathered while executing distributed workflow applications.
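The intuition behind a matrix filter can be sketched as a Bloom-filter-like row of bits per vertex that summarizes its ancestor set, so reachability can be tested without recursive traversal. The bit layout and hashing below are illustrative assumptions, not the paper's exact construction.

```python
# Illustrative sketch: each vertex carries a compact bit row folding in
# the hashed identities of all its ancestors. A membership test gives
# "definitely not an ancestor" or "possibly an ancestor" in O(1).
import hashlib

M = 64  # bits per row (kept small here; real filters are larger)

def _bits(vid):
    h = int(hashlib.sha256(vid.encode()).hexdigest(), 16)
    return (1 << (h % M)) | (1 << ((h >> 8) % M))  # two hash positions

def build_rows(parents):
    """parents: child id -> set of parent ids. Returns vertex -> bit row."""
    rows = {}
    def row(v):
        if v not in rows:
            r = 0
            for p in parents.get(v, ()):
                r |= _bits(p) | row(p)  # fold in parent and its ancestors
            rows[v] = r
        return rows[v]
    for v in parents:
        row(v)
    return rows

def maybe_ancestor(rows, anc, vid):
    """False means definitely not an ancestor; True means possibly."""
    b = _bits(anc)
    return rows.get(vid, 0) & b == b

parents = {"c": {"b"}, "b": {"a"}, "a": set()}
rows = build_rows(parents)
print(maybe_ancestor(rows, "a", "c"))  # True
```

As with a Bloom filter, false positives are possible but false negatives are not, so the filter safely prunes recursive queries.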
International Conference on Security and Privacy in Communication Systems | 2015
Chao Yang; Guangliang Yang; Ashish Gehani; Vinod Yegneswaran; Dawood Tariq; Guofei Gu
We propose Dagger, a lightweight system to dynamically vet sensitive behaviors in Android apps. Dagger avoids costly instrumentation of virtual machines or modifications to the Android kernel. Instead, Dagger reconstructs the program semantics by tracking provenance relationships and observing apps’ runtime interactions with the phone platform. More specifically, Dagger uses three types of low-level execution information at runtime: system calls, Android Binder transactions, and app process details. System call collection is performed via Strace [7], a low-latency utility for Linux and other Unix-like systems. Binder transactions are recorded by accessing Binder module logs via sysfs [8]. App process details are extracted from the Android /proc file system [6]. A data provenance graph is then built to record the interactions between the app and the phone system based on these three types of information. Dagger identifies behaviors by matching the provenance graph with the behavior graph patterns that are previously extracted from the internal working logic of the Android framework. We evaluate Dagger on both a set of over 1200 known malicious Android apps, and a second set of 1000 apps randomly selected from a corpus of over 18,000 Google Play apps. Our evaluation shows that Dagger can effectively vet sensitive behaviors in apps, especially for those using complex obfuscation techniques. We measured the overhead based on a representative benchmark app, and found that both the memory and CPU overhead are less than 10%. The runtime overhead is less than 63%, which is significantly lower than that of existing approaches.
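The vetting step can be illustrated as matching a small behavior pattern against the provenance graph assembled from runtime events. The event names and the pattern below are invented for illustration and are not Dagger's actual rules.

```python
# Illustrative sketch of behavior vetting: record provenance edges from
# (hypothetical) syscall and Binder traces, then test whether a known
# sensitive-behavior pattern is present in the observed graph.
edges = set()  # (source, relation, target) triples from observed events

def observe(src, relation, dst):
    edges.add((src, relation, dst))

def matches(pattern):
    """True if every (src, rel, dst) triple in the pattern was observed."""
    return all(t in edges for t in pattern)

# Events reconstructed from runtime traces of a hypothetical app:
observe("app", "reads", "contacts.db")
observe("app", "binder_call", "SmsManager")

LEAK_VIA_SMS = [("app", "reads", "contacts.db"),
                ("app", "binder_call", "SmsManager")]
print(matches(LEAK_VIA_SMS))  # True
```

Because the graph is built from low-level runtime events rather than app code, this style of matching is robust to code-level obfuscation.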
IEEE International Symposium on Policies for Distributed Systems and Networks | 2011
Ashish Gehani; Dawood Tariq; Basim Baig; Tanu Malik
Reproducibility has been a cornerstone of the scientific method for hundreds of years. The range of sources from which data now originates, the diversity of the individual manipulations performed, and the complexity of the orchestrations of these operations all limit the reproducibility that a scientist can ensure solely by manually recording their actions. We use an architecture where aggregation, fusion, and composition policies define how provenance records can be automatically merged to facilitate the analysis and reproducibility of experiments. We show that the overhead of collecting and storing provenance metadata can vary dramatically depending on the policy used to integrate it.
Hawaii International Conference on System Sciences | 2013
Hasnain Lakhani; Rashid Tahir; Azeem Aqil; Fareed Zaffar; Dawood Tariq; Ashish Gehani
Large data processing tasks can be executed using workflow management systems. When either the input data or the programs in the pipeline are modified, the workflow must be re-executed to ensure that the final output data is updated to reflect the changes. Since such re-computation can consume substantial resources, optimizing the system to avoid redundant computation is desirable. In the case of a workflow, the dependency relationships between files are specified at the outset and can be leveraged to track which programs need to be re-executed when particular files change. Current distributed systems cannot provide such functionality when no predefined workflows exist. In this paper, we present an architecture that produces both correct output and fast re-execution by leveraging the provenance of data to propagate changes along an implicit dependency graph. We explore the tradeoff between storage and availability by presenting a performance analysis of our rollback and re-execution scheme.
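The change-propagation idea above can be sketched as a walk over the implicit dependency graph: when a file changes, only the derived files downstream of it need to be recomputed. The function and file names below are illustrative, not the paper's implementation.

```python
# Minimal sketch of provenance-driven re-execution: record which outputs
# were derived from which inputs, then, when a file changes, walk the
# dependency graph downstream to find everything that is now stale.
from collections import defaultdict, deque

consumers = defaultdict(set)   # file -> files derived from it

def derive(output, *inputs):
    for i in inputs:
        consumers[i].add(output)

def stale_after(changed):
    """Return every derived file to recompute, in breadth-first order."""
    order, queue, seen = [], deque([changed]), {changed}
    while queue:
        f = queue.popleft()
        for out in consumers[f]:
            if out not in seen:
                seen.add(out)
                order.append(out)
                queue.append(out)
    return order

derive("counts.txt", "raw.log")
derive("report.pdf", "counts.txt", "template.tex")
print(stale_after("raw.log"))  # ['counts.txt', 'report.pdf']
```

Unlike a make-style workflow, the dependency edges here are discovered from observed provenance rather than declared up front.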
ACM Symposium on Applied Computing | 2013
Minyoung Kim; Ashish Gehani; Je-Min Kim; Dawood Tariq; Mark-Oliver Stehr; Jin-Soo Kim
Emerging applications such as search-and-rescue operations, CNS (communication, navigation, surveillance), smart spaces, vehicular networks, mission-critical infrastructure, and disaster control require reliable content distribution under harsh network conditions and all kinds of component failures. In such scenarios, potentially heterogeneous networked components --- where the networks lack reliable connections --- need to be managed to improve scalability, performance, and availability of the overall system. Inspired by delay- and disruption-tolerant networking, this paper presents a distributed cross-layer monitoring and optimization method for secure content delivery as a first step toward decentralized content-based mobile ad hoc networking. In particular, we address the availability maximization problem by embedding monitoring and optimization within an existing content-distribution framework. The implications of policies at the security, caching, and hardware layers that control in-network storage and hop-by-hop dissemination of content are then analyzed to maximize content availability in disruptive environments. Additional benefits can be obtained by optimizing the control based on continuously observing the response to anomalies caused by cyber-attacks. For example, if excessive (potentially fraudulent) content is injected, the content distribution system can adapt without significantly compromising availability.
TaPP '12 Proceedings of the 4th USENIX Workshop on the Theory and Practice of Provenance | 2012
Dawood Tariq; Maisem Ali; Ashish Gehani
TaPP '13 Proceedings of the 5th USENIX Workshop on the Theory and Practice of Provenance | 2013
Nathaniel Husted; Sharjeel Qureshi; Dawood Tariq; Ashish Gehani
EDBT/ICDT Workshops | 2013
Ashish Gehani; Dawood Tariq