Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ahmad Mahmoody is active.

Publication


Featured research published by Ahmad Mahmoody.


Genome Biology | 2013

THetA: inferring intra-tumor heterogeneity from high-throughput DNA sequencing data

Layla Oesper; Ahmad Mahmoody; Benjamin J. Raphael

Tumor samples are typically heterogeneous, containing admixture by normal, non-cancerous cells and one or more subpopulations of cancerous cells. Whole-genome sequencing of a tumor sample yields reads from this mixture, but does not directly reveal the cell of origin for each read. We introduce THetA (Tumor Heterogeneity Analysis), an algorithm that infers the most likely collection of genomes and their proportions in a sample, for the case where copy number aberrations distinguish subpopulations. THetA successfully estimates normal admixture and recovers clonal and subclonal copy number aberrations in real and simulated sequencing data. THetA is available at http://compbio.cs.brown.edu/software/
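A toy illustration of the inference task THetA addresses (not the actual algorithm, which infers full genome collections): given read depths over genomic intervals and a candidate copy-number profile for a single tumor subpopulation, estimate the tumor fraction that best explains the data. All names, the grid search, and the least-squares objective here are illustrative assumptions.

```python
def fit_tumor_fraction(observed_depths, tumor_copy_numbers, normal_copy=2, steps=1000):
    """Grid-search the tumor fraction mu that best explains interval read depths.

    Expected depth in interval i is proportional to
    (1 - mu) * normal_copy + mu * tumor_copy_numbers[i];
    depths are compared after normalizing to sum to 1.
    """
    total = sum(observed_depths)
    obs = [d / total for d in observed_depths]
    best_mu, best_err = 0.0, float("inf")
    for s in range(steps + 1):
        mu = s / steps
        pred = [(1 - mu) * normal_copy + mu * c for c in tumor_copy_numbers]
        z = sum(pred)
        err = sum((p / z - o) ** 2 for p, o in zip(pred, obs))
        if err < best_err:
            best_mu, best_err = mu, err
    return best_mu

# A 60% tumor sample with a deletion (copy number 1) in the second
# interval and an amplification (copy number 3) in the third:
mu_true = 0.6
copies = [2, 1, 3]
depths = [(1 - mu_true) * 2 + mu_true * c for c in copies]
mu_hat = fit_tumor_fraction(depths, copies)
```

On noise-free data the recovered fraction matches the simulated one exactly; real data would require a likelihood model over read counts, as THetA uses.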


Bioinformatics | 2014

A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data

Iman Hajirasouliha; Ahmad Mahmoody; Benjamin J. Raphael

Motivation: High-throughput sequencing of tumor samples has shown that most tumors exhibit extensive intra-tumor heterogeneity, with multiple subpopulations of tumor cells containing different somatic mutations. Recent studies have quantified this intra-tumor heterogeneity by clustering mutations into subpopulations according to the observed counts of DNA sequencing reads containing the variant allele. However, these clustering approaches do not consider that the population frequencies of different tumor subpopulations are correlated by their shared ancestry in the same population of cells.

Results: We introduce the binary tree partition (BTP), a novel combinatorial formulation of the problem of constructing the subpopulations of tumor cells from the variant allele frequencies of somatic mutations. We show that finding a BTP is an NP-complete problem; derive an approximation algorithm for an optimization version of the problem; and present a recursive algorithm to find a BTP with errors in the input. We show that the resulting algorithm outperforms existing clustering approaches on simulated and real sequencing data.

Availability and implementation: Python and MATLAB implementations of our method are available at http://compbio.cs.brown.edu/software/

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.


knowledge discovery and data mining | 2016

Scalable Betweenness Centrality Maximization via Sampling

Ahmad Mahmoody; Charalampos E. Tsourakakis; Eli Upfal

Betweenness centrality (BWC) is a fundamental centrality measure in social network analysis. Given a large-scale network, how can we find the most central nodes? This question is of great importance to many key applications that rely on BWC, including community detection and understanding graph vulnerability. Despite the large amount of work on scalable approximation algorithm design for BWC, estimating BWC on large-scale networks remains a computational challenge. In this paper, we study the Centrality Maximization problem (CMP): given a graph G = (V, E) and a positive integer k, find a set S* ⊆ V that maximizes BWC subject to the cardinality constraint |S*| ≤ k. We present an efficient randomized algorithm that provides a (1 − 1/e − ε)-approximation with high probability, where ε > 0. Our results improve the current state-of-the-art result [40]. Furthermore, we provide the first theoretical evidence for the validity of a crucial assumption in betweenness centrality estimation, namely that in real-world networks O(|V|²) shortest paths pass through the top-k central nodes, where k is a constant. This also explains why our algorithm runs in near linear time on real-world networks. We also show that our algorithm and analysis can be applied to a wider range of centrality measures, by providing a general analytical framework. On the experimental side, we perform an extensive experimental analysis of our method on real-world networks, demonstrate its accuracy and scalability, and study different properties of central nodes. Then, we compare the sampling method used by the state-of-the-art algorithm with our method. Furthermore, we perform a study of BWC in time evolving networks, and see how the centrality of the central nodes in the graphs changes over time. Finally, we compare the performance of the stochastic Kronecker model [28] to real data, and observe that it generates a similar growth pattern.
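The sample-then-greedily-cover strategy the abstract describes can be sketched as follows. This is an illustrative simplification (uniform pair sampling, one BFS shortest path per pair, unweighted undirected graph, hypothetical function names), not the paper's actual algorithm or its sample-size analysis:

```python
import random
from collections import deque

def sample_shortest_path(adj, s, t):
    """Return one shortest s-t path via BFS, or None if disconnected."""
    parent = {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        if u == t:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                q.append(v)
    return None

def greedy_central_set(adj, k, num_samples=2000, seed=0):
    """Greedy cover of sampled shortest paths, in the spirit of the
    (1 - 1/e - eps) guarantee for submodular maximization.

    Each sample is the set of *interior* vertices of a shortest path
    between a uniformly random pair; we greedily pick k vertices that
    hit the most still-uncovered samples.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    samples = []
    for _ in range(num_samples):
        s, t = rng.sample(nodes, 2)
        p = sample_shortest_path(adj, s, t)
        if p and len(p) > 2:
            samples.append(set(p[1:-1]))  # endpoints do not count
    chosen = set()
    for _ in range(k):
        counts = {}
        for path in samples:
            if not path & chosen:  # sample not yet covered
                for v in path:
                    counts[v] = counts.get(v, 0) + 1
        if not counts:
            break
        chosen.add(max(counts, key=counts.get))
    return chosen

# Star graph: the center lies on every leaf-to-leaf shortest path.
star = {0: [1, 2, 3, 4, 5], **{i: [0] for i in range(1, 6)}}
top = greedy_central_set(star, k=1)
```

The greedy step is where the coverage function's submodularity pays off; the paper's contribution lies in bounding how many samples suffice, which this sketch leaves as a fixed parameter.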


research in computational molecular biology | 2013

Inferring intra-tumor heterogeneity from high-throughput DNA sequencing data

Layla Oesper; Ahmad Mahmoody; Benjamin J. Raphael

Cancer is a disease driven in part by somatic mutations that accumulate during the lifetime of an individual. The clonal theory [1] posits that the cancerous cells in a tumor are descended from a single founder cell and that descendants of this cell acquired multiple mutations beneficial for tumor growth through rounds of selection and clonal expansion. A tumor is thus a heterogeneous population of cells, with different subpopulations of cells containing both clonal mutations, acquired by the founder cell or during early rounds of clonal expansion, and subclonal mutations that occurred after the most recent clonal expansion. Most cancer sequencing projects sequence a mixture of cells from a tumor sample, including admixture by normal (non-cancerous) cells and different subpopulations of cancerous cells. In addition, most solid tumors exhibit extensive aneuploidy and copy number aberrations. Intra-tumor heterogeneity and aneuploidy conspire to complicate analysis of somatic mutations in sequenced tumor samples.


BMC Bioinformatics | 2012

Reconstructing genome mixtures from partial adjacencies.

Ahmad Mahmoody; Crystal L. Kahn; Benjamin J. Raphael

Many cancer genome sequencing efforts are underway with the goal of identifying the somatic mutations that drive cancer progression. A major difficulty in these studies is that tumors are typically heterogeneous, with individual cells in a tumor having different complements of somatic mutations. However, nearly all DNA sequencing technologies sequence DNA from multiple cells, thus resulting in measurement of mutations from a mixture of genomes. Genome rearrangements are a major class of somatic mutations in many tumors, and the novel adjacencies (i.e., breakpoints) resulting from these rearrangements are readily detected from DNA sequencing reads. However, the assignment of each rearrangement, or adjacency, to an individual cancer genome in the mixture is not known. Moreover, the quantity of DNA sequence reads may be insufficient to measure all rearrangements in all genomes in the tumor. Motivated by this application, we formulate the k-minimum completion problem (k-MCP). In this problem, we aim to reconstruct k genomes derived from a single reference genome, given partial information about the adjacencies present in the mixture of these genomes. We show that the 1-MCP is solvable in linear time in the cases where: (i) the measured, incomplete genome has a single circular or linear chromosome; (ii) there are no restrictions on the chromosomal content of the measured, incomplete genome. We also show that the k-MCP problem, for k ≥ 3 in general, and the 2-MCP problem with the double-cut-and-join (DCJ) distance are NP-complete, when there are no restrictions on the chromosomal structure of the measured, incomplete genome. These results lay the foundation for future algorithmic studies of the k-MCP and the application of these algorithms to real cancer sequencing data.


web search and data mining | 2016

Wiggins: Detecting Valuable Information in Dynamic Networks Using Limited Resources

Ahmad Mahmoody; Matteo Riondato; Eli Upfal

Detecting new information and events in a dynamic network by probing individual nodes has many practical applications: discovering new webpages, analyzing influence properties in networks, and detecting failure propagation in electronic circuits or infections in public drinking water systems. In practice, it is infeasible for anyone but the owner of the network (if one exists) to monitor all nodes at all times. In this work we study the constrained setting where the observer can only probe a small set of nodes at each time step to check whether new pieces of information (items) have reached those nodes. We formally define the problem through an infinite time generating process that places new items in subsets of nodes according to an unknown probability distribution. Items have an exponentially decaying novelty, modeling their decreasing value. The observer uses a probing schedule (i.e., a probability distribution over the set of nodes) to choose, at each time step, a small set of nodes to check for new items. The goal is to compute a schedule that minimizes the average novelty of undetected items. We present an algorithm, WIGGINS, to compute the optimal schedule through convex optimization, and then show how it can be adapted when the parameters of the problem must be learned or change over time. We also present a scalable variant of WIGGINS for the MapReduce framework. The results of our experimental evaluation on real social networks demonstrate the practicality of our approach.
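A toy simulation can make the objective concrete: items arrive at nodes, their novelty decays exponentially, and a memoryless schedule decides which node to probe each step. Everything here (the per-node rates, the decay factor, the candidate schedules, function names) is an illustrative assumption, not WIGGINS itself, which computes the optimal schedule via convex optimization:

```python
import random

def simulate_probing(rates, schedule, decay=0.8, steps=5000, seed=0):
    """Average total novelty of undetected items under a memoryless schedule.

    Each step, node i independently receives a new item with probability
    rates[i]; every undetected item's novelty is multiplied by `decay` per
    step.  The observer probes one node drawn from `schedule` and clears
    the items there.  Returns the average undetected novelty per step.
    """
    rng = random.Random(seed)
    nodes = list(range(len(rates)))
    pending = [0.0] * len(rates)  # total novelty of undetected items per node
    total = 0.0
    for _ in range(steps):
        for i, r in enumerate(rates):
            pending[i] *= decay
            if rng.random() < r:
                pending[i] += 1.0  # fresh item, novelty 1
        probe = rng.choices(nodes, weights=schedule, k=1)[0]
        pending[probe] = 0.0  # items at the probed node are detected
        total += sum(pending)
    return total / steps

rates = [0.5, 0.05, 0.05, 0.05]   # one busy node, three quiet ones
uniform = [0.25] * 4
skewed = [0.7, 0.1, 0.1, 0.1]     # probe the busy node more often
```

Running both schedules shows the skewed one leaving substantially less undetected novelty, which is the gap the paper's optimization closes in a principled way.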


advances in social networks analysis and mining | 2017

Real-Time Targeted-Influence Queries over Large Graphs

Alessandro Epasto; Ahmad Mahmoody; Eli Upfal

Social networks are important communication and information media. Individuals in a social network share information and influence each other through their social connections. Understanding social influence and information diffusion is a fundamental research endeavor and it has important applications in online social advertising and viral marketing. In this work, we introduce the Targeted-Influence problem (TIP): Given a network G = (V, E) and a model of influence, we want to be able to estimate in real-time (e.g., a few seconds per query) the influence of a subset of users S over another subset of users T, for any possible query (S, T), S, T ⊆ V. To do so, we allow an efficient preprocessing. We provide the first scalable real-time algorithm for TIP. Our algorithm requires Õ(|V| + |E|) space and preprocessing time, and it provides a provable approximation of the influence of S over T, for all subsets of nodes S, T ⊆ V in the query with large enough influence. The running time for answering each query (a.k.a. the query stage) is theoretically guaranteed to be Õ(|S| + |T|) for general undirected graphs, and for directed graphs under certain assumptions supported by experiments. We also introduce the Snapshot model as our model of influence, which extends and includes as special cases both the Independent Cascade and the Linear Threshold models. The analysis and the theoretical guarantees of our algorithms hold under the more general Snapshot model. Finally, we perform an extensive experimental analysis, demonstrating the accuracy, efficiency, and scalability of our methods.
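The quantity being queried can be illustrated with a brute-force Monte-Carlo baseline under the Independent Cascade model, which the abstract notes is a special case of the Snapshot model: sample random "snapshots" in which each edge survives with probability p, and average how many target nodes are reachable from the seed set. This is the naive estimator the paper's preprocessing-based algorithm is designed to beat at query time; names and parameters are illustrative.

```python
import random
from collections import deque

def estimate_influence(edges, p, S, T, snapshots=500, seed=0):
    """Monte-Carlo estimate of the influence of seed set S over target set T.

    Each directed edge survives in a random snapshot with probability p;
    the influence of S over T is the expected number of nodes in T
    reachable from S in a snapshot (seeds count if they lie in T).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(snapshots):
        live = {}
        for u, v in edges:
            if rng.random() < p:
                live.setdefault(u, []).append(v)
        seen = set(S)
        q = deque(S)
        while q:  # BFS over the surviving edges
            u = q.popleft()
            for v in live.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    q.append(v)
        total += len(seen & set(T))
    return total / snapshots

# Chain 0 -> 1 -> 2 with p = 1: node 0 always influences both targets.
chain_influence = estimate_influence([(0, 1), (1, 2)], 1.0, {0}, {1, 2})
```

Each query here costs a fresh set of snapshot BFS traversals; the point of the paper is to move that cost into an Õ(|V| + |E|) preprocessing step so queries run in Õ(|S| + |T|).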


conference on combinatorial optimization and applications | 2015

Optimizing Static and Adaptive Probing Schedules for Rapid Event Detection

Ahmad Mahmoody; Evgenios M. Kornaropoulos; Eli Upfal

We formulate and study a fundamental search and detection problem, Schedule Optimization, motivated by a variety of real-world applications, ranging from monitoring content changes on the web, social networks, and user activities to detecting failures in large systems with many individual machines. We consider a large system consisting of many nodes, where each node has its own rate of generating new events, or items. A monitoring application can probe a small number of nodes at each step, and our goal is to compute a probing schedule that minimizes the expected number of undiscovered items in the system, or equivalently, minimizes the expected time to discover a new item in the system. We study the Schedule Optimization problem for both deterministic and randomized memoryless algorithms. We provide lower bounds on the cost of an optimal schedule and construct close-to-optimal schedules with rigorous mathematical guarantees. Finally, we present an adaptive algorithm that starts with no prior information on the system and converges to the optimal memoryless algorithm by adapting to observed data.
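A back-of-the-envelope version of the trade-off (an illustration, not the paper's construction or its guarantees): if node i generates items at rate λᵢ and a memoryless schedule probes it with probability pᵢ per step, an item at node i waits 1/pᵢ steps in expectation, so the expected number of undiscovered items is roughly Σᵢ λᵢ/pᵢ. By Cauchy-Schwarz this is minimized over Σpᵢ = 1 at pᵢ ∝ √λᵢ. A sketch with hypothetical names:

```python
import math

def sqrt_schedule(rates):
    """Memoryless schedule minimizing sum(rate_i / p_i) s.t. sum(p_i) = 1.

    An item at node i waits 1/p_i steps on average before its node is
    probed, so the expected undiscovered count is sum_i rates[i] / p_i;
    Cauchy-Schwarz gives the minimizer p_i proportional to sqrt(rates[i]).
    """
    roots = [math.sqrt(r) for r in rates]
    z = sum(roots)
    return [x / z for x in roots]

def expected_undiscovered(rates, schedule):
    """Expected number of undiscovered items under a memoryless schedule."""
    return sum(r / p for r, p in zip(rates, schedule))

rates = [0.4, 0.1, 0.1]
opt = sqrt_schedule(rates)      # [0.5, 0.25, 0.25]
uniform = [1 / 3] * 3
```

Note how the optimal schedule probes the busy node more often than a uniform schedule, but less than proportionally to its rate; the square root dampens the skew.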


Linear Algebra and its Applications | 2009

On zero-sum 6-flows of graphs

Saieed Akbari; N. Ghareghani; Gholamreza B. Khosrovshahi; Ahmad Mahmoody


Ars Combinatoria | 2009

A Note on Graceful Graphs with Large Chromatic Numbers.

Ahmad Mahmoody

Collaboration


Dive into Ahmad Mahmoody's collaborations.

Top Co-Authors
