Atif Rahman
Bangladesh University of Engineering and Technology
Publication
Featured research published by Atif Rahman.
Genome Biology | 2013
Atif Rahman; Lior Pachter
Assembly algorithms have been extensively benchmarked using simulated data so that results can be compared to ground truth. However, in de novo assembly, only crude metrics such as contig number and size are typically used to evaluate assembly quality. We present CGAL, a novel likelihood-based approach to assembly assessment in the absence of a ground truth. We show that likelihood is more accurate than other metrics currently used for evaluating assemblies, and describe its application to the optimization and comparison of assembly algorithms. Our methods are implemented in software that is freely available at http://bio.math.berkeley.edu/cgal/.
Electronic Notes in Discrete Mathematics | 2010
Mahfuza Sharmin; Rukhsana Yeasmin; Masud Hasan; Atif Rahman; M. Sohel Rahman
In this paper, we give approximation algorithms for several variations of the pancake flipping problem, which is also well known as the problem of sorting by prefix reversals. We consider variations of the sorting process that add operations such as prefix transpositions and prefix transreversals alongside prefix reversals.
Journal of Discrete Algorithms | 2015
Masud Hasan; Atif Rahman; Md. Khaledur Rahman; M. Sohel Rahman; Mahfuza Sharmin; Rukhsana Yeasmin
In this paper, we study several variations of the pancake flipping problem, which is also well known as the problem of sorting by prefix reversals. We consider variations of the sorting process in which other similar operations, such as prefix transpositions and prefix transreversals, are added to prefix reversals. We first study the problem of sorting unsigned permutations by prefix reversals and prefix transpositions and present a 3-approximation algorithm for this problem. Then we give a 2-approximation algorithm for sorting by prefix reversals and prefix transreversals. We also provide a 3-approximation algorithm for sorting by prefix reversals and prefix transpositions where the operations are always applied at the unsorted suffix of the permutation. We further analyze the problem from a practical point of view and show quantitatively how the approximation ratios of our algorithms improve as the number of prefix reversals applied by optimal algorithms increases. Finally, we present experimental results to support our analysis.
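To make the operations named in the abstract above concrete, the following is a minimal Python sketch of a prefix reversal and a prefix transposition, together with the classic greedy flip sort that uses prefix reversals only. It is not the approximation algorithm from the paper; the function names and the indexing convention for the transposition are assumptions chosen for illustration.

```python
def prefix_reversal(perm, k):
    """Reverse the first k elements of perm (one pancake flip)."""
    return perm[:k][::-1] + perm[k:]


def prefix_transposition(perm, i, j):
    """Move the prefix perm[:i] so that it sits right after the block perm[i:j].

    Assumed convention for this sketch: 0 < i < j <= len(perm).
    """
    return perm[i:j] + perm[:i] + perm[j:]


def greedy_pancake_sort(perm):
    """Sort perm with prefix reversals only, using the textbook greedy strategy.

    Returns the sorted list and the sequence of flip lengths applied.
    This is a simple baseline, not an approximation algorithm from the paper.
    """
    perm = list(perm)
    flips = []
    for size in range(len(perm), 1, -1):
        # Locate the largest element of the still-unsorted prefix perm[:size].
        pos = perm.index(max(perm[:size]))
        if pos == size - 1:
            continue  # already in its final position
        if pos != 0:
            # Bring the largest element to the front.
            perm = prefix_reversal(perm, pos + 1)
            flips.append(pos + 1)
        # Flip it into its final position at index size - 1.
        perm = prefix_reversal(perm, size)
        flips.append(size)
    return perm, flips
```

Replaying the returned flip lengths with `prefix_reversal` on the original permutation reproduces the sorted order, which is a convenient sanity check when experimenting with other operation sets.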
Workshop on Algorithms and Computation | 2018
Suri Dipannita Sayeed; M. Sohel Rahman; Atif Rahman
Motif finding is the problem of identifying recurring patterns in sequences. It has been widely studied and several variants have been proposed. Here, we address the problem of finding common motifs with gaps that are present in all strings of a finite set. We prove that the problem is NP-hard by reducing the multiple longest common subsequence (MLCS) problem to it. We also provide a branch and bound algorithm for MLCS and show how the algorithm can be extended to give an algorithm for finding common motifs with gaps after common factors that occur in all the strings have been identified.
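As an illustration of the object involved in the reduction mentioned above, the sketch below checks whether a candidate string is a common subsequence of every string in a set, which is the feasibility condition underlying the MLCS problem. It is only a verification helper under assumed names; it does not implement the branch and bound algorithm described in the abstract.

```python
def is_subsequence(candidate, text):
    """Return True if candidate can be obtained from text by deleting characters."""
    remaining = iter(text)
    # `ch in remaining` advances the iterator, so matches must occur in order.
    return all(ch in remaining for ch in candidate)


def is_common_subsequence(candidate, strings):
    """Return True if candidate is a subsequence of every string in strings."""
    return all(is_subsequence(candidate, s) for s in strings)
```

For example, `is_common_subsequence("ACT", ["AGCTT", "TACGT", "ACT"])` holds because the letters A, C, T appear in that order in each string.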
eLife | 2018
Atif Rahman; Ingileif Hallgrimsdottir; Michael B. Eisen; Lior Pachter
Genome-wide association studies (GWAS) rely on microarrays, or more recently mapping of sequencing reads, to genotype individuals. The reliance on prior sequencing of a reference genome limits the scope of association studies, and also precludes mapping associations outside of the reference. We present an alignment-free method for association studies of categorical phenotypes based on counting k-mers in whole-genome sequencing reads, testing for associations directly between k-mers and the trait of interest, and local assembly of the statistically significant k-mers to identify sequence differences. An analysis of the 1000 genomes data shows that sequences identified by our method largely agree with results obtained using the standard approach. However, unlike standard GWAS, our method identifies associations with structural variations and sites not present in the reference genome. We also demonstrate that population stratification can be inferred from k-mers. Finally, application to an E. coli dataset on ampicillin resistance validates the approach.
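The first step described above, counting k-mers in reads and tabulating them per phenotype group, can be sketched in Python as follows. This is a simplified illustration under assumed function names: it only builds the per-group count table and performs no statistical test or local assembly, so it should not be read as the method's implementation.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_count_table(case_reads, control_reads, k):
    """Map each observed k-mer to its (case count, control count) pair.

    The table is the kind of input an association test would consume;
    the test itself is deliberately left out of this sketch.
    """
    case = count_kmers(case_reads, k)
    control = count_kmers(control_reads, k)
    # Counter returns 0 for k-mers missing from one group.
    return {kmer: (case[kmer], control[kmer]) for kmer in set(case) | set(control)}
```

K-mers whose counts differ sharply between the two groups are the candidates that a downstream test would examine.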
bioRxiv | 2018
Muhammad Ali Nayeem; Md. Shamsuzzoha Bayzid; Atif Rahman; Rifat Shahriyar; M. Sohel Rahman
Multiple sequence alignment (MSA) is a basic step in many analyses in computational biology, including predicting the structure and function of proteins, orthology prediction and estimating phylogenies. The objective of MSA is to infer the homology among the sequences of chosen species. Commonly, the MSAs are inferred by optimizing a single function or objective. The alignments estimated under one criterion may be different to the alignments generated by other criteria, inferring discordant homologies and thus leading to different evolutionary histories relating the sequences. In the recent past, researchers have advocated for the multi-objective formulation of MSA to address this issue, where multiple conflicting objective functions are optimized simultaneously to generate a set of alignments. However, no theoretical or empirical justification with respect to a real-life application has been shown for a particular multi-objective formulation. In this study, we investigate the impact of multi-objective formulation in the context of phylogenetic tree estimation. Employing multi-objective metaheuristics, we demonstrate that trees estimated on the alignments generated by multi-objective formulation are substantially better than the trees estimated by the state-of-the-art MSA tools, including PASTA, MUSCLE, CLUSTAL, MAFFT etc. We also demonstrate that highly accurate alignments with respect to popular measures like sum-of-pair (SP) score and total-column (TC) score do not necessarily lead to highly accurate phylogenetic trees. Thus, in essence, we ask whether a phylogeny-aware metric can guide us in choosing appropriate multi-objective formulations that can result in better phylogeny estimation, and we answer the question affirmatively through a carefully designed, extensive empirical study. As a by-product, we also suggest a methodology for primary selection of a set of objective functions for a multi-objective formulation based on the association with the resulting phylogenetic tree.
Journal of Network and Computer Applications | 2018
Ch. Md. Rakin Haider; Anindya Iqbal; Atif Rahman; M. Sohel Rahman
Mobile advertising enjoys a 51% share of the whole digital market nowadays. The advertising ecosystem faces a major threat from ad frauds caused by false display requests or clicks, generated by malicious codes, bot-nets, click-firms etc. Around 30% of revenue is being wasted due to frauds. Ad frauds in web advertising have been studied extensively; however, frauds in mobile advertising have received little attention. Studies have been conducted to detect fraudulent clicks in web and mobile advertisement. However, detection of individual fraudulent display in mobile advertising is yet to be explored to the best of our knowledge. We have proposed an ensemble based method to classify each individual ad display, also called an impression, as fraudulent or non-fraudulent. Our solution achieves as high as 99.32% accuracy, 96.29% precision and 84.75% recall using real datasets from a European commercial ad server. We have proposed some new features and analyzed their contribution using standard techniques. We have also designed a new mechanism to offer flexibility of tolerance to different ad servers in deciding whether an ad display is fraudulent or not.
bioRxiv | 2016
Atif Rahman; Lior Pachter
Scaffolding, i.e., ordering and orienting contigs, is an important step in genome assembly. We present a method for scaffolding based on likelihoods of genome assemblies. Generative models for sequencing are used to obtain maximum likelihood estimates of gaps between contigs and to estimate whether linking contigs into scaffolds would lead to an increase in the likelihood of the assembly. We then link contigs if they can be unambiguously joined or if the corresponding increase in likelihood is substantially greater than that of other possible joins of those contigs. The method is implemented in a tool called Swalo with approximations to make it efficient and applicable to large datasets. Analysis on real and simulated datasets reveals that it consistently makes more or a similar number of correct joins compared to other scaffolders while linking very few contigs incorrectly, thus outperforming other scaffolders and demonstrating that substantial improvement in genome assembly may be achieved through the use of statistical models. Swalo is freely available for download at https://atifrahman.github.io/SWALO/.
Journal of Discrete Algorithms | 2008
Atif Rahman; Swakkhar Shatabda; Masud Hasan
Workshop on Algorithms and Computation | 2007
Atif Rahman; Swakkhar Shatabda; Masud Hasan