Mahbubul Majumder
Iowa State University
Publications
Featured research published by Mahbubul Majumder.
Journal of the American Statistical Association | 2013
Mahbubul Majumder; Heike Hofmann; Dianne Cook
Statistical graphics play a crucial role in exploratory data analysis, model checking, and diagnosis. The lineup protocol enables statistical significance testing of visual findings, bridging the gulf between exploratory and inferential statistics. In this article, inferential methods for statistical graphics are developed further by refining the terminology of visual inference and framing the lineup protocol in a context that allows direct comparison with conventional tests in scenarios when a conventional test exists. This framework is used to compare the performance of the lineup protocol against conventional statistical testing in the scenario of fitting linear models. A human subjects experiment is conducted using simulated data to provide controlled conditions. Results suggest that the lineup protocol performs comparably with the conventional tests, and expectedly outperforms them when data are contaminated, a scenario where assumptions required for performing a conventional test are violated. Surprisingly, visual tests have higher power than the conventional tests when the effect size is large. And, interestingly, there may be some super-visual individuals who yield better performance and power than the conventional test even in the most difficult tasks. Supplementary materials for this article are available online.
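As a minimal sketch of how a lineup result can be turned into a significance statement, the Python snippet below computes a visual p-value under simplifying assumptions that the abstract does not spell out: each of K independent observers sees a lineup of m plots, and under the null hypothesis each observer picks the data plot with probability 1/m. The function name `lineup_p_value` and the default lineup size of 20 are illustrative choices, not taken from the article.

```python
from math import comb


def lineup_p_value(num_observers: int, num_correct: int, lineup_size: int = 20) -> float:
    """Return P(X >= num_correct) for X ~ Binomial(num_observers, 1 / lineup_size).

    Assumption (not from the abstract): observers are independent, and under
    the null each one identifies the data plot with probability 1 / lineup_size.
    """
    if lineup_size < 2:
        raise ValueError("a lineup needs at least two plots")
    if not 0 <= num_correct <= num_observers:
        raise ValueError("num_correct must lie between 0 and num_observers")

    p = 1.0 / lineup_size
    # Upper tail of the binomial distribution, summed term by term.
    return sum(
        comb(num_observers, k) * p**k * (1.0 - p) ** (num_observers - k)
        for k in range(num_correct, num_observers + 1)
    )


if __name__ == "__main__":
    # Hypothetical example: 4 of 10 observers pick the data plot out of 20 plots.
    print(lineup_p_value(num_observers=10, num_correct=4))
```

With a single observer and 20 plots, a correct pick corresponds to a p-value of 1/20 under these assumptions, which is why lineups of size 20 pair naturally with a 0.05 significance level.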
IEEE Transactions on Visualization and Computer Graphics | 2012
Heike Hofmann; Lendie Follett; Mahbubul Majumder; Dianne Cook
Lineups [4, 28] have been established as tools for visual testing similar to standard statistical inference tests, allowing us to evaluate the validity of graphical findings in an objective manner. In simulation studies [12] lineups have been shown to be efficient: the power of visual tests is comparable to classical tests while being much less stringent in terms of distributional assumptions made. This makes lineups versatile, yet powerful, tools in situations where conditions for regular statistical tests are not or cannot be met. In this paper we introduce lineups as a tool for evaluating the power of competing graphical designs. We highlight some of the theoretical properties and then show results from two studies evaluating competing designs: both studies are designed to go to the limits of our perceptual abilities to highlight differences between designs. We use both accuracy and speed of evaluation as measures of a successful design. The first study compares the choice of coordinate system: polar versus Cartesian coordinates. The results show strong support in favor of Cartesian coordinates in finding fast and accurate answers to spotting patterns. The second study is aimed at finding shift differences between distributions. Both studies are motivated by data problems that we have recently encountered, and explore using simulated data to evaluate the plot designs under controlled conditions. Amazon Mechanical Turk (MTurk) is used to conduct the studies. The lineups provide an effective mechanism for objectively evaluating plot designs.
Plant Cell and Environment | 2014
Sarah Elizabeth Atwood; Jamie A. O'Rourke; Gregory A. Peiffer; Tengfei Yin; Mahbubul Majumder; Chunquan Zhang; Silvia R. Cianzio; John H. Hill; Dianne Cook; Steven A. Whitham; Randy C. Shoemaker; Michelle A. Graham
In soybean [Glycine max (L.) Merr.], iron deficiency results in interveinal chlorosis and decreased photosynthetic capacity, leading to stunting and yield loss. In this study, gene expression analyses investigated the role of soybean replication protein A (RPA) subunits during iron stress. Nine RPA homologs were significantly differentially expressed in response to iron stress in the near isogenic lines (NILs) Clark (iron efficient) and Isoclark (iron inefficient). RPA homologs exhibited opposing expression patterns in the two NILs, with RPA expression significantly repressed during iron deficiency in Clark but induced in Isoclark. We used virus induced gene silencing (VIGS) to repress GmRPA3 expression in the iron inefficient line Isoclark and mirror expression in Clark. GmRPA3-silenced plants had improved IDC symptoms and chlorophyll content under iron deficient conditions and also displayed stunted growth regardless of iron availability. RNA-Seq comparing gene expression between GmRPA3-silenced and empty vector plants revealed massive transcriptional reprogramming with differential expression of genes associated with defense, immunity, aging, death, protein modification, protein synthesis, photosynthesis and iron uptake and transport genes. Our findings suggest the iron efficient genotype Clark is able to induce energy controlling pathways, possibly regulated by SnRK1/TOR, to promote nutrient recycling and stress responses in iron deficient conditions.
Frontiers in Bioengineering and Biotechnology | 2016
Bhanwar Lal Puniya; Laura Allen; Colleen Hochfelder; Mahbubul Majumder; Tomáš Helikar
Dysregulation in signal transduction pathways can lead to a variety of complex disorders, including cancer. Computational approaches such as network analysis are important tools to understand system dynamics as well as to identify critical components that could be further explored as therapeutic targets. Here, we performed perturbation analysis of a large-scale signal transduction model in extracellular environments that stimulate cell death, growth, motility, and quiescence. Each of the model’s components was perturbed under both loss-of-function and gain-of-function mutations. Using 1,300 simulations under both types of perturbations across various extracellular conditions, we identified the most and least influential components based on the magnitude of their influence on the rest of the system. Based on the premise that the most influential components might serve as better drug targets, we characterized them for biological functions, housekeeping genes, essential genes, and druggable proteins. The most influential components under all environmental conditions were enriched with several biological processes. The inositol pathway was found as most influential under inactivating perturbations, whereas the kinase and small lung cancer pathways were identified as the most influential under activating perturbations. The most influential components were enriched with essential genes and druggable proteins. Moreover, known cancer drug targets were also classified in influential components based on the affected components in the network. Additionally, the systemic perturbation analysis of the model revealed a network motif of most influential components which affect each other. Furthermore, our analysis predicted novel combinations of cancer drug targets with various effects on other most influential components. 
We found that the combinatorial perturbation consisting of PI3K inactivation and overactivation of IP3R1 can lead to increased activity levels of apoptosis-related components and tumor-suppressor genes, suggesting that this combinatorial perturbation may lead to a better target for decreasing cell proliferation and inducing apoptosis. Finally, our approach shows a potential to identify and prioritize therapeutic targets through systemic perturbation analysis of large-scale computational models of signal transduction. Although some components of the presented computational results have been validated against independent gene expression data sets, more laboratory experiments are warranted to more comprehensively validate the presented results.
International Journal of Intelligent Technologies and Applied Statistics | 2013
Yifan Zhao; Dianne Cook; Heike Hofmann; Mahbubul Majumder; Niladri Roy Chowdhury
Graphical statistics plays a very important role in research and industry. For a statistician, it is very useful to see how people make decisions based on plots, so that the plots can be improved for better performance. With the help of eye-tracking equipment, researchers can show people several plots, ask a question about each, and then track each person’s eyes to see how they go through the plots to arrive at their answer. This paper discusses the process and results of an experiment observing what people look at in statistical plots. The experiment is part of a larger study of decision making and signal strength in statistical graphics that uses Amazon’s Mechanical Turk.
Journal of Data Mining in Genomics & Proteomics | 2013
Tengfei Yin; Mahbubul Majumder; Niladri Roy Chowdhury; Dianne Cook; Randy Shoemaker; Michelle A. Graham
In an analysis of RNA-Seq data from soybeans, initial significance testing using one software package produced very different gene lists from those yielded by another. How can this happen? This paper demonstrates how the disparities between the results were investigated, and can be explained. This type of contradiction can occur more generally in high-throughput analyses. To explore the model fitting and hypothesis testing, we implemented an interactive graphic that allows the exploration of the effect of dispersion estimation on the overall estimation of variance and differential expression tests. In addition, we propose a new procedure to test for the presence of any structure in biological data.
Journal of Computational and Graphical Statistics | 2018
Niladri Roy Chowdhury; Dianne Cook; Heike Hofmann; Mahbubul Majumder
Graphics play a crucial role in statistical analysis and data mining. Quantifying the structure in data that is visible in plots, and how people read that structure from plots, is an ongoing challenge. The lineup protocol provides a formal framework for data plots, making inference possible. The data plot is treated like a test statistic, and the lineup protocol acts like a comparison with the sampling distribution of the nulls. This article describes metrics for describing structure in data plots and evaluates them in relation to the choices that human readers made during several large Amazon Turk studies using lineups. The metrics that were more specific to the plot types tended to match subject choices better than generic metrics. The process that we followed to evaluate metrics will be useful for the general development of numerical measures of structure in plots, and also in future experiments on lineups for choosing blocks of pictures. Supplementary materials for this article are available online.
Journal of Computational and Graphical Statistics | 2017
Mahbubul Majumder; Xiaoyue Cheng
Donoho’s article “50 Years of Data Science” is a well-thought-out explanation of a newly developed discipline called “data science.” In this article, we examine his explanations and suggestions about data science, follow up on some of the issues he mentioned, and share our experiences in developing a data science curriculum and teaching related courses.
Computational Statistics | 2015
Niladri Roy Chowdhury; Dianne Cook; Heike Hofmann; Mahbubul Majumder; Eun-Kyung Lee; Amy L. Toth
Annual Review of Statistics and Its Application | 2016
Dianne Cook; Eun-Kyung Lee; Mahbubul Majumder
