Publication


Featured research published by Jaroslaw Zola.


Plant Journal | 2011

A brassinosteroid transcriptional network revealed by genome‐wide identification of BESI target genes in Arabidopsis thaliana

Xiaofei Yu; Lei Li; Jaroslaw Zola; Maneesha Aluru; Huaxun Ye; Andrew Foudree; Hongqing Guo; Sarah Anderson; Srinivas Aluru; Peng Liu; Steve Rodermel; Yanhai Yin

Brassinosteroids (BRs) are important regulators of plant growth and development. BRs signal to control the activities of the BES1 and BZR1 family transcription factors. The transcriptional network through which BES1 and BZR1 regulate a large number of target genes is mostly unknown. By combining chromatin immunoprecipitation coupled with Arabidopsis tiling arrays (ChIP-chip) and gene expression studies, we have identified 1609 putative BES1 target genes, 404 of which are regulated by BRs and/or in the gain-of-function bes1-D mutant. BES1 targets contribute to BR responses and interactions with other hormonal or light signaling pathways. Computational modeling of gene expression data using the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) reveals that BES1-targeted transcription factors form a gene regulatory network (GRN). Mutants of many genes in the network displayed defects in BR responses. Moreover, we found that BES1 functions to inhibit chloroplast development by repressing the expression of GLK1 and GLK2 transcription factors, confirming a hypothesis generated from the GRN. Our results thus provide a global view of BR-regulated gene expression and a GRN that guides future studies in understanding BR-regulated plant growth.


Conference on High Performance Computing (Supercomputing) | 2007

Large-scale maximum likelihood-based phylogenetic analysis on the IBM BlueGene/L

Michael Ott; Jaroslaw Zola; Alexandros Stamatakis; Srinivas Aluru

Phylogenetic inference is a grand challenge in Bioinformatics due to immense computational requirements. The increasing popularity of multi-gene alignments in biological studies, which typically provide a stable topological signal due to a more favorable ratio of the number of base pairs to the number of sequences, coupled with rapid accumulation of sequence data in general, poses new challenges for high performance computing. In this paper, we demonstrate how state-of-the-art Maximum Likelihood (ML) programs can be efficiently scaled to the IBM BlueGene/L (BG/L) architecture, by porting RAxML, which is currently among the fastest and most accurate programs for phylogenetic inference under the ML criterion. We simultaneously exploit coarse-grained and fine-grained parallelism that is inherent in every ML-based biological analysis. Performance is assessed using datasets consisting of 212 sequences and 566,470 base pairs, and 2,182 sequences and 51,089 base pairs, respectively. To the best of our knowledge, these are the largest datasets analyzed under ML to date. The capability to analyze such datasets will help to address novel biological questions via phylogenetic analyses. Our experimental results indicate that the fine-grained parallelization scales well up to 1,024 processors. Moreover, a larger number of processors can be efficiently exploited by a combination of coarse-grained and fine-grained parallelism. Finally, we demonstrate that our parallelization scales equally well on an AMD Opteron cluster with a less favorable network latency to processor speed ratio. We recorded super-linear speedups in several cases due to increased cache efficiency.


Plant Physiology | 2009

Chloroplast Photooxidation-Induced Transcriptome Reprogramming in Arabidopsis immutans White Leaf Sectors

Maneesha Aluru; Jaroslaw Zola; Andrew Foudree; Steven R. Rodermel

Arabidopsis (Arabidopsis thaliana) immutans (im) has green and white sectoring due to the action of a nuclear recessive gene, IMMUTANS. The green sectors contain normal-appearing chloroplasts, whereas the white sectors contain abnormal chloroplasts that lack colored carotenoids due to a defect in phytoene desaturase activity. Previous biochemical and molecular characterizations of the green leaf sectors revealed alterations suggestive of a source-sink relationship between the green and white sectors of im. In this study, we use an Affymetrix ATH1 oligoarray to further explore the nature of sink metabolism in im white tissues. We show that lack of colored carotenoids in the im white tissues elicits a differential response from a large number of genes involved in various cellular processes and stress responses. Gene expression patterns correlate with the repression of photosynthesis and photosynthesis-related processes in im white tissues, with an induction of Suc catabolism and transport, and with mitochondrial electron transport and fermentation. These results suggest that energy is derived via aerobic and anaerobic metabolism of imported sugar in im white tissues for growth and development. We also show that oxidative stress responses are largely induced in im white tissues; however, im green sectors develop additional energy-dissipating mechanisms that perhaps allow for the formation of green sectors. Furthermore, a comparison of the transcriptomes of im white and norflurazon-treated white leaf tissues reveals global as well as tissue-specific responses to photooxidation. We conclude that the differences in the mechanism of phytoene desaturase inhibition play an important role in differentiating these two white tissues.


Nucleic Acids Research | 2013

Reverse engineering and analysis of large genome-scale gene networks

Maneesha Aluru; Jaroslaw Zola; Dan Nettleton; Srinivas Aluru

Reverse engineering the whole-genome networks of complex multicellular organisms remains a challenge. While simpler models easily scale to a large number of genes and gene expression datasets, more accurate models are compute intensive, limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software tool, Gene Network Analyzer (GeNA), for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web.


IEEE Transactions on Parallel and Distributed Systems | 2010

Parallel Information-Theory-Based Construction of Genome-Wide Gene Regulatory Networks

Jaroslaw Zola; Maneesha Aluru; Abhinav Sarje; Srinivas Aluru

Constructing genome-wide gene regulatory networks from large-scale gene expression data is an important problem in systems biology. While several techniques have been developed, none of them is parallel, and they do not scale to the whole genome level or incorporate the largest data sets, particularly with rigorous statistical techniques. In this paper, we present a parallel method integrating mutual information, data processing inequality, and statistical testing to detect significant dependencies between genes, and efficiently exploit parallelism inherent in such computations. We present a new method to carry out permutation testing for assessing statistical significance of interactions, while reducing its computational complexity by a factor of $\Theta(n^2)$, where $n$ is the number of genes. Using both synthetic and known regulatory networks, we show that our method produces networks of quality similar to ARACNe, a widely used mutual-information-based method. We further explore the use of accelerators for gene network construction by presenting a parallelization on a cluster of IBM Cell blades. We exploit parallelization across multiple Cells, multiple cores within each Cell, and vector units within the cores to develop a high-performance implementation that effectively addresses the scaling problem. We report the first inference of a plant whole genome network by constructing a 15,222 gene network of the plant Arabidopsis thaliana from 3,137 microarray experiments in 30 minutes on a 2,048-CPU IBM Blue Gene/L, and in 2 hours and 25 minutes on an 8-node Cell blade cluster.
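
The data processing inequality step named in the abstract above can be illustrated with a small sketch. This is a simplified, sequential toy version and not the authors' parallel implementation: the function name `dpi_prune`, the `threshold` parameter, and the three-gene matrix are assumptions made for illustration. In every fully connected triple of genes, the edge with the lowest mutual information is treated as an indirect interaction and dropped.

```python
from itertools import combinations


def dpi_prune(mi, threshold=0.0):
    """Prune likely indirect edges from a mutual information matrix.

    mi is a symmetric n x n list of lists (hypothetical input format).
    Returns the set of retained edges as (i, j) pairs with i < j.
    """
    n = len(mi)
    # Keep only pairs whose MI exceeds the (assumed) significance threshold.
    edges = {(i, j) for i, j in combinations(range(n), 2) if mi[i][j] > threshold}
    removed = set()
    for i, j, k in combinations(range(n), 3):
        trio = [(i, j), (i, k), (j, k)]
        # Apply the inequality only to triangles whose three edges are all present.
        if all(edge in edges for edge in trio):
            weakest = min(trio, key=lambda e: mi[e[0]][e[1]])
            removed.add(weakest)
    return edges - removed


# Toy example: gene 0 -> gene 1 -> gene 2, with a weak indirect 0-2 signal.
toy_mi = [
    [0.0, 0.9, 0.3],
    [0.9, 0.0, 0.8],
    [0.3, 0.8, 0.0],
]
print(sorted(dpi_prune(toy_mi)))  # prints [(0, 1), (1, 2)]
```

Edges are marked for removal first and deleted only at the end, so the outcome does not depend on the order in which triples are visited.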


International Parallel and Distributed Processing Symposium | 2011

Parallel Metagenomic Sequence Clustering Via Sketching and Maximal Quasi-clique Enumeration on Map-Reduce Clouds

Xiao Yang; Jaroslaw Zola; Srinivas Aluru

Taxonomic clustering of species is an important and frequently arising problem in metagenomics. High-throughput next generation sequencing is facilitating the creation of large metagenomic samples, while at the same time making the clustering problem harder due to the short sequence length supported and unknown species sampled. In this paper, we present a parallel algorithm for hierarchical taxonomic clustering of large metagenomic samples with support for overlapping clusters. We adapt the sketching techniques originally developed for web document clustering to deduce significant similarities between pairs of sequences without resorting to expensive all vs. all alignments. We formulate the metagenomics classification problem as that of maximal quasi-clique enumeration in the resulting similarity graph, at multiple levels of the hierarchy as prescribed by different similarity thresholds. We cast execution of the underlying algorithmic steps as applications of the map-reduce framework to achieve a cloud-based implementation. Apart from solving an important problem in metagenomics, this work demonstrates the applicability of the map-reduce framework in relatively complicated algorithmic settings.
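
The abstract above does not spell out its sketching scheme, so the following is only a generic illustration of the idea, assuming a bottom-k min-hash sketch over k-mers as used in document similarity estimation. The function names, the choice of MD5 as hash, and the parameters `k` and `size` are all hypothetical. Two sequences are compared through small fixed-size sketches instead of an alignment.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(seq, k=4, size=8):
    """Bottom-k sketch: the `size` smallest hash values over the k-mers of seq."""
    hashes = sorted(
        int(hashlib.md5(kmer.encode()).hexdigest(), 16) for kmer in kmers(seq, k)
    )
    return set(hashes[:size])


def sketch_similarity(a, b, size=8):
    """Estimate the Jaccard similarity of two k-mer sets from their sketches."""
    # The smallest `size` hashes of the union form a sketch of the union.
    union_bottom = set(sorted(a | b)[:size])
    if not union_bottom:
        return 0.0
    return len(union_bottom & a & b) / len(union_bottom)


read_1 = "ACGTACGTTGCA"
read_2 = "ACGTACGTTGCA"
read_3 = "GGGGGGGGGGGG"
print(sketch_similarity(sketch(read_1), sketch(read_2)))  # prints 1.0
print(sketch_similarity(sketch(read_1), sketch(read_3)))  # prints 0.0
```

Pairs whose estimated similarity clears a chosen threshold would become edges of a similarity graph, which is the structure the abstract then searches for quasi-cliques.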


European Conference on Parallel Processing | 2004

Cache-based parallelization of multiple sequence alignment problem

Gilles Parmentier; Denis Trystram; Jaroslaw Zola

In this paper we present a new approach to the problem of parallel multiple sequence alignment. The proposed method is based on the application of a caching technique and aims to solve, with high precision, large alignment instances on heterogeneous clusters. The cache is used to store partial alignment guiding trees, which can be reused in future computations, and is applied to eliminate redundant computations in a parallel environment. We describe an implementation based on the CaLi library, software designed for implementing caches. We report preliminary experimental results and, finally, propose some extensions of our method.


Journal of Parallel and Distributed Computing | 2013

Parallel globally optimal structure learning of Bayesian networks

Olga Nikolova; Jaroslaw Zola; Srinivas Aluru

Given $n$ random variables and a set of $m$ observations of each of the $n$ variables, the Bayesian network structure learning problem is to learn a directed acyclic graph (DAG) on the $n$ variables such that the implied joint probability distribution best explains the set of observations. Bayesian networks are widely used in many fields including data mining and computational biology. Globally optimal (exact) structure learning of Bayesian networks takes $O(n^2 \cdot 2^n)$ time plus the cost of $O(n \cdot 2^n)$ evaluations of an application-specific scoring function whose run-time is at least linear in $m$. In this paper, we present a parallel algorithm for exact structure learning of a Bayesian network that is communication-efficient and work-optimal up to $O(\frac{1}{n} \cdot 2^n)$ processors. We further extend this algorithm to the important restricted case of structure learning with bounded node in-degree and investigate the performance gains achievable because of limiting node in-degree. We demonstrate the applicability of our method by implementation on an IBM Blue Gene/P system and an AMD Opteron InfiniBand cluster and present experimental results that characterize run-time behavior with respect to the number of variables, number of observations, and the bound on in-degree.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2008

Parallel information theory based construction of gene regulatory networks

Jaroslaw Zola; Maneesha Aluru; Srinivas Aluru

We present a parallel method for construction of gene regulatory networks from large-scale gene expression data. Our method integrates mutual information, data processing inequality and statistical testing to detect significant dependencies between genes, and efficiently exploits parallelism inherent in such computations. We present a novel method to carry out permutation testing for assessing statistical significance while reducing its computational complexity by a factor of $\Theta(n^2)$, where $n$ is the number of genes. Using both synthetic and known regulatory networks, we show that our method produces networks of quality similar to ARACNE, a widely used mutual information based method. We present a parallelization of the algorithm that, for the first time, allows construction of whole genome networks from thousands of microarray experiments using rigorous mutual information based methodology. We report the construction of a 15,147 gene network of the plant Arabidopsis thaliana from 2,996 microarray experiments on a 2,048-CPU Blue Gene/L in 45 minutes, thus addressing a grand challenge problem in the NSF Arabidopsis 2010 initiative.


International Parallel and Distributed Processing Symposium | 2006

Parallel multiple sequence alignment with local phylogeny search by simulated annealing

Jaroslaw Zola; Denis Trystram; Andrei Tchernykh; Carlos A. Brizuela

The problem of multiple sequence alignment is one of the most important problems in computational biology. In this paper we present a new method that simultaneously performs multiple sequence alignment and phylogenetic tree inference for large input data sets. We describe a parallel implementation of our method that utilises a simulated annealing metaheuristic to find locally optimal phylogenetic trees in reasonable time. To validate the method, we perform a set of experiments with synthetic as well as real-life data.

Collaboration


Dive into Jaroslaw Zola's collaborations.

Top Co-Authors

Srinivas Aluru

Georgia Institute of Technology

Abhinav Sarje

Lawrence Berkeley National Laboratory

Denis Trystram

Institut Universitaire de France

Olga Wodo

University at Buffalo
