Aika Terada | Researchain

Publication

Featured research published by Aika Terada.

Proceedings of the National Academy of Sciences of the United States of America | 2013

Statistical significance of combinatorial regulations

Aika Terada; Mariko Okada-Hatakeyama; Koji Tsuda; Jun Sese

More than three transcription factors often work together to enable cells to respond to various signals. The detection of combinatorial regulation by multiple transcription factors, however, is not only computationally nontrivial but also extremely unlikely because of multiple testing correction. The exponential growth in the number of tests forces us to set a strict limit on the maximum arity. Here, we propose an efficient branch-and-bound algorithm called the “limitless arity multiple-testing procedure” (LAMP) to count the exact number of testable combinations and calibrate the Bonferroni factor to the smallest possible value. LAMP lists significant combinations without any limit on arity, while the family-wise error rate is rigorously controlled under the threshold. In the human breast cancer transcriptome, LAMP discovered statistically significant combinations of as many as eight binding motifs. This method may contribute to uncovering pathways regulated in a coordinated fashion and finding hidden associations in heterogeneous data.
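The idea of counting only "testable" combinations can be illustrated with a small sketch. It is not the LAMP implementation: it assumes a one-sided Fisher exact test, enumerates every combination by brute force instead of branch-and-bound, and uses a Tarone-style rule (the smallest k such that at most k combinations can ever reach a P-value of alpha/k). The function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import comb

def min_pvalue(support, n_pos, n_total):
    """Smallest one-sided Fisher exact P-value a combination with this
    support could reach, i.e. when as many occurrences as possible
    fall in the positive class."""
    a = min(support, n_pos)
    return comb(n_pos, a) * comb(n_total - n_pos, support - a) / comb(n_total, support)

def calibrate(transactions, labels, alpha=0.05):
    """Return (correction factor k, number of all combinations).
    Brute-force enumeration: only suitable for tiny examples."""
    items = sorted(set().union(*transactions))
    n_total, n_pos = len(labels), sum(labels)
    bounds = []
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            support = sum(1 for t in transactions if set(combo) <= t)
            bounds.append(min_pvalue(support, n_pos, n_total))
    k = 1
    # Increase k until at most k combinations remain testable at alpha / k.
    while sum(b <= alpha / k for b in bounds) > k:
        k += 1
    return k, len(bounds)

# Hypothetical toy data: motifs present in each gene, and a binary label.
toy_motifs = [{"A", "B"}, {"A", "B"}, {"A", "B", "C"}, {"A"},
              {"C"}, set(), {"B"}, set()]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
factor, n_combinations = calibrate(toy_motifs, toy_labels)
```

On this toy input the calibrated factor is smaller than the plain Bonferroni factor (the total number of combinations), because combinations that occur too rarely can never become significant and need not be counted.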

Molecular Ecology | 2017

Plant adaptive radiation mediated by polyploid plasticity in transcriptomes.

Rie Shimizu-Inatsugi; Aika Terada; Kyosuke Hirose; Hiroshi Kudoh; Jun Sese; Kentaro K. Shimizu

The habitats of polyploid species are generally distinct from their parental species. Stebbins described polyploids as ‘general purpose genotypes’, which can tolerate a wide range of environmental conditions. However, little is known about its molecular basis because of the complexity of polyploid genomes. We hypothesized that allopolyploid species might utilize the expression patterns of both parents depending on environments (polyploid plasticity hypothesis). We focused on hydrological niche segregation along fine‐scale soil moisture and waterlogging gradients. Two diploid species, Cardamine amara and Cardamine hirsuta, grew best in submerged and unsubmerged conditions, respectively, consistent with their natural habitats. Interestingly, the allotetraploid Cardamine flexuosa derived from them grew similarly in fluctuating as well as submerged and unsubmerged conditions, consistent with its wide environmental tolerance. A similar pattern was found in another species trio: allotetraploid Cardamine scutata and its parents. Using the close relatedness of Cardamine and Arabidopsis, we quantified genomewide expression patterns following dry and wet treatments using an Arabidopsis microarray. Hierarchical clustering analysis revealed that the expression pattern of C. flexuosa clustered with C. hirsuta in the dry condition and with C. amara in the wet condition, supporting our hypothesis. Furthermore, the induction levels of most genes in the allopolyploid were lower than in a specialist diploid species. This reflects a disadvantage of being allopolyploid arising from fixed heterozygosity. We propose that recurrent allopolyploid speciation along soil moisture and waterlogging gradients confers niche differentiation and reproductive isolation simultaneously and serves as a model for studying the molecular basis of ecological speciation and adaptive radiation.

Bioinformatics and Biomedicine | 2013

Fast Westfall-Young permutation procedure for combinatorial regulation discovery

Aika Terada; Koji Tsuda; Jun Sese

Three or more transcription factors (TFs) often work together, and the combinatorial regulations are essential in cellular machinery. However, it is impossible to discover statistically significant sets of TF binding motifs due to the necessity of the multiple testing procedure. To improve on the sensitivity of the widely used Bonferroni correction and its modified methods, such as the Holm procedure, the Westfall-Young permutation procedure (WY-procedure) has often been applied. However, few studies have used the WY-procedure to discover the combinatorial effects of the motifs because of its extremely long computational time. In this paper, we propose an efficient branch-and-bound algorithm to perform the WY-procedure to enumerate statistically significant motif combinations. When the WY-procedure is used for combinatorial regulation discovery, finding the minimum P-value from each permuted dataset consumes an enormous amount of time. We show that a combination that has the possibility to achieve the minimum P-value appears in the dataset with a frequency above a threshold. This property enables a frequent itemset mining algorithm to efficiently select the candidates to achieve the minimum P-value. Our demonstrations using yeast and human transcriptome datasets show that the proposed algorithm is orders of magnitude faster than the WY-procedure, and can practically list statistically significant motif combinations even when all combinations are considered.
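For orientation, the plain (unaccelerated) Westfall-Young min-P step that the abstract describes as the bottleneck can be sketched as follows. This is a naive baseline under stated assumptions, not the proposed branch-and-bound algorithm: it assumes binary features, a one-sided Fisher exact test, and hypothetical function names, and it recomputes every P-value in every permutation.

```python
import random
from math import comb

def fisher_upper_p(a, support, n_pos, n_total):
    """One-sided Fisher exact P-value P(X >= a) for a hypergeometric X."""
    hi = min(support, n_pos)
    tail = sum(comb(n_pos, i) * comb(n_total - n_pos, support - i)
               for i in range(a, hi + 1))
    return tail / comb(n_total, support)

def wy_threshold(columns, labels, alpha=0.05, n_perm=1000, seed=0):
    """Naive Westfall-Young calibration.
    columns: one 0/1 list per hypothesis (e.g. per motif combination).
    Returns a value d such that rejecting hypotheses with P-value
    strictly below d keeps the permutation-estimated FWER at most alpha."""
    rng = random.Random(seed)
    n_total, n_pos = len(labels), sum(labels)
    permuted = list(labels)
    min_ps = []
    for _ in range(n_perm):
        rng.shuffle(permuted)  # break any association with the labels
        best = 1.0
        for col in columns:
            a = sum(c * y for c, y in zip(col, permuted))
            best = min(best, fisher_upper_p(a, sum(col), n_pos, n_total))
        min_ps.append(best)    # minimum P-value of this permuted dataset
    min_ps.sort()
    return min_ps[int(alpha * n_perm)]
```

The inner loop over all hypotheses is what becomes infeasible when every motif combination is a hypothesis, which is the cost the abstract's frequency-based candidate selection is meant to avoid.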

2012 16th International Conference on Information Visualisation | 2012

Integrated Visualization of Gene Network and Ontology Applying a Hierarchical Graph Visualization Technique

Rina Nakazawa; Takayuki Itoh; Jun Sese; Aika Terada

A gene network is constructed with genes as nodes and interactions between genes as edges, so as to reveal unknown gene functions and relationships. However, the nodes and edges of gene networks are usually very numerous. As a result, it may be difficult to understand the relations between genomic functions and gene-gene interactions if the network is visualized by traditional techniques. This paper presents our technique for visualizing gene networks together with the Gene Ontology (GO), which summarizes gene functions and attributes. The technique represents the functions defined by GO as node colors, and bundles edges according to gene function to reduce the visual complexity of the network.

Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2016

Significant Pattern Mining with Confounding Variables

Aika Terada; David A. duVerle; Koji Tsuda

Recent pattern mining algorithms such as LAMP allow us to compute the statistical significance of patterns with respect to an outcome variable. Their p-values are adjusted to control the family-wise error rate, which is the probability of at least one false discovery occurring. However, they are a poor fit for medical applications, due to their inability to handle potential confounding variables such as age or gender. We propose a novel pattern mining algorithm that evaluates statistical significance under confounding variables. Using a new testability bound based on the exact logistic regression model, the algorithm can exclude a large number of combinations without testing them, limiting the amount of correction required for multiple testing. Using synthetic data, we showed that our method could remove the bias introduced by confounding variables while still detecting true patterns correlated with the class. In addition, we demonstrated an application to data integration using a confounding variable.

Bioinformatics | 2016

LAMPLINK: detection of statistically significant SNP combinations from GWAS data

Aika Terada; Ryo Yamada; Koji Tsuda; Jun Sese

Summary: One of the major issues in genome-wide association studies is to solve the missing heritability problem. While considering epistatic interactions among multiple SNPs may contribute to solving this problem, existing software cannot detect statistically significant high-order interactions. We propose software named LAMPLINK, which employs a cutting-edge method to enumerate statistically significant SNP combinations from genome-wide case–control data. LAMPLINK is implemented as a set of additional functions to PLINK, and hence existing procedures with PLINK remain applicable. Applied to the 1000 Genomes Project data, LAMPLINK detected a combination of five SNPs that are statistically significantly accumulated in the Japanese population. Availability and Implementation: LAMPLINK is available at http://a-terada.github.io/lamplink/. Contact: [email protected] or [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.

International Conference on Bioinformatics | 2015

High-speed Westfall-Young permutation procedure for genome-wide association studies

Aika Terada; Hanyoung Kim; Jun Sese

Genome-wide association studies (GWASs) are widely used to investigate statistically significant associations between diseases and single nucleotide polymorphisms (SNPs) to identify causal factors of diseases. In recent GWASs, the statistical significance of more than one million SNPs has been assessed, but in many cases no associations are found because of the application of conservative multiple testing corrections, such as the Bonferroni correction. While more sensitive methods, such as the Westfall-Young permutation procedure (WY), would relate more SNPs with diseases, their extremely long computational time has prohibited the application of WY to GWAS. We introduce an algorithm to accelerate WY, named the High-speed Westfall-Young permutation procedure (HWY). HWY utilizes three techniques to make WY computationally practical. First, P-value calculations for SNPs that cannot affect the adjusted significance level are pruned. Second, a lookup table of P-values is used to avoid frequent duplicate calculations. Finally, computations are parallelized using a GPGPU. HWY was 619 times faster than WY and more than 122 times faster than PLINK, a widely used GWAS software package, and analyzed a dataset containing one million SNPs and one thousand individuals in approximately two hours. Re-analysis of existing GWAS datasets with HWY may uncover additional hidden SNP-trait associations.
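The lookup-table technique can be illustrated with a short sketch. It assumes a one-sided Fisher exact test on binary carrier status, which may differ from the test used in the paper, and the cohort sizes and function names are hypothetical. With the numbers of cases and individuals fixed, the P-value depends only on two counts, so repeated tables across SNPs and permutations can be answered from a cache.

```python
from functools import lru_cache
from math import comb

N_TOTAL = 1000  # hypothetical number of individuals
N_CASES = 400   # hypothetical number of cases

@lru_cache(maxsize=None)
def upper_p(carrier_cases, carriers):
    """One-sided Fisher exact P-value for a SNP carried by `carriers`
    individuals, `carrier_cases` of whom are cases. Results are
    memoized, so each distinct pair of counts is computed only once."""
    hi = min(carriers, N_CASES)
    tail = sum(comb(N_CASES, i) * comb(N_TOTAL - N_CASES, carriers - i)
               for i in range(carrier_cases, hi + 1))
    return tail / comb(N_TOTAL, carriers)

def min_p_over_snps(tables):
    """Minimum P-value over SNPs, given (carrier_cases, carriers) pairs."""
    return min(upper_p(a, x) for a, x in tables)
```

Calling `upper_p.cache_info()` after a run shows how many evaluations were served from the table rather than recomputed.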

BMC Medical Genomics | 2018

Identifying statistically significant combinatorial markers for survival analysis

Raissa Relator; Aika Terada; Jun Sese

Background: Survival analysis methods have been widely applied in different areas of health and medicine, spanning varying events of interest and target diseases. They can be utilized to provide relationships between the survival time of individuals and factors of interest, rendering them useful in searching for biomarkers in diseases such as cancer. However, some disease progression can be very unpredictable because the conventional approaches have failed to consider multiple-marker interactions. An exponential increase in the number of candidate markers requires a large correction factor in the multiple-testing correction and hides the significance. Methods: We address the issue of testing marker combinations that affect survival by adapting the recently developed Limitless Arity Multiple-testing Procedure (LAMP), a p-value correction technique for statistical tests for combinations of markers. LAMP cannot handle survival data statistics, and hence we extended LAMP for the log-rank test, making it more appropriate for clinical data, with a newly introduced theoretical lower bound of the p-value. Results: We applied the proposed method to gene combination detection for cancer and obtained gene interactions with statistically significant log-rank p-values. Gene combinations with orders of up to 32 genes were detected by our algorithm, and the effects of some genes in these combinations are also supported by existing literature. Conclusion: The novel approach for detecting prognostic markers presented here can identify statistically significant markers with no limitations on the order of interaction. Furthermore, it can be applied to different types of genomic data, provided that binarization is possible.
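As background for the test being extended, a standard two-group log-rank test can be sketched as below. This shows only the textbook statistic for one binarized marker (or marker combination); the paper's lower bound on the p-value and its integration with LAMP are not reproduced here, and the function and argument names are hypothetical.

```python
from math import erfc, sqrt

def logrank_test(times, events, groups):
    """Two-group log-rank test.
    times:  observed follow-up times
    events: 1 if the event was observed, 0 if censored
    groups: 1 if the sample carries the marker (combination), else 0
    Returns (chi-square statistic with 1 d.f., P-value)."""
    observed = expected = variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [(ti, e, g) for ti, e, g in zip(times, events, groups) if ti >= t]
        n = len(at_risk)                                   # at risk, both groups
        n1 = sum(g for _, _, g in at_risk)                 # at risk, marker group
        d = sum(1 for ti, e, _ in at_risk if ti == t and e)   # events at t
        d1 = sum(g for ti, e, g in at_risk if ti == t and e)  # events at t, marker group
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    stat = (observed - expected) ** 2 / variance
    # Survival function of the chi-square distribution with 1 d.f.
    return stat, erfc(sqrt(stat / 2))
```

Because the test only needs a 0/1 group label per sample, any marker combination that can be binarized (for example, "all genes in the set are highly expressed") can be passed in as `groups`.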

Archive | 2018

Multiple Testing Tool to Detect Combinatorial Effects in Biology

Aika Terada; Koji Tsuda

Detecting combinatorial effects is important to various research areas, including biology, genomics, and medical sciences. However, this task is not only computationally nontrivial but also extremely difficult to achieve because of the necessity of a multiple testing procedure; hence few methods can comprehensively analyze high-order combinations. Recently, the Limitless Arity Multiple-testing Procedure (LAMP) was introduced, allowing us to enumerate statistically significant combinations from a given dataset. This chapter provides instructions for LAMP using simple examples of combinatorial transcription factor regulation discovery and visualization of the results. This chapter also introduces LAMPLINK, an extension of LAMP. LAMPLINK can handle genetic datasets to detect statistically significant interactions among multiple SNPs from a genome-wide association study (GWAS) dataset.

Bioinformatics | 2018

MP-LAMP: parallel detection of statistically significant multi-loci markers on cloud platforms

Kazuki Yoshizoe; Aika Terada; Koji Tsuda

Summary: Exhaustive detection of multi-loci markers from genome-wide association study datasets is a computationally challenging problem. This paper presents a massively parallel algorithm for finding all significant combinations of alleles and introduces a software tool termed MP-LAMP that can be easily deployed on a cloud platform, such as Amazon Web Services, as well as on an in-house computer cluster. Multi-loci marker detection is an unbalanced tree search problem that cannot be parallelized by simple tree-splitting using generic parallel programming frameworks, such as Map-Reduce. We employ work stealing and periodic reduce-broadcast to decrease the running time almost linearly in the number of cores. Availability and implementation: MP-LAMP is available at https://github.com/tsudalab/mp-lamp. Supplementary information: Supplementary data are available at Bioinformatics online.