Kaname Kojima | Researchain

Publication

Featured researches published by Kaname Kojima.

BMC Bioinformatics | 2007

An efficient grid layout algorithm for biological networks utilizing various biological attributes

Kaname Kojima; Masao Nagasaki; Euna Jeong; Mitsuru Kato; Satoru Miyano

Background: Clearly visualized biopathways provide a great help in understanding biological systems. However, manual drawing of large-scale biopathways is time consuming. We proposed a grid layout algorithm that can handle gene-regulatory networks and signal transduction pathways by considering edge-edge crossing, node-edge crossing, distance measure between nodes, and subcellular localization information from Gene Ontology. Consequently, the layout algorithm succeeded in drastically reducing these crossings in the apoptosis model. However, for larger-scale networks, we encountered three problems: (i) the initial layout is often very far from any local optimum because nodes are initially placed at random, (ii) from a biological viewpoint, human layouts are still easier to understand than automatic layouts because, except for subcellular localization, the algorithm does not fully utilize the biological information of pathways, and (iii) it employs a local search strategy in which the neighborhood is obtained by moving one node at each step, whereas the automatic layouts suggest that simultaneous movements of multiple nodes are necessary for better layouts, although such an extension may worsen the time complexity.

Results: We propose a new grid layout algorithm. To address problem (i), we devised a new force-directed algorithm whose output is suitable as the initial layout. For (ii), we considered that an appropriate alignment of nodes having the same biological attribute is one of the most important factors for comprehension, and we defined a new score function that gives an advantage to such configurations. To solve problem (iii), we developed a search strategy that considers swapping nodes as well as moving a node, while keeping the order of the time complexity. Although a naïve implementation increases the time complexity by one order, we solved this difficulty by devising a method that caches differences between scores of a layout and its possible updates.

Conclusion: Layouts of the new grid layout algorithm are compared with those of the previous algorithm and a human layout in an endothelial cell model, three times as large as the apoptosis model. The total cost of the result from the new grid layout algorithm is similar to that of the human layout. In addition, its convergence time is drastically reduced (40% reduction).
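
The move-or-swap neighborhood described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `layout_cost` and `local_search` are hypothetical, the cost is a toy total Manhattan edge length rather than the paper's score (which also involves crossings, node distances, and biological attributes), and the cost is recomputed from scratch at every step instead of using the cached score differences the authors describe.

```python
import random

def layout_cost(pos, edges):
    """Toy cost (assumption): total Manhattan length of all edges."""
    return sum(abs(pos[u][0] - pos[v][0]) + abs(pos[u][1] - pos[v][1])
               for u, v in edges)

def local_search(nodes, edges, width, height, seed=0):
    """First-improvement local search over 'move one node' and 'swap two nodes'."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(width) for y in range(height)]
    # Random initial placement, one node per grid cell.
    pos = dict(zip(nodes, rng.sample(cells, len(nodes))))
    occupied = {p: n for n, p in pos.items()}
    cost = layout_cost(pos, edges)
    improved = True
    while improved:
        improved = False
        for n in nodes:
            for cell in cells:
                if cell == pos[n]:
                    continue
                old = pos[n]
                other = occupied.get(cell)  # node already on the target cell, if any
                # Tentatively move n; if the cell is taken, this becomes a swap.
                pos[n] = cell
                if other is not None:
                    pos[other] = old
                new_cost = layout_cost(pos, edges)
                if new_cost < cost:
                    # Accept the update and keep the occupancy map in sync.
                    cost = new_cost
                    del occupied[old]
                    occupied[cell] = n
                    if other is not None:
                        occupied[old] = other
                    improved = True
                else:
                    # Reject: restore the previous positions.
                    pos[n] = old
                    if other is not None:
                        pos[other] = cell
    return pos, cost
```

Because the cost strictly decreases on every accepted update, the loop terminates at a local optimum of this toy cost. Recomputing the full cost per candidate is the naïve variant; the caching of score differences mentioned in the abstract is what avoids that extra order of complexity and is not reproduced here.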

Proceedings of the 9th Annual International Workshop on Bioinformatics and Systems Biology (IBSB 2009) | 2010

A STATE SPACE REPRESENTATION OF VAR MODELS WITH SPARSE LEARNING FOR DYNAMIC GENE NETWORKS

Kaname Kojima; Rui Yamaguchi; Seiya Imoto; Mai Yamauchi; Masao Nagasaki; Ryo Yoshida; Teppei Shimamura; Kazuko Ueno; Tomoyuki Higuchi; Noriko Gotoh; Satoru Miyano

We propose a state space representation of the vector autoregressive model and its sparse learning based on L1 regularization to achieve efficient estimation of dynamic gene networks from time course microarray data. The proposed method can overcome drawbacks of the vector autoregressive model and the state space model: the assumption of equal time intervals and the inability to separate observation and system noise in the former, and the assumption of a modular network structure in the latter. However, in a simple implementation, the proposed model requires large matrices to be inverted many times during parameter estimation with the EM algorithm. This limits the applicability of the proposed method to relatively small gene sets. We thus introduce a new calculation technique for the EM algorithm that does not require the calculation of inverse matrices. The proposed method is applied to time course microarray data of lung cells treated by stimulating EGF receptors and dosing an anticancer drug, Gefitinib. By comparing the estimated network with the control network estimated using non-treated lung cells, genes perturbed by the anticancer drug could be found, whose up- and down-stream genes in the estimated networks may be related to side effects of the anticancer drug.
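
As a rough orientation, one common way to write a first-order vector autoregressive model in state space form with an L1 penalty is given below. The symbols (hidden expression state x_t, observation y_t, coefficient matrix A with entries a_ij, noise covariances Q and R, penalty weight λ) are assumptions chosen for illustration; the abstract does not give the paper's exact parameterization, including how unequal time intervals are handled.

```latex
\begin{align}
x_t &= A\,x_{t-1} + v_t, & v_t &\sim \mathcal{N}(0, Q) && \text{(system model)}\\
y_t &= x_t + w_t,        & w_t &\sim \mathcal{N}(0, R) && \text{(observation model)}\\
\hat{A} &= \arg\max_{A}\,\Bigl\{ \log p(y_{1:T} \mid A, Q, R)
          - \lambda \sum_{i,j} \lvert a_{ij} \rvert \Bigr\}
\end{align}
```

In this form the system noise v_t and the observation noise w_t are modeled separately, and the L1 term drives many a_ij to zero, so that the nonzero entries of A can be read as edges of the estimated gene network.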

Journal of Bioinformatics and Computational Biology | 2010

IDENTIFICATION OF GRANGER CAUSALITY BETWEEN GENE SETS

André Fujita; João Ricardo Sato; Kaname Kojima; Luciana R. Gomes; Masao Nagasaki; Mari Cleide Sogayar; Satoru Miyano

Wiener and Granger have introduced an intuitive concept of causality (Granger causality) between two variables which is based on the idea that an effect never occurs before its cause. Later, Geweke generalized this concept to a multivariate Granger causality, i.e. n variables Granger-cause another variable. Although Granger causality is not effective causality in the Aristotelian sense, this concept is useful to infer directionality and information flow in observational data. Granger causality is usually identified by using VAR (Vector Autoregressive) models due to their simplicity. In the last few years, several VAR-based models were presented in order to model gene regulatory networks. Here, we generalize the multivariate Granger causality concept in order to identify Granger causalities between sets of gene expressions, i.e. whether a set of n genes Granger-causes another set of m genes, aiming at identifying the flow of information between gene networks (or pathways). The concept of Granger causality for sets of variables is presented. Moreover, a method for its identification with a bootstrap test is proposed. This method is applied to simulated and also to actual biological gene expression data in order to model regulatory networks. This concept may be useful for the understanding of the complete information flow from one network or pathway to the other, mainly in regulatory networks. Linking this concept to graph theory, sink and source can be generalized to node sets. Moreover, hub and centrality for sets of genes can be defined based on total information flow. Another application is in annotation, when the functionality of a set of genes is unknown, but this set is Granger-caused by another set of genes which is well studied. Therefore, this information may be useful to infer or construct some hypothesis about the unknown set of genes.
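
For readers unfamiliar with the idea, a generic VAR-based way of stating Granger causality from a set of n genes X to a set of m genes Y is sketched below. This is a textbook-style formulation under assumed notation (lag order p, coefficient matrices B_k and C_k, residual ε_t); the abstract does not spell out the authors' own definition or test statistic, which may differ.

```latex
\begin{align}
Y_t &= \sum_{k=1}^{p} \bigl( B_k\, Y_{t-k} + C_k\, X_{t-k} \bigr) + \varepsilon_t,
  \qquad Y_t \in \mathbb{R}^{m},\; X_t \in \mathbb{R}^{n},\\
H_0 &: C_1 = C_2 = \dots = C_p = 0
  \quad \text{(the set } X \text{ does not Granger-cause the set } Y\text{)}.
\end{align}
```

Rejecting H_0 means that past values of X improve the prediction of Y beyond what the past of Y alone provides, which is the sense in which one gene set is said to Granger-cause another.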

Bioinformatics | 2008

Fast grid layout algorithm for biological networks with sweep calculation

Kaname Kojima; Masao Nagasaki; Satoru Miyano

MOTIVATION: Properly drawn biological networks are of great help in the comprehension of their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties such as aligning vertices having the same attributes and complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for applications to pathway databases, due to their high computational cost. In addition, they do not consider edge directions, and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features in biological networks.

RESULTS: We devise a new calculation method termed sweep calculation and reduce the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conduct practical experiments by using 95 pathway models of various sizes from TRANSPATH and show that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduce a new component that penalizes undesirable edge directions to avoid the lack of traceability in pathways due to the differences in direction between in-edges and out-edges of each vertex.

AVAILABILITY: Java implementations of our layout algorithms are available in Cell [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

BMC Bioinformatics | 2009

BFL: a node and edge betweenness based fast layout algorithm for large scale networks

Tatsunori B. Hashimoto; Masao Nagasaki; Kaname Kojima; Satoru Miyano

Background: Network visualization would serve as a useful first step for analysis. However, current graph layout algorithms for biological pathways are insensitive to biologically important information, e.g. subcellular localization, biological node and graph attributes, and/or are not applicable to large-scale networks, e.g. more than 10000 elements.

Results: To overcome these problems, we propose the use of a biologically important graph metric, betweenness, a measure of network flow. This metric is highly correlated with many biological phenomena such as lethality and clusters. We devise a new fast parallel algorithm for calculating betweenness to minimize the preprocessing cost. Using this metric, we also devise a node and edge betweenness based fast layout algorithm (BFL). BFL places the high-betweenness nodes at optimal positions and allows the low-betweenness nodes to reach suboptimal positions. Furthermore, BFL reduces the runtime by combining a sequential insertion algorithm with betweenness. For a graph with n nodes, this approach reduces the expected runtime of the algorithm to O(n^2) when considering edge crossings, and to O(n log n) when considering only density and edge lengths.

Conclusion: Our BFL algorithm is compared against fast graph layout algorithms and approaches requiring intensive optimizations. For gene networks, we show that our algorithm is faster than all layout algorithms tested while providing readability on par with intensive optimization algorithms. We achieve a 1.4 second runtime for a graph with 4000 nodes and 12000 edges on a standard desktop computer.
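
To make the betweenness metric concrete, here is a small sketch that computes node betweenness with Brandes' standard sequential algorithm for unweighted, undirected graphs. It is not the parallel algorithm the authors devised and it omits edge betweenness; the function name `node_betweenness` and the adjacency-dictionary input format are assumptions for illustration.

```python
from collections import deque

def node_betweenness(adj):
    """Brandes' algorithm (sequential). adj maps each node to its neighbors."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:              # w reached for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints in an undirected graph.
    return {v: b / 2 for v, b in bc.items()}
```

For a three-node path a-b-c, `node_betweenness({"a": ["b"], "b": ["a", "c"], "c": ["b"]})` gives 1.0 for b and 0.0 for the endpoints, since b lies on the only shortest path between a and c. A layout method in the spirit of BFL would then give high-betweenness nodes priority when assigning positions.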

Bioinformatics | 2010

A fast and robust statistical test based on likelihood ratio with Bartlett correction to identify Granger causality between gene sets

André Fujita; Kaname Kojima; Alexandre G. Patriota; João Ricardo Sato; Patricia Severino; Satoru Miyano

We propose a likelihood ratio test (LRT) with Bartlett correction in order to identify Granger causality between sets of time series gene expression data. The performance of the proposed test is compared to a previously published bootstrap-based approach. LRT is shown to be significantly faster and statistically powerful even under non-normal distributions. An R package named gGranger containing an implementation for both Granger causality identification tests is also provided.

AVAILABILITY: http://dnagarden.ims.u-tokyo.ac.jp/afujita/en/doku.php?id=ggranger.

BMC Systems Biology | 2012

Functional clustering of time series gene expression data by Granger causality

André Fujita; Patricia Severino; Kaname Kojima; João Ricardo Sato; Alexandre G. Patriota; Satoru Miyano

Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes.

Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence.

Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them.

BMC Genomics | 2012

Identifying regulational alterations in gene regulatory networks by state space representation of vector autoregressive models and variational annealing

Kaname Kojima; Seiya Imoto; Rui Yamaguchi; André Fujita; Mai Yamauchi; Noriko Gotoh; Satoru Miyano

Background: In the analysis of the effects of cell treatments such as drug dosing, identifying changes in gene network structures between normal and treated cells is a key task. A possible way of identifying the changes is to compare structures of networks estimated from data on normal and treated cells separately. However, this approach usually fails to estimate accurate gene networks due to the limited length of time series data and measurement noise. Thus, approaches that identify changes in regulations by using time series data on both conditions in an efficient manner are needed.

Methods: We propose a new statistical approach that is based on the state space representation of the vector autoregressive model and estimates gene networks on two different conditions in order to identify changes in regulations between the conditions. In the mathematical model of our approach, hidden binary variables are newly introduced to indicate the presence of regulations on each condition. The use of the hidden binary variables enables efficient data usage; data on both conditions are used for commonly existing regulations, while for condition-specific regulations only the corresponding data are applied. Also, the similarity of networks on two conditions is automatically considered from the design of the potential function for the hidden binary variables. For the estimation of the hidden binary variables, we derive a new variational annealing method that searches for the configuration of the binary variables maximizing the marginal likelihood.

Results: For the performance evaluation, we use time series data from two topologically similar synthetic networks, and confirm that our proposed approach estimates commonly existing regulations as well as changes in regulations with higher coverage and precision than other existing approaches in almost all the experimental settings. For a real data application, our proposed approach is applied to time series data from normal human lung cells and human lung cells treated by stimulating EGF-receptors and dosing an anticancer drug termed Gefitinib. In the treated lung cells, a cancer cell condition is simulated by the stimulation of EGF-receptors, but the effect would be counteracted due to the selective inhibition of EGF-receptors by Gefitinib. However, gene expression profiles are actually different between the conditions, and the genes related to the identified changes are considered as possible off-targets of Gefitinib.

Conclusions: From the synthetically generated time series data, our proposed approach can identify changes in regulations more accurately than existing methods. By applying the proposed approach to the time series data on normal and treated human lung cells, candidates of off-target genes of Gefitinib are found. According to the published clinical information, one of the genes can be related to a factor of interstitial pneumonia, which is known as a side effect of Gefitinib.

BMC Bioinformatics | 2010

An efficient biological pathway layout algorithm combining grid-layout and spring embedder for complicated cellular location information

Kaname Kojima; Masao Nagasaki; Satoru Miyano

Background: Graph drawing is one of the important techniques for understanding biological regulations in a cell or among cells at the pathway level. Among many available layout algorithms, the spring embedder algorithm is widely used not only for pathway drawing but also for circuit placement, WWW visualization, and other applications because of the harmonized appearance of its results. For pathway drawing, location information is essential for comprehension. However, complex shapes need to be taken into account when torus-shaped location information such as nuclear inner membrane, nuclear outer membrane, and plasma membrane is considered. Unfortunately, the spring embedder algorithm cannot easily handle such information. In addition, crossings between edges and nodes are usually not considered explicitly.

Results: We proposed a new grid-layout algorithm based on the spring embedder algorithm that can handle location information and provide layouts with harmonized appearance. In grid-layout algorithms, the mapping of nodes to grid points that minimizes a cost function is searched. By imposing positional constraints on grid points, location information including complex shapes can be easily considered. Our layout algorithm includes the spring embedder cost as a component of the cost function. We further extend the layout algorithm to enable dynamic update of the positions and sizes of compartments at each step.

Conclusions: The new spring embedder-based grid-layout algorithm and a spring embedder algorithm are applied to three biological pathways: endothelial cell model, Fas-induced apoptosis model, and C. elegans cell fate simulation model. From the positional constraints, all the results of our algorithm satisfy location information, and hence, more comprehensible layouts are obtained as compared to the spring embedder algorithm. From the comparison of the number of crossings, the results of the grid-layout-based algorithm tend to contain more crossings than those of the spring embedder algorithm due to the positional constraints. For a fair comparison, we also apply our proposed method without positional constraints. This comparison shows that these results contain fewer crossings than those of the spring embedder algorithm. We also compared layouts of the proposed algorithm with and without compartment update and verified that the latter can reach better local optima.

Bioinformatics and Bioengineering | 2010

Identifying Hidden Confounders in Gene Networks by Bayesian Networks

Tomoya Higashigaki; Kaname Kojima; Rui Yamaguchi; Masato Inoue; Seiya Imoto; Satoru Miyano

In the estimation of gene networks from microarray gene expression data, we propose a statistical method for quantification of the hidden confounders in gene networks, which were possibly removed from the set of genes on the gene networks or are novel biological elements that are not measured by microarrays. Due to the high computational cost of the structural learning of Bayesian networks and the limited amount of microarray data, it is usual to perform gene selection prior to the estimation of gene networks. Therefore, there exist missing genes that decrease the accuracy and interpretability of the estimated gene networks. The proposed method can identify hidden confounders based on the conflicts of the estimated local Bayesian network structures and estimate their ideal profiles based on the proposed Bayesian networks with hidden variables using an EM algorithm. From the estimated ideal profiles, we can identify genes that are missing in the network, or suggest the existence of novel biological elements if the ideal profiles are not significantly correlated with any expression profiles of genes. To the best of our knowledge, this research is the first study to theoretically characterize missing genes in gene networks and practically utilize this information to refine network estimation.