Md. Altaf-Ul-Amin
Nara Institute of Science and Technology
Publication
Featured research published by Md. Altaf-Ul-Amin.
Nucleic Acids Research | 2011
Kensuke Nakamura; Taku Oshima; Takuya Morimoto; Shun Ikeda; Hirofumi Yoshikawa; Yuh Shiwa; Shu Ishikawa; Margaret C. Linak; Aki Hirai; Hiroki Takahashi; Md. Altaf-Ul-Amin; Naotake Ogasawara; Shigehiko Kanaya
We identified the sequence-specific starting positions of consecutive miscalls in the mapping of reads obtained from the Illumina Genome Analyser (GA). Detailed analysis of the miscall pattern indicated that the underlying mechanism involves sequence-specific interference of the base elongation process during sequencing. The two major sequence patterns that trigger this sequence-specific error (SSE) are: (i) inverted repeats and (ii) GGC sequences. We speculate that these sequences favor dephasing by inhibiting single-base elongation, by: (i) folding single-stranded DNA and (ii) altering enzyme preference. This phenomenon is a major cause of sequence coverage variability and of the unfavorable bias observed for population-targeted methods such as RNA-seq and ChIP-seq. Moreover, SSE is a potential cause of false single-nucleotide polymorphism (SNP) calls and also significantly hinders de novo assembly. This article highlights the importance of recognizing SSE and its underlying mechanisms in the hope of enhancing the potential usefulness of the Illumina sequencers.
Plant and Cell Physiology | 2012
Farit Mochamad Afendi; Taketo Okada; Mami Yamazaki; Aki Hirai-Morita; Yukiko Nakamura; Kensuke Nakamura; Shun Ikeda; Hiroki Takahashi; Md. Altaf-Ul-Amin; Latifah Kosim Darusman; Kazuki Saito; Shigehiko Kanaya
A database (DB) describing the relationships between species and their metabolites would be useful for metabolomics research, because it targets systematic analysis of enormous numbers of organic compounds with known or unknown structures in metabolomics. We constructed an extensive species-metabolite DB for plants, the KNApSAcK Core DB, which contains 101,500 species-metabolite relationships encompassing 20,741 species and 50,048 metabolites. We also developed a search engine within the KNApSAcK Core DB for use in metabolomics research, making it possible to search for metabolites based on an accurate mass, molecular formula, metabolite name or mass spectra in several ionization modes. We also have developed databases for retrieving metabolites related to plants used for a range of purposes. In our multifaceted plant usage DB, medicinal/edible plants are related to the geographic zones (GZs) where the plants are used, their biological activities, and formulae of Japanese and Indonesian traditional medicines (Kampo and Jamu, respectively). These data are connected to the species-metabolites relationship DB within the KNApSAcK Core DB, keyed via the species names. All databases can be accessed via the website http://kanaya.naist.jp/KNApSAcK_Family/. KNApSAcK WorldMap DB comprises 41,548 GZ-plant pair entries, including 222 GZs and 15,240 medicinal/edible plants. The KAMPO DB consists of 336 formulae encompassing 278 medicinal plants; the JAMU DB consists of 5,310 formulae encompassing 550 medicinal plants. The Biological Activity DB consists of 2,418 biological activities and 33,706 pairwise relationships between medicinal plants and their biological activities. Current statistics of the binary relationships between individual databases were characterized by the degree distribution analysis, leading to a prediction of at least 1,060,000 metabolites within all plants. 
In the future, the study of metabolomics will need to take this huge number of metabolites into consideration.
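The accurate-mass search mentioned in the abstract above can be illustrated with a minimal sketch. It is not the KNApSAcK search engine or its API: the three-entry table, the function names `monoisotopic_mass` and `search_by_mass`, and the ppm tolerance are all assumptions made for illustration, and only neutral monoisotopic masses are considered (no ionization-mode adducts).

```python
import re

# Monoisotopic masses of the most abundant isotopes (u).
ATOM_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Hypothetical miniature metabolite table (name -> molecular formula);
# this is NOT data exported from the KNApSAcK Core DB.
METABOLITES = {
    "glucose": "C6H12O6",
    "caffeine": "C8H10N4O2",
    "quercetin": "C15H10O7",
}

def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C6H12O6'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOM_MASS[element] * (int(count) if count else 1)
    return total

def search_by_mass(query_mass, tol_ppm=10.0):
    """Return names whose mass lies within tol_ppm of query_mass."""
    hits = []
    for name, formula in METABOLITES.items():
        mass = monoisotopic_mass(formula)
        if abs(mass - query_mass) / mass * 1e6 <= tol_ppm:
            hits.append(name)
    return sorted(hits)
```

Calling `search_by_mass(180.0634)` would compare the query against each computed mass and keep the entries within 10 ppm. A real search would also have to account for adducts and charge in each ionization mode.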
Archive | 2006
Y. Shinbo; Yukiko Nakamura; Md. Altaf-Ul-Amin; Hiroko Asahi; Ken Kurokawa; Masanori Arita; Kazuki Saito; Daisaku Ohta; D. Shibata; Shigehiko Kanaya
We prepared a database, KNApSAcK, for the accumulation and search of metabolite-species relationships. The power-law distribution observed in the present study is likely to be associated with research activity for finding novel metabolites from nature. In addition, it seems to be derived from searching for rare metabolites in organisms that originally exhibit a power law in the degree distribution of their metabolic networks. This suggests that the database contains the chemical diversity of metabolites that arose through the evolution of species. Graph clustering is shown to be useful for extracting taxonomic relationships on the basis of common metabolites. As we are continuously accumulating metabolite-species pairs in the database, we continue to advance our understanding of species-metabolite relations in the taxonomic hierarchy. Furthermore, we plan to add an option for searching metabolite structures by entering partial structures, which will be helpful for metabolite research.
Analytical and Bioanalytical Chemistry | 2008
Hiroki Takahashi; Kosuke Kai; Yoko Shinbo; Kenichi Tanaka; Daisaku Ohta; Taku Oshima; Md. Altaf-Ul-Amin; Ken Kurokawa; Naotake Ogasawara; Shigehiko Kanaya
Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR/MS) is the best MS technology for obtaining exact mass measurements owing to its great resolution and accuracy, and several outstanding FT-ICR/MS-based metabolomics approaches have been reported. A reliable annotation scheme is needed to deal with direct-infusion FT-ICR/MS metabolic profiling. Correlation analyses can help us not only uncover relations between the ions but also annotate the ions originated from identical metabolites (metabolite derivative ions). In the present study, we propose a procedure for metabolite annotation on direct-infusion FT-ICR/MS by taking into consideration the classification of metabolite-derived ions using correlation analyses. Integrated analysis based on information of isotope relations, fragmentation patterns by MS/MS analysis, co-occurring metabolites, and database searches (KNApSAcK and KEGG) can make it possible to annotate ions as metabolites and estimate cellular conditions based on metabolite composition. A total of 220 detected ions were classified into 174 metabolite derivative groups and 72 ions were assigned to candidate metabolites in the present work. Finally, metabolic profiling has been able to distinguish between the growth stages with the aid of PCA. The constructed model using PLS regression for OD600 values as a function of metabolic profiles is very useful for identifying to what degree the ions contribute to the growth stages. Ten phospholipids which largely influence the constructed model are highly abundant in the cells. Our analyses reveal that global modification of those phospholipids occurs as E. coli enters the stationary phase. Thus, the integrated approach involving correlation analyses, metabolic profiling, and database searching is efficient for high-throughput metabolomics.
Current Computer-Aided Drug Design | 2010
Taketo Okada; Farit Mochamad Afendi; Md. Altaf-Ul-Amin; Hiroki Takahashi; Kensuke Nakamura; Shigehiko Kanaya
Metabolomics, the comprehensive and global analysis of diverse metabolites produced in cells and organisms, has greatly expanded metabolite fingerprinting and profiling as well as the selection and identification of marker metabolites. The methodology typically employs multivariate analysis to statistically process the massive amount of analytical chemistry data resulting from high-throughput and simultaneous metabolite analysis. Although the technology of plant metabolomics has mainly developed with other post-genomics in systems biology and functional genomics, it is independently applied to the evaluation of the qualities of medicinal plants, based on the diversity of metabolite fingerprints resulting from multivariate analysis of non-targeted or widely targeted metabolite analysis. One advantage of applying metabolomics is that medicinal plants are evaluated based not only on the limited number of metabolites that are pharmacologically important chemicals, but also on the fingerprints of minor metabolites and bioactive chemicals. In particular, score plot and loading plot analyses e.g. principal component analysis (PCA), partial-least-squares discriminant analysis (PLS-DA), and discrimination map analysis such as batch-learning self-organizing map (BL-SOM) analysis, are often employed for the reduction of a metabolite fingerprint and the classification of analyzed samples. Based on recent studies, we now understand that metabolomics can be an effective approach for comprehensive evaluation of the qualities of medicinal plants. In this review, we describe practical cases in which metabolomic study was performed on medicinal plants, and discuss the utility of metabolomics for this research field, with focus on multivariate analysis.
Plant and Cell Physiology | 2014
Yukiko Nakamura; Farit Mochamad Afendi; Aziza Kawsar Parvin; Naoaki Ono; Ken Tanaka; Aki Hirai Morita; Tetsuo Sato; Tadao Sugiura; Md. Altaf-Ul-Amin; Shigehiko Kanaya
Databases (DBs) are required by various omics fields because the volume of molecular biology data is increasing rapidly. In this study, we provide instructions for users and describe the current status of our metabolite activity DB. To facilitate a comprehensive understanding of the interactions between the metabolites of organisms and the chemical-level contribution of metabolites to human health, we constructed a metabolite activity DB known as the KNApSAcK Metabolite Activity DB. It comprises 9,584 triplet relationships (metabolite-biological activity-target species), including 2,356 metabolites, 140 activity categories, 2,963 specific descriptions of biological activities and 778 target species. Approximately 46% of the activities described in the DB are related to chemical ecology, most of which are attributed to antimicrobial agents and plant growth regulators. The majority of the metabolites with antimicrobial activities are flavonoids and phenylpropanoids. The metabolites with plant growth regulatory effects include plant hormones. Over half of the DB contents are related to human health care and medicine. The five largest groups are toxins, anticancer agents, nervous system agents, cardiovascular agents and non-therapeutic agents, such as flavors and fragrances. The KNApSAcK Metabolite Activity DB is integrated within the KNApSAcK Family DBs to facilitate further systematized research in various omics fields, especially metabolomics, nutrigenomics and foodomics. The KNApSAcK Metabolite Activity DB could also be utilized for developing novel drugs and materials, as well as for identifying viable drug resources and other useful compounds.
BioMed Research International | 2014
Md. Altaf-Ul-Amin; Farit Mochamad Afendi; Samuel Kiboi; Shigehiko Kanaya
Science is going through two rapidly changing phenomena: one is the increasing capabilities of the computers and software tools from terabytes to petabytes and beyond, and the other is the advancement in high-throughput molecular biology producing piles of data related to genomes, transcriptomes, proteomes, metabolomes, interactomes, and so on. Biology has become a data intensive science and as a consequence biology and computer science have become complementary to each other bridged by other branches of science such as statistics, mathematics, physics, and chemistry. The combination of versatile knowledge has caused the advent of big-data biology, network biology, and other new branches of biology. Network biology for instance facilitates the system-level understanding of the cell or cellular components and subprocesses. It is often also referred to as systems biology. The purpose of this field is to understand organisms or cells as a whole at various levels of functions and mechanisms. Systems biology is now facing the challenges of analyzing big molecular biological data and huge biological networks. This review gives an overview of the progress in big-data biology, and data handling and also introduces some applications of networks and multivariate analysis in systems biology.
Gene | 2012
Masayoshi Wada; Hiroki Takahashi; Md. Altaf-Ul-Amin; Kensuke Nakamura; Masami Yokota Hirai; Daisaku Ohta; Shigehiko Kanaya
Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within the immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E < 10^-5) are included in 27 clusters. Five clusters are associated with metabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. 
Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system.
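The comparison of neighboring-gene correlations against randomly selected gene pairs, described in the abstract above, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: the empirical p-value from a sampled null distribution and the Benjamini-Hochberg step are choices made here, and the function names (`pearson`, `benjamini_hochberg`, `coexpressed_neighbors`) and parameters are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def benjamini_hochberg(pvalues, fdr):
    """Return indices of hypotheses rejected at the given FDR level."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= fdr * rank / m:
            cutoff = rank  # largest rank passing the BH criterion
    return sorted(order[:cutoff])

def coexpressed_neighbors(profiles, fdr=0.1, n_random=1000, seed=0):
    """profiles: expression vectors ordered by chromosomal position.

    Returns indices k such that genes k and k+1 are more correlated
    than expected from randomly selected gene pairs (assumed null).
    """
    rng = random.Random(seed)
    n = len(profiles)
    null = []
    for _ in range(n_random):
        i, j = rng.sample(range(n), 2)  # two distinct random genes
        null.append(pearson(profiles[i], profiles[j]))
    pvals = []
    for k in range(n - 1):
        r = pearson(profiles[k], profiles[k + 1])
        exceed = sum(1 for v in null if v >= r)
        pvals.append((exceed + 1) / (n_random + 1))  # empirical p-value
    return benjamini_hochberg(pvals, fdr)
```

Runs of consecutive significant indices returned by `coexpressed_neighbors` could then be merged into candidate clusters. Note that the random pairs sampled for the null may occasionally be neighbors themselves, which a more careful analysis would exclude.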
Computational and Structural Biotechnology Journal | 2013
Farit Mochamad Afendi; Naoaki Ono; Yukiko Nakamura; Kensuke Nakamura; Latifah Kosim Darusman; Nelson Kibinge; Aki Hirai Morita; Ken Tanaka; Hisayuki Horai; Md. Altaf-Ul-Amin; Shigehiko Kanaya
Molecular biological data have rapidly increased with the recent progress of the omics fields, e.g., genomics, transcriptomics, proteomics and metabolomics, which necessitates the development of databases and methods for efficient storage, retrieval, integration and analysis of massive data. The present study reviews the usage of the KNApSAcK Family DB in metabolomics and related areas, discusses several statistical methods for handling multivariate data and shows their application to Indonesian blended herbal medicines (Jamu) as a case study. Exploration using a biplot reveals that many plants are rarely utilized while some plants are highly utilized toward specific efficacy. Furthermore, the ingredients of Jamu formulas are modeled using Partial Least Squares Discriminant Analysis (PLS-DA) in order to predict their efficacy. The plants used in each Jamu medicine served as the predictors, whereas the efficacy of each Jamu provided the responses. This model produces 71.6% correct classification in predicting efficacy. A permutation test is then used to determine the plants that serve as main ingredients in Jamu formulas by evaluating the significance of the PLS-DA coefficients. Next, in order to explain the role of the plants that serve as main ingredients in Jamu medicines, information on the pharmacological activity of the plants is added to the predictor block. Then the N-PLS-DA model, a multiway version of PLS-DA, is utilized to handle the three-dimensional array of the predictor block. The resulting N-PLS-DA model reveals that the effects of some pharmacological activities are specific to certain efficacies while the other activities are spread across many efficacies. The mathematical modeling introduced in the present study can be utilized in the global analysis of big data aimed at revealing the underlying biology.
Asian Test Symposium | 2001
Md. Altaf-Ul-Amin; Satoshi Ohtake; Hideo Fujiwara
Introduces the concept of hierarchical testability of data paths for delay faults. A definition of a hierarchically two-pattern testable (HTPT) data path is developed. Also, a design for testability (DFT) method is presented to augment a data path to an HTPT one. The DFT method incorporates a graph-based analysis of an HTPT data path and makes use of some graph algorithms. The proposed method can provide similar advantages to the enhanced scan approach at the cost of much lower hardware overhead.