Yukuto Sato
Tohoku University
Publication
Featured research published by Yukuto Sato.
Nature Communications | 2015
Masao Nagasaki; Jun Yasuda; Fumiki Katsuoka; Naoki Nariai; Kaname Kojima; Yosuke Kawai; Yumi Yamaguchi-Kabata; Junji Yokozawa; Inaho Danjoh; Sakae Saito; Yukuto Sato; Takahiro Mimori; Kaoru Tsuda; Rumiko Saito; Xiaoqing Pan; Satoshi Nishikawa; Shin Ito; Yoko Kuroki; Osamu Tanabe; Nobuo Fuse; Shinichi Kuriyama; Hideyasu Kiyomoto; Atsushi Hozawa; Naoko Minegishi; James Douglas Engel; Kengo Kinoshita; Shigeo Kure; Nobuo Yaegashi; Akito Tsuboi; Fuji Nagami
The Tohoku Medical Megabank Organization reports the whole-genome sequences of 1,070 healthy Japanese individuals and construction of a Japanese population reference panel (1KJPN). Here we identify through this high-coverage sequencing (32.4 × on average), 21.2 million, including 12 million novel, single-nucleotide variants (SNVs) at an estimated false discovery rate of <1.0%. This detailed analysis detected signatures for purifying selection on regulatory elements as well as coding regions. We also catalogue structural variants, including 3.4 million insertions and deletions, and 25,923 genic copy-number variants. The 1KJPN was effective for imputing genotypes of the Japanese population genome wide. These data demonstrate the value of high-coverage sequencing for constructing population-specific variant panels, which covers 99.0% SNVs of minor allele frequency ≥0.1%, and its value for identifying causal rare variants of complex human disease phenotypes in genetic association studies.
Royal Society Open Science | 2015
Masaki Miya; Yukuto Sato; Tsukasa Fukunaga; Tetsuya Sado; J. Y. Poulsen; Kodai Sato; Toshifumi Minamoto; Satoshi Yamamoto; Hiroki Yamanaka; Hitoshi Araki; Michio Kondoh; Wataru Iwasaki
We developed a set of universal PCR primers (MiFish-U/E) for metabarcoding environmental DNA (eDNA) from fishes. Primers were designed using aligned whole mitochondrial genome (mitogenome) sequences from 880 species, supplemented by partial mitogenome sequences from 160 elasmobranchs (sharks and rays). The primers target a hypervariable region of the 12S rRNA gene (163–185 bp), which contains sufficient information to identify fishes to taxonomic family, genus and species except for some closely related congeners. To test versatility of the primers across a diverse range of fishes, we sampled eDNA from four tanks in the Okinawa Churaumi Aquarium with known species compositions, prepared dual-indexed libraries and performed paired-end sequencing of the region using high-throughput next-generation sequencing technologies. Out of the 180 marine fish species contained in the four tanks with reference sequences in a custom database, we detected 168 species (93.3%) distributed across 59 families and 123 genera. These fishes are not only taxonomically diverse, ranging from sharks and rays to higher teleosts, but are also greatly varied in their ecology, including both pelagic and benthic species living in shallow coastal to deep waters. We also sampled natural seawaters around coral reefs near the aquarium and detected 93 fish species using this approach. Of the 93 species, 64 were not detected in the four aquarium tanks, rendering the total number of species detected to 232 (from 70 families and 152 genera). The metabarcoding approach presented here is non-invasive, more efficient, more cost-effective and more sensitive than the traditional survey methods. It has the potential to serve as an alternative (or complementary) tool for biodiversity monitoring that revolutionizes natural resource management and ecological studies of fish communities on larger spatial and temporal scales.
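The species tallies quoted in this abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the published work; the variable and function names are illustrative assumptions, and only the counts come from the abstract.

```python
# Counts quoted in the MiFish abstract; all names here are illustrative.
tank_species_with_reference = 180   # tank species that have reference sequences
tank_species_detected = 168         # tank species detected by eDNA metabarcoding
reef_species_detected = 93          # species detected in natural seawater
reef_species_not_in_tanks = 64      # reef species absent from the tank detections


def detection_rate(detected: int, expected: int) -> float:
    """Return the share of expected species that were detected, in percent."""
    return 100.0 * detected / expected


tank_rate = detection_rate(tank_species_detected, tank_species_with_reference)
# Total distinct species: tank detections plus reef species not seen in the tanks.
total_detected = tank_species_detected + reef_species_not_in_tanks
# Reef species that were also among the tank detections.
shared_species = reef_species_detected - reef_species_not_in_tanks

print(f"Tank detection rate: {tank_rate:.1f}%")
print(f"Total species detected: {total_detected}")
print(f"Species detected in both settings: {shared_species}")
```

The total counts each species once: the 64 reef-only species are added to the 168 tank detections, rather than adding all 93 reef species.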
PLOS ONE | 2013
Masaki Miya; Matt Friedman; Takashi P. Satoh; Hirohiko Takeshima; Tetsuya Sado; Wataru Iwasaki; Yusuke Yamanoue; Masanori Nakatani; Kohji Mabuchi; Jun Inoue; Jan Yde Poulsen; Tsukasa Fukunaga; Yukuto Sato; Mutsumi Nishida
Uncertainties surrounding the evolutionary origin of the epipelagic fish family Scombridae (tunas and mackerels) are symptomatic of the difficulties in resolving suprafamilial relationships within Percomorpha, a hyperdiverse teleost radiation that contains approximately 17,000 species placed in 13 ill-defined orders and 269 families. Here we find that scombrids share a common ancestry with 14 families based on (i) bioinformatic analyses using partial mitochondrial and nuclear gene sequences from all percomorphs deposited in GenBank (10,733 sequences) and (ii) subsequent mitogenomic analysis based on 57 species from those targeted 15 families and 67 outgroup taxa. Morphological heterogeneity among these 15 families is so extraordinary that they have been placed in six different perciform suborders. However, members of the 15 families are either coastal or oceanic pelagic in their ecology with diverse modes of life, suggesting that they represent a previously undetected adaptive radiation in the pelagic realm. Time-calibrated phylogenies imply that scombrids originated from a deep-ocean ancestor and began to radiate after the end-Cretaceous when large predatory epipelagic fishes were selective victims of the Cretaceous-Paleogene mass extinction. We name this clade of open-ocean fishes containing Scombridae “Pelagia” in reference to the common habitat preference that links the 15 families.
Environmental Biology of Fishes | 2010
Yukuto Sato; Mutsumi Nishida
Whole-genome duplication (WGD) is believed to be one of the major evolutionary events that shaped the genome organization of vertebrates. Here, we review recent research on vertebrate genome evolution, specifically on WGD and its consequences for gene and genome evolution in teleost fishes. Recent genome analyses confirmed that all vertebrates experienced two rounds of WGD early in their evolution, and that teleosts experienced a subsequent additional third-round (3R)-WGD. The 3R-WGD was estimated to have occurred 320–400 million years ago in a teleost ancestor, but after its divergence from a common ancestor with living non-teleost actinopterygians (Bichir, Sturgeon, Bowfin, and Gar) based on the analyses of teleost-specific duplicate genes. This 3R-WGD was confirmed by synteny analysis and ancestral karyotype inference using the genome sequences of Tetraodon and medaka. Most of the tetrapods, on the other hand, have not experienced an additional WGD; however, they have experienced repeated chromosomal rearrangements throughout the whole genome. Therefore, different types of chromosomal events have characterized the genomes of teleosts and tetrapods, respectively. The 3R-WGD is useful to investigate the consequences of WGD because it is an evolutionarily recent WGD and thus teleost genomes retain many more WGD-derived duplicates and “traces” of their evolution. In addition, the remarkable morphological, physiological, and ecological diversity of teleosts may facilitate understanding of macrophenotypic evolution on the basis of genetic/genomic information. We highlight the teleosts with 3R-WGD as unique models for future studies on ecology and evolution taking advantage of emerging genomics technologies and systems biology environments.
BMC Evolutionary Biology | 2009
Yukuto Sato; Yasuyuki Hashiguchi; Mutsumi Nishida
Background: Recent genomic studies have revealed that a teleost-specific third-round whole-genome duplication (3R-WGD) event occurred in a common ancestor of teleost fishes. However, it is unclear how the genes duplicated in this event were lost or persisted during the diversification of teleosts, and therefore, how many of the duplicated genes contribute to the genetic differences among teleosts. This subject is also important for understanding the process of vertebrate evolution through WGD events. We applied a comparative evolutionary approach to this question by focusing on the genes involved in long-term potentiation, taste and olfactory transduction, and the tricarboxylic acid cycle, based on the whole-genome sequences of four teleosts: zebrafish, medaka, stickleback, and green spotted puffer fish. Results: We applied a state-of-the-art method of maximum-likelihood phylogenetic inference and conserved synteny analyses to each of 130 genes involved in the above biological systems in humans. These analyses identified 116 orthologous gene groups between teleosts and tetrapods, and 45 pairs of 3R-WGD-derived duplicate genes among them. This suggests that more than half [(45×2)/(116+45) = 55.9%] of the loci, probably more than ten thousand genes, present in a common ancestor of the four teleosts were still duplicated after the 3R-WGD. The estimated temporal pattern of gene loss suggested that, after the 3R-WGD, many (71/116) of the duplicated genes were rapidly lost during the initial 75 million years (MY), whereas on average more than half (27.3/45) of the duplicated genes remaining in the ancestor of the four teleosts (45/116) have persisted for about 275 MY. The 3R-WGD-derived duplicates that have persisted for long evolutionary periods had a significantly larger number of interacting partners and longer protein-coding sequences, implying that they tend to be more multifunctional than the singletons after the 3R-WGD. Conclusion: We have shown for the first time the temporal pattern of the gene loss process after the 3R-WGD on the basis of teleost phylogeny and divergence time frameworks. The 3R-WGD-derived duplicates have not undergone constant exponential decay, suggesting that selection favoured the long-term persistence of a subset of duplicates that tend to be multifunctional. On the basis of these results obtained from the analysis of 116 orthologous gene groups, we propose that more than ten thousand 3R-WGD-derived duplicates have experienced lineage-specific evolution, that is, differential sub-/neo-functionalization or secondary loss between lineages, and contributed to teleost diversity.
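The ratios reported in this abstract can be evaluated directly from the counts it gives. This is a minimal sketch for checking the arithmetic only; the variable names are illustrative assumptions and do not come from the paper. Evaluating the bracketed ratio (45×2)/(116+45) gives roughly 55.9%.

```python
# Counts quoted in the abstract; all names here are illustrative.
orthologous_groups = 116     # teleost-tetrapod orthologous gene groups
duplicate_pairs = 45         # groups retaining a 3R-WGD-derived duplicate pair
lost_early = 71              # duplicates lost during the initial ~75 MY
persisting_on_average = 27.3 # average number of pairs persisting for ~275 MY

# Each retained pair contributes two loci; every other group contributes one.
duplicated_loci = 2 * duplicate_pairs
total_loci = orthologous_groups + duplicate_pairs

retained_share = 100.0 * duplicated_loci / total_loci
early_loss_share = 100.0 * lost_early / orthologous_groups
late_persistence_share = 100.0 * persisting_on_average / duplicate_pairs

print(f"Loci still duplicated in the common ancestor: {retained_share:.1f}%")
print(f"Duplicates lost in the initial period: {early_loss_share:.1f}%")
print(f"Remaining duplicates that persisted: {late_persistence_share:.1f}%")
```

The counts are internally consistent: the 71 early losses and the 45 retained pairs together account for all 116 orthologous groups.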
Proceedings of the National Academy of Sciences of the United States of America | 2015
Jun Inoue; Yukuto Sato; Robert Sinclair; Katsumi Tsukamoto; Mutsumi Nishida
Significance: All genes are duplicated by whole-genome duplication (WGD), reverting in number over time, but the actual timing of genome reshaping through gene loss remains poorly understood. We estimated the spatiotemporal loss/persistence pattern of 6,892 gene lineage pairs after the teleost-specific WGD, using careful orthology assignment and a reliable time-calibrated tree. We found that massive gene loss did occur in the first 60 My, mainly due to events involving the simultaneous loss of multiple redundant genes, and the rate of loss then slowed to an approximately constant level for the subsequent 250 My. Similar genomic gene arrangements within teleosts imply that rapid gene loss led to the reshaping of the teleost genomes before their major divergence. Abstract: Whole-genome duplication (WGD) is believed to be a significant source of major evolutionary innovation. Redundant genes resulting from WGD are thought to be lost or acquire new functions. However, the rates of gene loss and thus temporal process of genome reshaping after WGD remain unclear. The WGD shared by all teleost fish, one-half of all jawed vertebrates, was more recent than the two ancient WGDs that occurred before the origin of jawed vertebrates, and thus lends itself to analysis of gene loss and genome reshaping. Using a newly developed orthology identification pipeline, we inferred the post–teleost-specific WGD evolutionary histories of 6,892 protein-coding genes from nine phylogenetically representative teleost genomes on a time-calibrated tree. We found that rapid gene loss did occur in the first 60 My, with a loss of more than 70–80% of duplicated genes, and produced similar genomic gene arrangements within teleosts in that relatively short time. Mathematical modeling suggests that rapid gene loss occurred mainly by events involving simultaneous loss of multiple genes. We found that the subsequent 250 My were characterized by slow and steady loss of individual genes. Our pipeline also identified about 1,100 shared single-copy genes that are inferred to have become singletons before the divergence of clupeocephalan teleosts. Therefore, our comparative genome analysis suggests that rapid gene loss just after the WGD reshaped teleost genomes before the major divergence, and provides a useful set of marker genes for future phylogenetic analysis.
Scientific Reports | 2017
Satoshi Yamamoto; Reiji Masuda; Yukuto Sato; Tetsuya Sado; Hitoshi Araki; Michio Kondoh; Toshifumi Minamoto; Masaki Miya
Environmental DNA (eDNA) metabarcoding has emerged as a potentially powerful tool to assess aquatic community structures. However, the method has hitherto lacked field tests that evaluate its effectiveness and practical properties as a biodiversity monitoring tool. Here, we evaluated the ability of eDNA metabarcoding to reveal fish community structures in species-rich coastal waters. High-performance fish-universal primers and systematic spatial water sampling at 47 stations covering ~11 km2 revealed the fish community structure at a species resolution. The eDNA metabarcoding based on a 6-h collection of water samples detected 128 fish species, of which 62.5% (40 species) were also observed by underwater visual censuses conducted over a 14-year period. This method also detected other local fishes (≥23 species) that were not observed by the visual censuses. These eDNA metabarcoding features will enhance marine ecosystem-related research, and the method will potentially become a standard tool for surveying fish communities.
BMC Genomics | 2014
Naoki Nariai; Kaname Kojima; Takahiro Mimori; Yukuto Sato; Yosuke Kawai; Yumi Yamaguchi-Kabata; Masao Nagasaki
Background: High-throughput RNA sequencing (RNA-Seq) enables quantification and identification of transcripts at single-base resolution. Recently, longer sequence reads have become available thanks to the development of new types of sequencing technologies as well as improvements in chemical reagents for next-generation sequencers. Although several computational methods have been proposed for quantifying gene expression levels from RNA-Seq data, they are not sufficiently optimized for longer reads (e.g. >250 bp). Results: We propose TIGAR2, a statistical method for quantifying transcript isoforms from fixed and variable length RNA-Seq data. Our method models substitution, deletion, and insertion errors of sequencers based on gapped alignments of reads to the reference cDNA sequences so that sensitive read aligners such as Bowtie2 and BWA-MEM are effectively incorporated in our pipeline. Also, a heuristic algorithm is implemented in variational Bayesian inference for faster computation. We apply TIGAR2 to both simulation data and real data of human samples and evaluate performance of transcript quantification with TIGAR2 in comparison to existing methods. Conclusions: TIGAR2 is a sensitive and accurate tool for quantifying transcript isoform abundances from RNA-Seq data. Our method performs better than existing methods for the fixed-length reads (100 bp, 250 bp, 500 bp, and 1000 bp of both single-end and paired-end) and variable-length reads, especially for reads longer than 250 bp.
Human Genome Variation | 2015
Yumi Yamaguchi-Kabata; Naoki Nariai; Yosuke Kawai; Yukuto Sato; Kaname Kojima; Minoru Tateno; Fumiki Katsuoka; Jun Yasuda; Masayuki Yamamoto; Masao Nagasaki
The integrative Japanese Genome Variation Database (iJGVD; http://ijgvd.megabank.tohoku.ac.jp/) provides genomic variation data detected by whole-genome sequencing (WGS) of Japanese individuals. Specifically, the database contains variants detected by WGS of 1,070 individuals who participated in a genome cohort study of the Tohoku Medical Megabank Project. In the first release, iJGVD includes >4,300,000 autosomal single nucleotide variants (SNVs) whose minor allele frequencies are >5.0%.
BMC Genomics | 2015
Naoki Nariai; Kaname Kojima; Sakae Saito; Takahiro Mimori; Yukuto Sato; Yosuke Kawai; Yumi Yamaguchi-Kabata; Jun Yasuda; Masao Nagasaki
Background: Human leucocyte antigen (HLA) genes play an important role in determining the outcome of organ transplantation and are linked to many human diseases. Because of the diversity and polymorphisms of HLA loci, HLA typing at high resolution is challenging even with whole-genome sequencing data. Results: We have developed a computational tool, HLA-VBSeq, to estimate the most probable HLA alleles at full (8-digit) resolution from whole-genome sequence data. HLA-VBSeq simultaneously optimizes read alignments to HLA allele sequences and abundance of reads on HLA alleles by variational Bayesian inference. We show the effectiveness of the proposed method over other methods by predicting HLA types for HLA class I (HLA-A, -B and -C) and class II (HLA-DQA1, -DQB1 and -DRB1) loci from simulation data at various depths of coverage and from real sequencing data of human trio samples. Conclusions: HLA-VBSeq is an efficient and accurate HLA typing method using high-throughput sequencing data without the need for primer design for HLA loci. Moreover, it does not assume any prior knowledge about HLA allele frequencies, and hence HLA-VBSeq is broadly applicable to human samples obtained from a genetically diverse population.