Publication


Featured research published by Byunghan Lee.


Briefings in Bioinformatics | 2016

Deep learning in bioinformatics

Seonwoo Min; Byunghan Lee; Sungroh Yoon

In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies.


IEEE Access | 2016

Biometric Authentication Using Noisy Electrocardiograms Acquired by Mobile Sensors

Hyun-Soo Choi; Byunghan Lee; Sungroh Yoon

Electrocardiogram (ECG) signals from mobile sensors are expected to increase the availability of authentication in the emerging wearable device industry. However, mobile sensors provide a lower-quality signal than conventional medical devices. This paper proposes a practical authentication procedure for ECG signals collected via one-chip-solution mobile sensors. We designed a cascading bandpass filter for noise cancellation and suggest eight fiducial features. For classification-based authentication, we use a radial basis function kernel-based support vector machine, which showed the best performance among nine classifiers in experimental comparisons. Despite the noisy ECG signals from mobile sensors, we achieved an equal error rate (EER) of 4.61% on a single heartbeat and an EER of 1.87% with 15 s of testing time on 175 subjects, which is a reasonable result and supports the usability of low-cost ECGs for biometric authentication.
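The equal error rate quoted in this abstract is the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of how an EER can be estimated from match scores; it is not the authors' evaluation code. The function name `equal_error_rate`, the score convention (higher means more likely genuine), and the example scores are all assumptions made for illustration.

```python
from typing import Sequence


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Estimate the EER from two lists of match scores.

    Assumption: a higher score means "more likely the genuine user",
    and an attempt is accepted when score >= threshold.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap = float("inf")
    eer = 1.0
    for t in thresholds:
        # False rejection rate: genuine attempts scored below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance rate: impostor attempts scored at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap = gap
            eer = (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Hypothetical scores, not data from the paper.
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.6, 0.3, 0.2, 0.1]
    print(equal_error_rate(genuine_scores, impostor_scores))
```

With a finite set of scores the two rates rarely cross exactly, so the sketch reports the average of the two at the closest threshold.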


International Conference on Bioinformatics | 2016

deepTarget: End-to-end Learning Framework for microRNA Target Prediction using Deep Recurrent Neural Networks

Byunghan Lee; Junghwan Baek; Seunghyun Park; Sungroh Yoon

MicroRNAs (miRNAs) are short sequences of ribonucleic acids that control the expression of target messenger RNAs (mRNAs) by binding them. Robust prediction of miRNA-mRNA pairs is of utmost importance in deciphering gene regulation but has been challenging because of high false positive rates, despite a deluge of computational tools that normally require laborious manual feature extraction. This paper presents an end-to-end machine learning framework for miRNA target prediction. Leveraged by deep recurrent neural networks-based auto-encoding and sequence-sequence interaction learning, our approach not only delivers an unprecedented level of accuracy but also eliminates the need for manual feature extraction. The performance gap between the proposed method and existing alternatives is substantial (over 25% increase in F-measure), and deepTarget delivers a quantum leap in the longstanding challenge of robust miRNA target prediction. [availability: http://data.snu.ac.kr/pub/deepTarget]


BMC Bioinformatics | 2014

CASPER: context-aware scheme for paired-end reads from high-throughput amplicon sequencing

Sunyoung Kwon; Byunghan Lee; Sungroh Yoon

Merging the forward and reverse reads from paired-end sequencing is a critical task that can significantly improve the performance of downstream tasks, such as genome assembly and mapping, by providing them with virtually elongated reads. However, due to the inherent limitations of most paired-end sequencers, the chance of observing erroneous bases grows rapidly as the end of a read is approached, which becomes a critical hurdle for accurately merging paired-end reads. Although there exist several sophisticated approaches to this problem, their performance in terms of quality of merging often remains unsatisfactory. To address this issue, here we present a context-aware scheme for paired-end reads (CASPER): a computational method to rapidly and robustly merge overlapping paired-end reads. Being particularly well suited to amplicon sequencing applications, CASPER is thoroughly tested with both simulated and real high-throughput amplicon sequencing data. According to our experimental results, CASPER significantly outperforms existing state-of-the-art paired-end merging tools in terms of accuracy and robustness. CASPER also exploits the parallelism in the task of paired-end merging and effectively speeds up by multithreading. CASPER is freely available for academic use at http://best.snu.ac.kr/casper.
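To make the merging task concrete, here is a naive overlap-based merge of a read pair. This is not the CASPER algorithm: it ignores quality scores and sequence context, and it only illustrates what "merging overlapping paired-end reads" means. The function names, the `min_overlap` and `max_mismatch_rate` parameters, and the example sequences are assumptions made for illustration.

```python
from typing import Optional

# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward: str, reverse: str,
               min_overlap: int = 10,
               max_mismatch_rate: float = 0.1) -> Optional[str]:
    """Naively merge a read pair by finding the longest acceptable overlap.

    The reverse read is reverse-complemented so that both reads are on the
    same strand, then the end of the forward read is compared with the
    start of the reverse read for decreasing overlap lengths.
    """
    rc = reverse_complement(reverse)
    for overlap in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        tail, head = forward[-overlap:], rc[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_rate * overlap:
            # Keep the forward read's bases in the overlap (a simplification;
            # real tools resolve disagreements using quality or context).
            return forward + rc[overlap:]
    return None  # no acceptable overlap found


if __name__ == "__main__":
    # Hypothetical 25-base fragment sequenced from both ends.
    fragment = "ACGTACGGTTCAGGCTAACGTTAGC"
    fwd = fragment[:18]
    rev = reverse_complement(fragment[7:])
    print(merge_pair(fwd, rev, min_overlap=10, max_mismatch_rate=0.0))
```

Because base errors accumulate toward the read ends, the overlap region is exactly where both reads are least reliable, which is why a plain mismatch count like this one is fragile.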


International Conference of the IEEE Engineering in Medicine and Biology Society | 2013

In-depth analysis of interrelation between quality scores and real errors in Illumina reads

Sunyoung Kwon; Seunghyun Park; Byunghan Lee; Sungroh Yoon

In sequencing results, the quality score is reported for each base, representing the probability that the base is called incorrectly. The notion of quality scores was initially developed for conventional Sanger sequencing, but is widely used for next-generation sequencing techniques, including Illumina. In this paper, we carry out in-depth analysis of quality scores reported for Illumina reads and present how they are related to real errors in the reads. We confirmed strong interrelation between quality scores and real errors in Illumina reads, and observed that, in paired-end reads, reverse reads tend to have lower quality scores than forward reads. In addition, we discovered other interesting patterns from quality score analysis. Our hope is that the findings in this paper will be helpful for designing error-correction and/or filtering methods for next-generation sequencing.
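The relation between a quality score and an error probability mentioned in this abstract is conventionally the Phred scale, Q = -10 log10(P). The sketch below converts between the two and decodes a FASTQ quality string. It is not the paper's analysis code; the function names are hypothetical, and the ASCII offset of 33 (Phred+33) is an assumption that holds for most current FASTQ files but not all older Illumina ones.

```python
import math
from typing import List


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P to a Phred quality score Q = -10 log10(P)."""
    return -10 * math.log10(p)


def decode_quality_string(qual: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into per-base Phred scores.

    Assumption: scores are encoded as ASCII characters with the given offset.
    """
    return [ord(c) - offset for c in qual]


def expected_errors(qual: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities: the expected number of wrong bases."""
    return sum(phred_to_error_prob(q) for q in decode_quality_string(qual, offset))


if __name__ == "__main__":
    # Hypothetical quality string: 'I' encodes Q40 and '5' encodes Q20 at offset 33.
    print(decode_quality_string("I5"))
    print(expected_errors("I5"))
```

Comparing such predicted error probabilities with errors observed against a known reference is one way to check how well reported quality scores track real errors.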


IEEE Transactions on Power Delivery | 1998

The use of rational B-spline surface to improve the shape control for three-dimensional insulation design and its application to design of shield ring

Byunghan Lee; S.H. Myung; Jong Keun Park; S.W. Min; E.S. Kim

In this paper, a three-dimensional algorithm for the insulation design of high-voltage equipment is presented. In general, the insulation design consists of two steps. One is the electric field calculation. The other is the correction of the shape of the electrode or the insulator to be designed. The main point of this paper is the introduction of the rational B-spline to improve the shape control for the insulation design. In former research, the correction of a shape was made by direct control of the nodes on the shape. This required many design variables and caused discontinuous curvature in the designed shape. These problems can be dealt with by indirect control, which is one of the useful properties of the rational B-spline. The use of the rational B-spline reduces the number of design variables and guarantees the smooth curvature of the designed shape. The proposed algorithm is applied to the design of the shape of the shield ring, which is installed at the joint of transmission lines and the insulator and has previously been designed by trial and error. A combination of the charge simulation method (CSM) and the surface charge simulation method (SCSM) is used to calculate the three-dimensional electric fields produced by this system.


PLOS ONE | 2017

DUDE-Seq: Fast, flexible, and robust denoising for targeted amplicon sequencing

Byunghan Lee; Taesup Moon; Sungroh Yoon; Tsachy Weissman

We consider the correction of errors from nucleotide sequences produced by next-generation targeted amplicon sequencing. The next-generation sequencing (NGS) platforms can provide a great deal of sequencing data thanks to their high throughput, but the associated error rates often tend to be high. Denoising in high-throughput sequencing has thus become a crucial process for boosting the reliability of downstream analyses. Our methodology, named DUDE-Seq, is derived from a general setting of reconstructing finite-valued source data corrupted by a discrete memoryless channel and effectively corrects substitution and homopolymer indel errors, the two major types of sequencing errors in most high-throughput targeted amplicon sequencing platforms. Our experimental studies with real and simulated datasets suggest that the proposed DUDE-Seq not only outperforms existing alternatives in terms of error-correction capability and time efficiency, but also boosts the reliability of downstream analyses. Further, the flexibility of DUDE-Seq enables its robust application to different sequencing platforms and analysis pipelines by simple updates of the noise model. DUDE-Seq is available at http://data.snu.ac.kr/pub/dude-seq.


Bioinformatics | 2018

LncRNAnet: long non-coding RNA identification using deep learning

Junghwan Baek; Byunghan Lee; Sunyoung Kwon; Sungroh Yoon

Motivation: Long non‐coding RNAs (lncRNAs) are important regulatory elements in biological processes. LncRNAs share similar sequence characteristics with messenger RNAs, but they play completely different roles, thus providing novel insights for biological studies. The development of next‐generation sequencing has helped in the discovery of lncRNA transcripts. However, the experimental verification of numerous transcriptomes is time consuming and costly. To alleviate these issues, a computational approach is needed to distinguish lncRNAs from the transcriptomes. Results: We present a deep learning‐based approach, lncRNAnet, to identify lncRNAs that incorporates recurrent neural networks for RNA sequence modeling and convolutional neural networks for detecting stop codons to obtain an open reading frame indicator. lncRNAnet performed clearly better than the other tools for sequences of short lengths, on which most lncRNAs are distributed. In addition, lncRNAnet successfully learned features and showed 7.83%, 5.76%, 5.30% and 3.78% improvements over the alternatives on a human test set in terms of specificity, accuracy, F1‐score and area under the curve, respectively. Availability and implementation: Data and codes are available at http://data.snu.ac.kr/pub/lncRNAnet.


Bioinformatics | 2018

MUGAN: multi-GPU accelerated AmpliconNoise server for rapid microbial diversity assessment

Byunghan Lee; Hyeyoung Min; Sungroh Yoon

Motivation: Metagenomic sequencing has become a crucial tool for obtaining a gene catalogue of operational taxonomic units (OTUs) in a microbial community. A typical metagenomic sequencing run produces a large amount of data (often in the order of terabytes or more), and computational tools are indispensable for efficient processing. In particular, error correction in metagenomics is crucial for accurate and robust genetic cataloging of microbial communities. However, many existing error-correction tools take a prohibitively long time and often bottleneck the whole analysis pipeline. Results: To overcome this computational hurdle, we analyzed and exploited the data-level parallelism that exists in the error-correction procedure and proposed a tool named MUGAN that exploits both multi-core central processing units (CPUs) and multiple graphics processing units (GPUs) for co-processing. According to the experimental results, our approach reduced not only the time demand for denoising amplicons from approximately 59 hours to only 46 minutes, but also the overestimation of the number of OTUs, estimating 6.7 times fewer species-level OTUs than the baseline. In addition, our approach provides web-based intuitive visualization of results. Given its efficiency and convenience, we anticipate that our approach would greatly facilitate denoising efforts in metagenomics studies. Availability: http://data.snu.ac.kr/pub/mugan. Supplementary information: Supplementary data are available at Bioinformatics online.


International Conference on Data Mining | 2012

Rapid and Robust Denoising of Pyrosequenced Amplicons for Metagenomics

Byunghan Lee; Joonhong Park; Sungroh Yoon

Metagenomic sequencing has become a crucial tool for obtaining a gene catalogue of operational taxonomic units (OTUs) in a microbial community. High-throughput pyrosequencing is a next-generation sequencing technique very popular in microbial community analysis due to its longer read length compared to alternative methods. Computational tools are indispensable for processing raw data from pyrosequencers, and in particular, noise removal is a critical data-mining step to obtain robust sequence reads. However, the slow rate of existing denoisers has bottlenecked the whole pyrosequencing process and hindered efforts to improve robustness. To address these issues, we propose a new approach that can accelerate the denoising process substantially. Using our approach, it takes only about 2 hours to denoise 62,873 pyrosequenced amplicons from a mixture of 91 full-length 16S rRNA clones. It would otherwise take nearly 2.5 days if existing software tools were used. Furthermore, our approach can effectively reduce the overestimation of the number of OTUs, producing 6.7 times fewer species-level OTUs on average than a state-of-the-art alternative under the same conditions. Leveraged by our approach, we hope that metagenomic sequencing will become an even more appealing tool for microbial community analysis.

Collaboration


Dive into Byunghan Lee's collaboration.

Top Co-Authors

Sungroh Yoon, Seoul National University
Sunyoung Kwon, Seoul National University
Seunghyun Park, Seoul National University
Hyun-Soo Choi, Seoul National University
Jong Keun Park, Seoul National University
Junghwan Baek, Seoul National University
N Choi, Seoul National University
Seonwoo Min, Seoul National University