Wan-Sup Cho | Researchain

Publication

Featured research published by Wan-Sup Cho.

IEEE Transactions on Visualization and Computer Graphics | 1995

Octree-R: an adaptive octree for efficient ray tracing

Kyu-Young Whang; Ju-Won Song; Ji-Woong Chang; Ji-Yun Kim; Wan-Sup Cho; Chong-Mok Park; Il-Yeol Song

Ray tracing requires many ray-object intersection tests. A way of reducing the number of ray-object intersection tests is to subdivide the space occupied by objects into many nonoverlapping subregions, called voxels, and to construct an octree for the subdivided space. We propose the Octree-R, an octree-variant data structure for efficient ray tracing. The algorithm for constructing the Octree-R first estimates the number of ray-object intersection tests. Then, it partitions the space along the plane that minimizes the estimated number of ray-object intersection tests. We present the results of experiments for verifying the effectiveness of the Octree-R. In the experiment, the Octree-R provides a 4% to 47% performance gain over the conventional octree. The result shows the more skewed the object distribution (as is typical for real data), the more performance gain the Octree-R achieves.
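
The plane-selection idea in this abstract can be illustrated with a small one-dimensional sketch. The cost model here is an assumption made for illustration, not the estimator from the paper: a ray is taken to visit each child region with probability proportional to that child's extent, and the cost is the expected number of objects it would be tested against. The names `estimated_tests` and `best_split` are hypothetical.

```python
def estimated_tests(boxes, lo, hi, split):
    """Expected ray-object tests if the interval [lo, hi] is cut at `split`.

    `boxes` is a list of (start, end) object extents along one axis.
    Assumed cost model (illustrative only): each child is visited with
    probability equal to its share of the parent's extent.
    """
    n_left = sum(1 for start, _ in boxes if start < split)
    n_right = sum(1 for _, end in boxes if end > split)
    p_left = (split - lo) / (hi - lo)
    p_right = (hi - split) / (hi - lo)
    return p_left * n_left + p_right * n_right


def best_split(boxes, lo, hi, candidates):
    """Pick the candidate plane with the lowest estimated cost."""
    return min(candidates, key=lambda s: estimated_tests(boxes, lo, hi, s))


# Skewed distribution: three objects near 0, one near 1.
boxes = [(0.0, 0.1), (0.05, 0.15), (0.1, 0.2), (0.8, 0.9)]
midpoint_cost = estimated_tests(boxes, 0.0, 1.0, 0.5)
chosen = best_split(boxes, 0.0, 1.0, [0.25, 0.5, 0.75])
```

With this skewed input the midpoint cut used by a conventional octree is not the cheapest candidate under the assumed cost model, which is the intuition behind the reported gains on skewed data.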

Journal of Microbiology | 2012

PyroTrimmer: a software with GUI for pre-processing 454 amplicon sequences

Jeongsu Oh; Byung Kwon Kim; Wan-Sup Cho; Soon Gyu Hong; Kyung Mo Kim

The ultimate goal of metagenome research projects is to understand the ecological roles and physiological functions of the microbial communities in a given natural environment. The 454 pyrosequencing platform produces the longest reads among the most widely used next generation sequencing platforms. Since the relatively longer reads of the 454 platform provide more information for identification of microbial sequences, this platform is dedicated to microbial community and population studies. In order to accurately perform the downstream analysis of the 454 multiplex datasets, it is necessary to remove artificially designed sequences located at either end of individual reads and to correct low-quality sequences. We have developed a program called PyroTrimmer that removes the barcodes, linkers, and primers, trims sequence regions with low quality scores, and filters out low-quality sequence reads. Although these functions have previously been implemented in other programs as well, PyroTrimmer has novelty in terms of the following features: i) more sensitive primer detection using Levenshtein distance and global pairwise alignment, ii) the first stand-alone software with a graphical user interface, and iii) various options for trimming and filtering out the low-quality sequence reads. PyroTrimmer, written in Java, is compatible with multiple operating systems and can be downloaded free at http://pyrotrimmer.kobic.re.kr.
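
As a rough illustration of edit-distance-based primer detection, the sketch below computes the Levenshtein distance with the standard dynamic-programming recurrence and accepts a read whose prefix is within a small distance of the primer. It is not PyroTrimmer's code: the function names, the fixed-length prefix comparison, and the `max_dist` threshold are assumptions chosen for brevity.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to turn string `a` into string `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def primer_matches(read: str, primer: str, max_dist: int = 2) -> bool:
    """Hypothetical check: compare the primer against a read prefix of the
    same length. A real tool would also consider prefixes of other lengths
    to handle insertions and deletions properly."""
    return levenshtein(read[:len(primer)], primer) <= max_dist


# One substitution (T -> A at the fourth base) is tolerated.
ok = primer_matches("ACGAACTTTGGA", "ACGTAC", max_dist=1)
```

Allowing a small nonzero distance is what makes detection more sensitive than exact matching, at the cost of a few more false positives when primers are short.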

Information Systems | 1996

A new method for estimating the number of objects satisfying an object-oriented query involving partial participation of classes

Wan-Sup Cho; Chong-Mok Park; Kyu-Young Whang; Sang Hyuk Son

The intermediate result cardinality (the number of objects satisfying a condition given in a query) is an important factor for estimating the cost of the query in query optimization. In this paper we show that an object-oriented query often involves partial participation of classes in a relationship. We then present a new technique for estimating the intermediate result cardinality in such a query. Partial participation has not been considered seriously in existing techniques. Since the proposed technique uses detailed statistics to accommodate partial participation, it estimates the intermediate result cardinality more accurately than existing ones. We also show that these statistics are easily obtained by using inherent properties of object-oriented databases.

PLOS ONE | 2016

CLUSTOM-CLOUD: In-Memory Data Grid-Based Software for Clustering 16S rRNA Sequence Data in the Cloud Environment

Jeongsu Oh; Chi-Hwan Choi; Min-Kyu Park; Byung Kwon Kim; Kyuin Hwang; Sang-Heon Lee; Soon Gyu Hong; Arshan Nasir; Wan-Sup Cho; Kyung Mo Kim

High-throughput sequencing can produce hundreds of thousands of 16S rRNA sequence reads corresponding to different organisms present in the environmental samples. Typically, analysis of microbial diversity in bioinformatics starts from pre-processing followed by clustering 16S rRNA reads into relatively fewer operational taxonomic units (OTUs). The OTUs are reliable indicators of microbial diversity and greatly reduce the downstream analysis time. However, existing hierarchical clustering algorithms, which are generally more accurate than greedy heuristic algorithms, struggle with large sequence datasets. To keep pace with the rapid rise in sequencing data, we present CLUSTOM-CLOUD, which is the first distributed sequence clustering program based on In-Memory Data Grid (IMDG) technology, a distributed data structure that stores all data in the main memory of multiple computing nodes. The IMDG technology gives CLUSTOM-CLOUD a greater capability for handling large datasets and better computational scalability than its ancestor, CLUSTOM, while maintaining high accuracy. Clustering speed of CLUSTOM-CLOUD was evaluated on published 16S rRNA human microbiome sequence datasets using a small laboratory cluster (10 nodes) and under the Amazon EC2 cloud-computing environment. Under the laboratory environment, it required only ~3 hours to process a dataset of 200 K reads regardless of the complexity of the human microbiome data. In turn, one million reads were processed in approximately 20, 14, and 11 hours when utilizing 20, 30, and 40 nodes on the Amazon EC2 cloud-computing environment. The running time evaluation indicates that CLUSTOM-CLOUD can handle much larger sequence datasets than CLUSTOM and is also a scalable distributed processing system. The comparative accuracy test using 16S rRNA pyrosequences of a mock community shows that CLUSTOM-CLOUD achieves higher accuracy than DOTUR, mothur, ESPRIT-Tree, UCLUST and Swarm. CLUSTOM-CLOUD is written in Java and is freely available at http://clustomcloud.kopri.re.kr.

International Conference on Cloud and Green Computing | 2012

Customer Preference Analysis Based on SNS Data

Jaesung Kim; Minhyeok Yang; Yeongjae Hwang; Sunghyeon Jeon; Kyoung-Ran Kim; In-Sun Jung; Chi-Hawn Choi; Wan-Sup Cho; Jong-Hwa Na

Due to the rapid improvement of information technology, the emergence of various information channels such as mobile devices and social media has been producing a tremendous amount of data. The evolution of smartphones and social network services (SNS) has led to the big data era. Research on unstructured, large, and varied data has been seeking more systematic and appropriate ways of collecting and analyzing it. In this paper, Twitter data has been collected, stored, and analyzed in a multi-dimensional fashion on top of the Hadoop platform in order to find out what kinds of factors can affect customer preference for smartphones. About 600,000 Twitter data items were collected over one month, and the analysis result shows the most popular smartphone, the smartphone attributes of most interest, and the maker that customers are most interested in.

Web Information Systems Engineering | 2004

An Efficient OLAP Query Processing Technique Using Measure Attribute Indexes

Tae-Sung Jung; M. S. Ahn; Wan-Sup Cho

We propose an index structure, called the measure attribute (MA) index, and a query processing technique to improve OLAP query performance. OLAP queries are extremely complicated because they represent the intricate business logic of a company over a huge quantity of data. This is why efficient query evaluation becomes a critical issue in OLAP systems. The proposed query processing technique supports efficient evaluation of the star joins and grouping operators, which are known as the most frequently used but very expensive operators in OLAP queries. The MA index is a variation of the path index in object databases and supports index-only processing for the star joins and grouping operators. Index-only processing is a well-known efficient technique in the query evaluation area. We implemented the MA index on top of an object-relational DBMS. Performance analysis shows that the MA index provides speedups of orders of magnitude for typical OLAP queries.

International Conference on Information Networking | 2012

Design and implementation of web crawler based on dynamic web collection cycle

Kangseok Kim; Kyoung-Ran Kim; Kyung-Hee Lee; T. K. Kim; Wan-Sup Cho

The amount of web information is increasing rapidly with advanced wireless networks and the emergence of diverse smart devices such as the iPhone and iPad. Information is continuously being produced and updated anywhere and anytime by means of easy-to-use web platforms and social networks. How frequently updated web data should be refreshed is becoming a hot issue in the data integration and retrieval domain. In this paper, we propose dynamic web-data crawling methods, which include sensitive checking of web site changes and dynamic retrieval of web pages from target web sites. Furthermore, we implemented a Java-based web crawling application and compared performance between conventional static approaches and our proposed dynamic ones. Our experimental results showed a 59% performance benefit compared to the static crawling method.
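
One way to picture a dynamic collection cycle is a revisit interval that shrinks when a page is found to have changed and grows when it has not. The sketch below is only an illustration under that assumption; the abstract does not specify the change-detection rule or the interval policy, so the hashing scheme, the halve/double rule, the bounds, and the names `PageState` and `next_interval` are all hypothetical.

```python
import hashlib


def content_fingerprint(html: str) -> str:
    """Hash the page body so two fetches can be compared cheaply."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def next_interval(interval_s: float, changed: bool,
                  min_s: float = 60.0, max_s: float = 86400.0) -> float:
    """Assumed policy: halve the interval after a change, double it otherwise,
    clamped to [min_s, max_s]."""
    if changed:
        return max(min_s, interval_s / 2)
    return min(max_s, interval_s * 2)


class PageState:
    """Tracks the last fingerprint and current revisit interval for one URL."""

    def __init__(self, interval_s: float = 3600.0):
        self.interval_s = interval_s
        self.fingerprint = None

    def observe(self, html: str) -> bool:
        """Record a fresh fetch; return True if the page changed."""
        fp = content_fingerprint(html)
        first = self.fingerprint is None
        changed = (not first) and fp != self.fingerprint
        self.fingerprint = fp
        if not first:  # the first fetch gives no change information
            self.interval_s = next_interval(self.interval_s, changed)
        return changed


state = PageState(interval_s=3600.0)
state.observe("<p>v1</p>")   # first fetch, interval unchanged
state.observe("<p>v1</p>")   # no change, interval doubles
state.observe("<p>v2</p>")   # change detected, interval halves
```

A static crawler would instead revisit every page at the same fixed interval, regardless of how often the page actually changes.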

Database Systems for Advanced Applications | 1997

Query Optimization Techniques Utilizing Path Indexes in Object-Oriented Database Systems

Wan-Sup Cho; Seung-Sun Lee; Kyu-Young Whang; Yong-Ik Yoon

We propose query optimization techniques that fully utilize the advantages of path indexes in object-oriented database systems. Although path indexes provide efficient access to complex objects, little research has been done on query optimization that fully utilizes path indexes. We first devise a generalized index intersection technique, adapted to the structure of the path index extended from conventional indexes, for utilizing multiple (path) indexes to access each class in a query. We then propose the query graph reduction algorithm that replaces the classes in the query graph with path index scans; we call the resultant query graph the reduced query graph (RQG). We finally present the search algorithm that finds the least-cost evaluation plan from the RQG based on a cost model. The proposed query optimization techniques enhance database performance by using path indexes instead of direct accesses to data when evaluating queries.

International Conference on Ubiquitous and Future Networks | 2016

Smart answering Chatbot based on OCR and Overgenerating Transformations and Ranking

Ly Pichponreay; Jin-Hyuk Kim; Chi-Hwan Choi; Kyung-Hee Lee; Wan-Sup Cho

With the rapid development of information and communication technology, people have become very diverse in their education, learning styles, and methods of improving knowledge. This paper presents an approach for converting documents into knowledge for a chatbot system, which lets users benefit more from electronic documents by asking and answering questions through an integrated simulation system. It is an integrated system for enriching the contents of documents in popular formats such as Portable Document Format (PDF) and digital photos. The workflow starts by extracting text from files using Optical Character Recognition (OCR), then generates questions via the Overgenerating Transformations and Ranking algorithm, and finally lets the chatbot respond to a user's question when it matches a string pattern.

International Conference on Data Mining | 2006

An efficient storage model for the SBML documents using object databases

Seung-Hyun Jung; Tae-Sung Jung; Tae-Kyung Kim; Kyoung-Ran Kim; Jaesoo Yoo; Wan-Sup Cho

As SBML is regarded as a de facto standard for expressing biological network data in systems biology, the number of SBML documents is increasing exponentially. We propose an SBML data management system (SMS) on top of an object database. Since the object database supports rich data types such as multi-valued attributes and object references, mapping from SBML documents into the object database is straightforward. We adopt the event-based SAX parser instead of the DOM parser for dealing with huge SBML documents, because the DOM parser suffers from excessive memory overhead during document parsing. For high-quality data, SMS supports a data cleansing function that uses gene ontology. Finally, SMS generates user query results in SBML format (for data exchange) or as visual graphs (for intuitive understanding). Real experiments show that our approach is superior to one using conventional relational databases in terms of modeling capability, storage requirements, and data quality.
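
The memory argument for SAX over DOM can be seen in a short sketch using Python's standard `xml.sax` module: the handler receives one event per element and keeps only a running tally, so the full document tree is never held in memory. This is not the SMS implementation; the handler name `ElementCounter` and the tiny SBML-like sample (written without namespaces for simplicity) are assumptions made for illustration.

```python
import xml.sax
from collections import Counter


class ElementCounter(xml.sax.ContentHandler):
    """Counts element names as start-tag events arrive."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def startElement(self, name, attrs):
        # Called once per opening tag; nothing else is retained.
        self.counts[name] += 1


def count_elements(xml_bytes: bytes) -> Counter:
    """Stream-parse the document and return per-element counts."""
    handler = ElementCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.counts


# Simplified, namespace-free stand-in for an SBML document.
SAMPLE = b"""<sbml><model id="m1">
  <listOfSpecies>
    <species id="s1"/><species id="s2"/>
  </listOfSpecies>
  <listOfReactions><reaction id="r1"/></listOfReactions>
</model></sbml>"""

counts = count_elements(SAMPLE)
```

In a storage system the handler would write each species or reaction to the database as its event arrives, rather than counting, which keeps memory use roughly constant as document size grows.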
