Hong-Soog Kim
Information and Communications University
Publication
Featured research published by Hong-Soog Kim.
bioinformatics and bioengineering | 2004
Dongsoo Han; Hong-Soog Kim; Woo-Hyuk Jang; Sungdoke Lee
As protein data and related data accumulate on the Internet, many domain-based computational techniques for predicting protein interactions have been developed. However, most of these techniques still have limitations that hinder their use in practice: they usually suffer from low prediction accuracy and provide no method for ranking the interaction possibility of multiple protein pairs. In this paper, we reevaluate a domain combination based protein interaction prediction method and develop an interaction possibility ranking method for multiple protein pairs. Using the ranking method, one can discern which protein pairs in a set are more likely to interact than the others. In the reevaluation, we found that prediction accuracy improves as the size of the non-interacting set of protein pairs increases. When the non-interacting set in the learning sets was made 20 times larger than the interacting set, 84% sensitivity and 75% specificity were achieved for yeast. In validating the ranking method, we found some correlation between the interacting probability and the prediction accuracy for protein pair groups that have matching PIP values in the interacting or non-interacting PIP distributions.
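The two accuracy figures quoted in the abstract above, sensitivity and specificity, and the idea of ranking protein pairs by a score can be illustrated with a minimal sketch. It assumes boolean labels and predictions and a plain dictionary of per-pair scores; the function names and the scores are hypothetical and do not reproduce the paper's PIP computation.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for boolean labels and predictions.

    sensitivity = TP / (TP + FN): fraction of interacting pairs found.
    specificity = TN / (TN + FP): fraction of non-interacting pairs rejected.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    tn = sum(1 for y, p in pairs if not y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def rank_pairs(scores):
    """Order protein pairs from most to least likely to interact.

    `scores` maps a (protein, protein) tuple to a hypothetical
    interaction score; higher means more likely.
    """
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    labels = [True, True, True, True, False, False, False, False]
    predictions = [True, True, True, False, False, False, False, True]
    print(sensitivity_specificity(labels, predictions))
    print(rank_pairs({("A", "B"): 0.9, ("C", "D"): 0.2, ("E", "F"): 0.6}))
```

Enlarging the non-interacting learning set, as the abstract describes, changes the counts that enter the specificity term, which is why both measures are reported together.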
international conference on computational science | 2003
Hong-Soog Kim; Hae-Jin Kim; Dongsoo Han
BLAST is an important tool in bioinformatics. It is used to find sequences that are biologically similar to a given query sequence in a database of annotated sequences. For high-throughput processing of large numbers of query sequences, there have been many studies on parallel batch processing of sequence similarity searches using BLAST. As the number of sequences in the database increases at an exponential rate, the search speed of BLAST itself becomes important. Although NCBI has developed a parallel BLAST that uses threads on SMP machines, the speedup is still limited because the SMP architecture restricts the number of processors. In this paper, we present our parallelized BLAST on cluster systems for further speedup. The main strategy is to exploit inter-node parallelism, which is obtained by logically partitioning the database. To this end, we have designed and implemented a logical database partitioning method, a mechanism for initiating and coordinating BLAST on remote nodes, and a communication protocol for collecting the results from remote nodes. In our performance test on a 2-way, 8-node cluster system, a speedup of roughly 12 times was achieved in the response time of a similarity search for an individual query sequence.
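The logical partitioning idea in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the database is a list of strings, each "node" is simulated by a function call on an index range, and `toy_score` is a hypothetical stand-in for a real BLAST alignment score.

```python
import heapq


def logical_partitions(num_sequences, num_nodes):
    """Split indices 0..num_sequences-1 into contiguous, near-equal ranges.

    The database itself is not copied or split; each node only receives
    a (start, end) range, which is what makes the partitioning logical.
    """
    base, extra = divmod(num_sequences, num_nodes)
    ranges, start = [], 0
    for node in range(num_nodes):
        size = base + (1 if node < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def toy_score(query, subject):
    """Hypothetical similarity score: number of matching positions."""
    return sum(1 for q, s in zip(query, subject) if q == s)


def search_partition(query, database, lo, hi, top_k):
    """Work done by one node: score only the sequences in [lo, hi)."""
    hits = [(toy_score(query, database[i]), i) for i in range(lo, hi)]
    return heapq.nlargest(top_k, hits)


def partitioned_search(query, database, num_nodes, top_k):
    """Collect each node's best hits and merge them into a global top_k."""
    partial = []
    for lo, hi in logical_partitions(len(database), num_nodes):
        partial.extend(search_partition(query, database, lo, hi, top_k))
    return heapq.nlargest(top_k, partial)


if __name__ == "__main__":
    db = ["ACGT", "ACGA", "TTTT", "ACCC", "GGGG"]
    print(logical_partitions(len(db), 2))
    print(partitioned_search("ACGT", db, num_nodes=2, top_k=2))
```

Because every node returns its own top hits, merging them yields the same result as a single search over the whole database; in a real cluster the loop over partitions would run on separate nodes.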
ieee international conference on high performance computing data and analytics | 2006
Hong-Soog Kim; Hae-Jin Kim; Dongsoo Han
BLAST is a tool for finding sequences that are biologically similar to given query sequences in an annotated sequence database. Since the number of sequences in the database increases at an exponential rate and the number of users is growing drastically, the performance of BLAST is a primary concern for service sites such as NCBI. NCBI developed a parallel BLAST that uses threads on SMP machines, but the speedup is still limited by the architectural limitations of SMP machines. Distributed memory multiprocessors can be an alternative for cost-effective search in very large-scale sequence data. However, to find an optimized configuration of cluster systems and SMP machines, a performance study of BLAST on SMP machines is essential. In this paper, we analyze BLAST and its algorithms to enhance the performance of BLAST on parallel machines, and we report the performance of BLAST on SMP machines. Some important runtime characteristics of BLAST are identified through the performance evaluation. According to our performance test, PC clusters or clusters of low-way SMP machines outperform high-way SMP machines in terms of cost-effectiveness. In addition, BLAST on the Linux operating system shows better performance than BLAST on the Solaris operating system in the same configurations.
The Journal of Supercomputing | 2004
Hong-Soog Kim; Youngha Yoon; Dongsoo Han
In this paper, we propose a new algorithm that analyzes the data dependency pattern in the first-order linear recurrence (FOLR) and transforms it into an algebraically equivalent expanded form so that it can be processed in parallel using threads on symmetric multiprocessor (SMP) machines. The transformation aims to eliminate the data dependencies in the naive nested form of the FOLR. However, as this transformation may result in extra multiplication operations, our algorithm examines the inherent overhead of the expanded form of the FOLR and generates a new hybrid form of the FOLR. The hybrid form combines the nested form and an appropriately expanded form in order to make it suitable for parallel processing. The parallel algorithm based on the hybrid form of the FOLR is analytically examined and tested through implementation on SMP machines. The implementation details, such as workload balancing between processors and the optimization of cache performance, are also discussed. The experimental results show that the parallel algorithm based on the hybrid form of the FOLR considerably improves performance on SMP machines that have three or more processors.
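To make the nested and expanded forms in the abstract above concrete, the sketch below assumes the recurrence x[i] = a[i] * x[i-1] + b[i] and uses a standard blocked scheme, which is not claimed to be the paper's hybrid form. Each block is summarized by a pair (A, B) such that the block's last value equals A * (value before the block) + B; these summaries have no dependence on one another, at the cost of extra multiplications. All function names are hypothetical.

```python
def folr_sequential(a, b, x0):
    """Nested form: each x[i] depends on x[i-1], so the loop is serial."""
    x, prev = [], x0
    for ai, bi in zip(a, b):
        prev = ai * prev + bi
        x.append(prev)
    return x


def block_coefficients(a, b, lo, hi):
    """Expanded form of one block: x[hi-1] = A * x[lo-1] + B.

    Needs no value from other blocks, so all blocks can be summarized
    in parallel. The products on A are the extra multiplications.
    """
    A, B = 1, 0
    for i in range(lo, hi):
        A, B = a[i] * A, a[i] * B + b[i]
    return A, B


def folr_blocked(a, b, x0, num_blocks):
    """Blocked evaluation: summarize blocks, carry, then fill each block."""
    n = len(a)
    bounds = [(k * n // num_blocks, (k + 1) * n // num_blocks)
              for k in range(num_blocks)]
    # Phase 1 (independent per block): expanded-form summaries.
    coeffs = [block_coefficients(a, b, lo, hi) for lo, hi in bounds]
    # Phase 2 (short serial pass): value entering each block.
    starts, carry = [], x0
    for A, B in coeffs:
        starts.append(carry)
        carry = A * carry + B
    # Phase 3 (independent per block): nested form inside each block.
    x = [0] * n
    for (lo, hi), start in zip(bounds, starts):
        prev = start
        for i in range(lo, hi):
            prev = a[i] * prev + b[i]
            x[i] = prev
    return x


if __name__ == "__main__":
    a, b = [2, 3, 1, 4, 2], [1, 0, 5, 2, 3]
    print(folr_sequential(a, b, 1))
    print(folr_blocked(a, b, 1, num_blocks=2))
```

Phases 1 and 3 are written as ordinary loops here; on an SMP machine each block would be handed to its own thread. The scheme does roughly twice the arithmetic of the nested form, which fits the abstract's observation that a benefit appears only with three or more processors.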
advanced information networking and applications | 2008
Hong-Soog Kim; Woo-Hyuk Jang; Dongsoo Han
With the widespread use of BLAST, many parallel versions of BLAST on cluster systems have been announced, but little work has been done on parallel execution of the search for an individual query sequence. Since such execution can improve not only throughput but also response time, techniques for parallel execution of an individual query search with BLAST on cluster systems deserve to be developed. This paper develops communication protocols and message formats that reduce the communication overhead of parallel execution of an individual query search on cluster systems. The protocols and message formats are implemented in a new version of BLAST on cluster systems, which we name Hyper-BLAST. We also measured the throughput and response time of Hyper-BLAST on various cluster systems. The results show that considerable performance improvement of BLAST can be achieved through parallel execution of individual query searches on small or mid-sized cluster systems. On a 1-way, 64-node system, Hyper-BLAST achieved scalable speedup up to 63 processors for query lengths of 1000 to 5000.
databases in networked information systems | 2002
Jaeyong Shim; Dongsoo Han; Hong-Soog Kim
As the need to interconnect processes in different companies or departments increases and companies try to realize business processes across organizational boundaries, the correctness of inter-organizational workflow definitions is becoming more important. In this paper, we develop a community process definition language (CPDL) for inter-organizational workflow specification. It is devised for analyzing the correctness of inter-organizational workflow definitions and, in particular, for detecting latent communication deadlocks. A new communication deadlock detection technique for inter-organizational workflow definitions is developed on CPDL using a set-based constraint system. Communication deadlocks in any inter-organizational workflow language that can be translated into CPDL can be detected using the technique of this paper.
ieee international conference on high performance computing data and analytics | 2000
Hong-Soog Kim; Youngha Yoon; Sang-Og Na; Dongsoo Han
We introduce ICU-PFC, an automatic parallelizing compiler. It receives FORTRAN source code and generates parallel FORTRAN code in which OpenMP directives for parallel execution are inserted. It is a research compiler developed to test automatic parallelizing techniques in the SMP environment. ICU-PFC detects DO ALL parallel loops and inserts appropriate OpenMP directives. For parallel loop detection, we designed and implemented a dependence matrix, which stores the data dependence information of the statements in a loop. In the experimental results, code generated by ICU-PFC showed better performance than sequential code and even manually parallelized code.
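The DO ALL detection described in the abstract above can be illustrated with a toy dependence matrix. This sketch is an assumption-laden simplification and not ICU-PFC's algorithm: every statement writes one array element, all subscripts have the form i + constant, and a statement is encoded as a hypothetical tuple (written array, write offset, list of (read array, read offset)). Only the `!$OMP PARALLEL DO` and `!$OMP END PARALLEL DO` directive spellings come from OpenMP itself.

```python
def dependence_matrix(statements):
    """matrix[i][j] is True if statement j touches the array written by
    statement i at a different offset, i.e. in a different iteration.

    Such a loop-carried dependence prevents running iterations in parallel.
    """
    n = len(statements)
    matrix = [[False] * n for _ in range(n)]
    for i, (w_arr, w_off, _) in enumerate(statements):
        for j, (w2_arr, w2_off, reads) in enumerate(statements):
            accesses = list(reads) + [(w2_arr, w2_off)]
            if any(arr == w_arr and off != w_off for arr, off in accesses):
                matrix[i][j] = True
    return matrix


def is_doall(statements):
    """A loop is treated as DO ALL when the matrix has no True entry."""
    return not any(any(row) for row in dependence_matrix(statements))


def annotate(loop_lines, statements):
    """Wrap the loop in OpenMP directives only if it is a DO ALL loop."""
    if is_doall(statements):
        return ["!$OMP PARALLEL DO"] + list(loop_lines) + ["!$OMP END PARALLEL DO"]
    return list(loop_lines)


if __name__ == "__main__":
    loop = ["DO i = 1, n", "  a(i) = b(i) + c(i)", "END DO"]
    independent = [("a", 0, [("b", 0), ("c", 0)])]
    print("\n".join(annotate(loop, independent)))

    # a(i) = a(i-1) + b(i) carries a dependence across iterations.
    carried = [("a", 0, [("a", -1), ("b", 0)])]
    print(dependence_matrix(carried))
```

A production compiler would also handle scalars, non-unit strides, and more general subscripts with proper dependence tests; the matrix here only shows the shape of the information being stored.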
Nucleic Acids Research | 2004
Dongsoo Han; Hong-Soog Kim; Woo-Hyuk Jang; Sungdoke Lee; Jung-Keun Suh
KIISE Transactions on Computing Practices | 2003
Dongsoo Han; Hong-Soog Kim; Jungmin Seo; Woo-Hyuk Jang
Genome Informatics | 2004
Dongsoo Han; Hong-Soog Kim; Woo-Hyuk Jang; Sung-Doke Lee; Jung Keun Suh