Ju-Hong Lee
Inha University
Publication
Featured research published by Ju-Hong Lee.
international conference on management of data | 1999
Ju-Hong Lee; Deok-Hwan Kim; Chin-Wan Chung
The database query optimizer requires an estimate of the query selectivity to find the most efficient access plan. For queries referencing multiple attributes of the same relation, we need a multi-dimensional selectivity estimation technique when the attributes are dependent on each other, because the selectivity is determined by the joint data distribution of the attributes. Multimedia databases also have an intrinsic need for multi-dimensional selectivity estimation, because feature vectors are stored in multi-dimensional indexing trees. In the one-dimensional case, a histogram is the most practical choice. In the multi-dimensional case, however, a histogram is inadequate because of its high storage overhead and high error rates. In this paper, we propose a novel approach to multi-dimensional selectivity estimation. Compressed information from a large number of small histogram buckets is maintained using the discrete cosine transform, which enables low error rates and low storage overhead even in high dimensions. In addition, this approach supports dynamic data updates by eliminating the overhead of periodically reconstructing the compressed information. Extensive experimental results show the advantages of the proposed approach.
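A minimal sketch of the compression idea the abstract describes, reduced to one dimension for brevity; the paper's setting is multi-dimensional, and its bucket layout and coefficient-selection rule are not given in the abstract, so the bucket counts and truncation level below are illustrative assumptions. Bucket frequencies are transformed with the discrete cosine transform, only the low-frequency coefficients are stored, and a range selectivity is estimated from the reconstruction.

```python
import math

def dct(v):
    """Type-II DCT of a list of histogram bucket frequencies."""
    n = len(v)
    return [sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def idct(c):
    """Inverse (Type-III) DCT, matching dct() above."""
    n = len(c)
    return [(c[0] / 2 + sum(c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                            for k in range(1, n))) * 2 / n
            for i in range(n)]

def compress(hist, m):
    """Keep only the m lowest-frequency DCT coefficients (the rest become zero)."""
    c = dct(hist)
    return [c[k] if k < m else 0.0 for k in range(len(c))]

def selectivity(coeffs, lo, hi, total):
    """Estimate the fraction of tuples falling in buckets [lo, hi)."""
    approx = idct(coeffs)
    return max(0.0, sum(approx[lo:hi])) / total
```

Keeping only the lowest coefficients trades a small estimation error for a large reduction in stored values; with all coefficients kept, the reconstruction (and hence the estimate) is exact.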
international conference on data engineering | 2000
Seok-Lyong Lee; Seok-Ju Chun; Deok-Hwan Kim; Ju-Hong Lee; Chin-Wan Chung
Time series data, which are sequences of one-dimensional real values, have been studied in various database applications. We extend traditional similarity search methods on time series data to support multidimensional data sequences, such as video streams. We investigate the problem of retrieving similar multidimensional data sequences from a large database. To prune irrelevant sequences in the database, we introduce correct and efficient similarity functions. Both data sequences and query sequences are partitioned into subsequences, and each of them is represented by a Minimum Bounding Rectangle (MBR). Query processing is based on these MBRs instead of scanning the data elements of entire sequences. Our method is designed (1) to select candidate sequences in a database, and (2) to find the subsequences of a selected sequence whose distance to the query falls under the given threshold. The latter is especially important when retrieving subsequences from large and complex sequences such as video: we do not need to browse the whole of a selected video stream, but only its sub-streams, to find a desired scene. We have performed extensive experiments on synthetic as well as real data sequences (a collection of TV news, dramas, and documentary videos) to evaluate the proposed method. The experiments demonstrate that 73-94 percent of irrelevant sequences are pruned by the proposed method, resulting in 16-28 times faster response times compared with sequential search.
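The MBR-based pruning step can be sketched as follows; the window length, distance function, and pruning rule are simplified assumptions rather than the paper's exact similarity functions. The key property is that the distance between two MBRs lower-bounds the distance between any pair of points inside them, so discarding a sequence by MBR distance never discards a true match.

```python
def mbrs(seq, w):
    """Partition a multidimensional sequence into windows of length w and
    represent each window by its Minimum Bounding Rectangle (per-dim min/max)."""
    dims = len(seq[0])
    out = []
    for s in range(0, len(seq), w):
        win = seq[s:s + w]
        out.append([(min(p[d] for p in win), max(p[d] for p in win))
                    for d in range(dims)])
    return out

def mbr_dist(a, b):
    """Lower bound on the Euclidean distance between any points in MBRs a, b."""
    d2 = 0.0
    for (alo, ahi), (blo, bhi) in zip(a, b):
        gap = max(blo - ahi, alo - bhi, 0.0)  # 0 when the intervals overlap
        d2 += gap * gap
    return d2 ** 0.5

def prune(query_mbrs, data_mbrs, eps):
    """A data sequence survives only if some MBR pair is within eps."""
    return any(mbr_dist(q, m) <= eps for q in query_mbrs for m in data_mbrs)
```

Because `mbr_dist` never exceeds the true point-to-point distance, pruning with it is safe (no false dismissals), which is what makes it usable as a filter before examining raw sequence elements.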
Information Processing and Management | 2009
Ju-Hong Lee; Sun Park; Chan-Min Ahn; Daeho Kim
In existing unsupervised methods, Latent Semantic Analysis (LSA) is used for sentence selection. However, the obtained results are less meaningful, because singular vectors are used as the bases for selecting sentences from the given documents, and singular vector components can have negative values. We propose a new unsupervised method that uses Non-negative Matrix Factorization (NMF) to select sentences for automatic generic document summarization. The proposed method uses non-negativity constraints, which more closely resemble the human cognition process. As a result, it selects more meaningful sentences for generic document summarization than those selected using LSA.
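As a rough illustration of NMF-based sentence selection (the toy matrix, factorization rank, and scoring rule below are assumptions for the sketch, not the paper's formulation), a non-negative term-by-sentence matrix can be factored with Lee-Seung multiplicative updates, and sentences ranked by their semantic-variable weights:

```python
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(r) for r in zip(*A)]

def nmf(A, r, iters=200, seed=0):
    """Factor a non-negative term-by-sentence matrix A ~ W @ H using
    Lee-Seung multiplicative updates (W: semantic features, H: semantic variables)."""
    rnd = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rnd.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rnd.random() + 0.1 for _ in range(n)] for _ in range(r)]
    eps = 1e-9
    for _ in range(iters):
        WT = transpose(W)
        num, den = matmul(WT, A), matmul(matmul(WT, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(r)]
        HT = transpose(H)
        num, den = matmul(A, HT), matmul(W, matmul(H, HT))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(r)]
             for i in range(m)]
    # normalize W's columns so H's magnitudes are comparable across factors
    for k in range(r):
        s = sum(W[i][k] for i in range(m)) + eps
        for i in range(m):
            W[i][k] /= s
        for j in range(n):
            H[k][j] *= s
    return W, H

def top_sentence(H):
    """Pick the sentence whose total semantic-variable weight is largest."""
    n = len(H[0])
    scores = [sum(H[i][j] for i in range(len(H))) for j in range(n)]
    return max(range(n), key=scores.__getitem__)
```

Unlike LSA's singular vectors, every entry of `W` and `H` stays non-negative throughout the updates, which is the property the abstract argues makes the selected sentences more interpretable.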
international conference on communications | 2009
Kwanghee Park; Dong-Hwan Lee; Youngjoo Woo; Geun-Hyung Lee; Ju-Hong Lee; Deok-Hwan Kim
Recently, solid state drives (SSDs) based on NAND flash memory have become popular in the consumer electronics market because they are resistant to shock and their I/O performance is better than that of conventional hard disk drives. However, as semiconductor density grows, the distance between wires narrows, interference occurs more frequently, and the bit error rate increases. Such frequent errors and the short life cycle of NAND flash memory reduce the reliability of SSDs. In this paper, we present a reliability and performance enhancement technique for a new RAID system based on SSDs. First, we analyze the existing RAID mechanism in an SSD-array environment, and then develop a new RAID methodology adapted to SSD-array storage systems. Via trace-driven simulation, we evaluate the performance of the new optimized SSD-array storage using the RAID mechanism. The proposed method enhances the reliability of the SSD array by 2% and improves its I/O performance by 28% compared with the existing RAID system.
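The abstract does not detail the proposed RAID methodology, so the sketch below only illustrates the standard RAID-5-style parity mechanism that such SSD-array systems build on: a parity block is the bytewise XOR of the data blocks in a stripe, and any single failed block can be rebuilt from the survivors plus the parity.

```python
def parity(blocks):
    """XOR parity block across the data blocks of one stripe."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, b in enumerate(blk):
            out[i] ^= b
    return bytes(out)

def recover(surviving, parity_blk):
    """Rebuild the single failed block: XOR of survivors and parity."""
    return parity(surviving + [parity_blk])
```

Recovery works because XOR is its own inverse: XOR-ing the parity with all surviving blocks cancels them out, leaving exactly the missing block.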
Neurocomputing | 2008
Bumghi Choi; Ju-Hong Lee; Deok-Hwan Kim
Gradient descent algorithms such as backpropagation (BP) and its variations on multi-layered feed-forward networks are widely used in many applications. However, the most serious problem associated with BP is the local minima problem; in particular, an excessive number of hidden nodes deepens it. We propose an algorithm that shows stable training performance despite a large number of hidden nodes. This algorithm, called the separate learning algorithm, trains the hidden-to-output and input-to-hidden weights separately. Simulations on several benchmark problems demonstrate the validity of the proposed method.
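The separate-learning idea, training the two weight layers in distinct phases rather than jointly, might be sketched as below; the network size, learning rate, and phase schedule are illustrative assumptions, not the paper's configuration.

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W_in, w_out):
    """One hidden layer of sigmoid units, linear output."""
    h = [sigmoid(sum(wi * xi for wi, xi in zip(row, x))) for row in W_in]
    return sum(wo * hi for wo, hi in zip(w_out, h)), h

def mse(data, W_in, w_out):
    return sum((forward(x, W_in, w_out)[0] - y) ** 2 for x, y in data) / len(data)

def train_output_layer(data, W_in, w_out, lr=0.1, epochs=200):
    """Phase 1: adapt only the hidden-to-output weights; hidden layer frozen."""
    for _ in range(epochs):
        for x, y in data:
            out, h = forward(x, W_in, w_out)
            err = out - y
            for j in range(len(w_out)):
                w_out[j] -= lr * err * h[j]

def train_hidden_layer(data, W_in, w_out, lr=0.1, epochs=200):
    """Phase 2: adapt only the input-to-hidden weights; output layer frozen."""
    for _ in range(epochs):
        for x, y in data:
            out, h = forward(x, W_in, w_out)
            err = out - y
            for j, row in enumerate(W_in):
                g = err * w_out[j] * h[j] * (1 - h[j])  # chain rule through sigmoid
                for i in range(len(row)):
                    row[i] -= lr * g * x[i]
```

Phase 1 is a convex least-squares problem over fixed hidden features, which is one intuition for why alternating the phases can behave more stably than jointly descending the full non-convex BP landscape.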
conference on current trends in theory and practice of informatics | 2007
Sun Park; Ju-Hong Lee; Deok-Hwan Kim; Chan-Min Ahn
In this paper, a new summarization method, which uses non-negative matrix factorization (NMF) and K-means clustering, is introduced to extract meaningful sentences from multiple documents. The proposed method can improve the quality of document summaries because the inherent semantics of the documents are well reflected through the semantic features calculated by NMF, and the sentences most relevant to the given topic are extracted efficiently using the semantic variables derived by NMF. In addition, it uses K-means clustering to remove noise, so that biased inherent semantics of the documents are not reflected in the summaries. We perform detailed experiments with the well-known DUC test dataset. The experimental results demonstrate that the proposed method outperforms methods based on LSA, K-means, and NMF alone.
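The noise-removal step relies on ordinary K-means over sentence vectors; a compact sketch follows (the first-k-points initialization and the idea of treating a small outlying cluster as noise are simplified assumptions).

```python
def kmeans(points, k, iters=20):
    """Plain k-means; centroids seeded with the first k points."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # assignment step: nearest centroid by squared Euclidean distance
        for i, p in enumerate(points):
            assign[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2
                                              for a, b in zip(p, centroids[c])))
        # update step: move each centroid to the mean of its members
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return assign, centroids
```

In a summarizer of this kind, sentences landing in clusters far from the main topical clusters could then be excluded before sentence scoring, which is the "noise removal" role the abstract describes.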
Neurocomputing | 2009
Bumghi Choi; Ju-Hong Lee
Gradient descent algorithms such as backpropagation (BP) and its variations on multi-layered feed-forward networks are widely used in many applications, including the solution of differential equations. According to regularization theory, reformulated radial basis function networks (RBFNs) are expected to generalize more accurately than BP. We show how to apply both networks to a specific example of differential equations and compare their generalization capability and convergence. The experimental comparison of various approaches clarifies that the reformulated RBFN outperforms BP in solving this example.
Information Processing Letters | 1999
Ju-Hong Lee; Guang-Ho Cha; Chin-Wan Chung
A cost model for the performance of the k-nearest neighbor query in multidimensional data space is presented. Two concepts, the regional average volume and the density function, are introduced to predict the performance for uniform and non-uniform data distributions. Experiments show that predictions based on this model are accurate within an acceptable error range in low and medium dimensions.
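The flavor of such a cost model can be illustrated with the textbook uniform-distribution estimate, not the paper's regional-average-volume model: for n points uniform in the unit cube, the expected k-NN radius is the r at which a ball of radius r contains k expected points (boundary effects ignored).

```python
import math

def unit_ball_volume(d):
    """Volume of the d-dimensional unit ball: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def expected_knn_radius(n, k, d):
    """Radius r solving n * vol(ball(r)) = k for n uniform points in [0,1]^d."""
    return (k / (n * unit_ball_volume(d))) ** (1.0 / d)
```

A notable consequence, consistent with the abstract's caveat about low and medium dimensions, is that this radius grows quickly with d, so simple ball-volume models (and index pruning itself) degrade in high dimensions.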
rough sets and knowledge technology | 2009
Sang Ho Park; Ju-Hong Lee; Jae-Won Song; Tae-Su Park
Financial time series, such as stock prices, are noisy, volatile, and non-stationary, which causes uncertainty in forecasting them. To overcome this difficulty, we propose a new method that forecasts the change direction (up or down) of the next day's closing price of a financial time series using a continuous HMM. It classifies sliding-windowed stock prices into two categories (up or down) by the next day's price change direction, and then trains one HMM for each category. Experiments showed that our method effectively forecasts the change directions of financial time series with dynamic characteristics.
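The two-HMM classification scheme can be sketched with the forward algorithm for a Gaussian-emission (continuous) HMM: score a window of returns under both models and pick the label of the better-fitting one. The model parameters below are hand-picked stand-ins for the trained up/down models, not values from the paper.

```python
import math

def gauss(x, mu, sigma):
    """Gaussian emission density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def log_likelihood(obs, pi, A, mus, sigmas):
    """Forward algorithm with per-step rescaling; returns log P(obs | model)."""
    n = len(pi)
    alpha = [pi[s] * gauss(obs[0], mus[s], sigmas[s]) for s in range(n)]
    z = sum(alpha)
    ll = math.log(z)
    alpha = [a / z for a in alpha]
    for t in range(1, len(obs)):
        alpha = [sum(alpha[s] * A[s][j] for s in range(n))
                 * gauss(obs[t], mus[j], sigmas[j]) for j in range(n)]
        z = sum(alpha)
        ll += math.log(z)
        alpha = [a / z for a in alpha]  # rescale to avoid underflow
    return ll

def classify(returns, up_model, down_model):
    """Label a window 'up' if the up-trained HMM explains it at least as well."""
    return ('up' if log_likelihood(returns, *up_model)
            >= log_likelihood(returns, *down_model) else 'down')
```

In the full method each model's parameters would be fitted (e.g. by Baum-Welch) to the windows of its own category before classification; here they are fixed for illustration.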
2008 IEEE International Workshop on Semantic Computing and Applications | 2008
Sun Park; Ju-Hong Lee; Jae-Won Song
In this paper, a new automatic personalized summarization method, which uses non-negative matrix factorization (NMF) and a relevance measure (RM), is introduced to extract meaningful sentences from documents retrieved from the Internet. The proposed method can improve the quality of personalized summaries because the inherent semantics of the documents are well reflected through the semantic features calculated by NMF, and the sentences most relevant to the given query are extracted efficiently using the semantic variables derived by NMF. In addition, it uses the RM to build a generic summary, so that the selected sentences cover the major topics of the document. Experimental results on Yahoo-Korea News data show that the proposed method achieves better performance than the other methods.