Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Saehoon Kim is active.

Publication


Featured research published by Saehoon Kim.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2014

Predictive parallelization: taming tail latencies in web search

Myeongjae Jeon; Saehoon Kim; Seung-won Hwang; Yuxiong He; Sameh Elnikety; Alan L. Cox; Scott Rixner

Web search engines are optimized to reduce the high-percentile response time to consistently provide fast responses to almost all user queries. This is a challenging task because the query workload exhibits large variability, consisting of many short-running queries and a few long-running queries that significantly impact the high-percentile response time. With modern multicore servers, parallelizing the processing of an individual query is a promising solution to reduce query execution time, but it gives limited benefits compared to sequential execution since most queries see little or no speedup when parallelized. The root of this problem is that short-running queries, which dominate the workload, do not benefit from parallelization. They incur a large parallelization overhead, taking scarce resources from long-running queries. On the other hand, parallelization substantially reduces the execution time of long-running queries with low overhead and high parallelization efficiency. Motivated by these observations, we propose a predictive parallelization framework with two parts: (1) predicting long-running queries, and (2) selectively parallelizing them. For the first part, prediction should be accurate and efficient. For accuracy, we study a comprehensive feature set covering both term features (reflecting dynamic pruning efficiency) and query features (reflecting query complexity). For efficiency, to keep overhead low, we avoid expensive features that have excessive requirements such as large memory footprints. For the second part, we use the predicted query execution time to parallelize long-running queries and process short-running queries sequentially. We implement and evaluate the predictive parallelization framework in Microsoft Bing search. Our measurements show that under moderate to heavy load, the predictive strategy reduces the 99th-percentile response time by 50% (from 200 ms to 100 ms) compared with prior approaches that parallelize all queries.
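
To make the selective policy concrete, here is a minimal Python sketch of the decision logic described above, not the Bing implementation: a hypothetical predictor estimates execution time, short queries run sequentially, and predicted long-running queries are split across worker threads. The threshold, the toy predictor, and all helper names are illustrative assumptions.

```python
# Sketch of predictive parallelization: parallelize only predicted-long queries.
from concurrent.futures import ThreadPoolExecutor

THRESHOLD_MS = 20.0   # assumed cutoff separating short from long queries
POOL = ThreadPoolExecutor(max_workers=4)

def predict_execution_time_ms(query: str) -> float:
    """Hypothetical stand-in for a model trained on term features
    (dynamic pruning efficiency) and query features (complexity)."""
    return 5.0 * len(query.split())

def run_sequential(query: str) -> list:
    return [f"{query}:doc{i}" for i in range(3)]        # placeholder matching

def run_partition(arg) -> list:
    query, part = arg
    return [f"{query}:part{part}:doc0"]                 # placeholder matching

def process(query: str, degree: int = 4) -> list:
    # Short queries dominate the workload and gain little from parallelism,
    # so only predicted long-running queries pay the parallelization cost.
    if predict_execution_time_ms(query) < THRESHOLD_MS:
        return run_sequential(query)
    parts = POOL.map(run_partition, [(query, p) for p in range(degree)])
    return [hit for part in parts for hit in part]      # merge partial results

print(process("cat pictures"))                          # short -> sequential
print(process("long rare multi term web query here"))   # long -> parallel
```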


European Conference on Computer Vision | 2012

Sequential spectral learning to hash with multiple representations

Saehoon Kim; Yoonseop Kang; Seungjin Choi

Learning to hash involves learning hash functions from a set of images for embedding high-dimensional visual descriptors into a similarity-preserving low-dimensional Hamming space. Most existing methods resort to a single representation of images; that is, only one type of visual descriptor is used to learn a hash function that assigns binary codes to images. However, images are often described by multiple visual descriptors (such as SIFT, GIST, and HOG), so it is desirable to incorporate these multiple representations into learning a hash function, leading to multi-view hashing. In this paper we present a sequential spectral learning approach to multi-view hashing where a hash function is sequentially determined by solving successive maximizations of local variances subject to decorrelation constraints. We compute multi-view local variances by α-averaging view-specific distance matrices such that the best averaged distance matrix is determined by minimizing its α-divergence from the view-specific distance matrices. We also present a scalable implementation, exploiting a fast approximate k-NN graph construction method, in which α-averaged distances computed in small partitions determined by recursive spectral bisection are gradually merged in conquer steps until all examples are used. Numerical experiments on the Caltech-256, CIFAR-20, and NUS-WIDE datasets confirm the high performance of our method in comparison to single-view spectral hashing as well as existing multi-view hashing methods.
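
As an illustration of the two core steps, the following sketch α-averages view-specific distance matrices and then extracts binary codes from graph-Laplacian eigenvectors. It assumes the power-mean form f(x) = x^((1-α)/2) for the α-mean and replaces the paper's sequential decorrelated optimization with a plain spectral step; both simplifications are assumptions, not the authors' exact algorithm.

```python
# Sketch: alpha-average view-specific distances, then spectral binary codes.
import numpy as np

def alpha_average(distance_mats, alpha=0.5):
    """Element-wise alpha-mean of distance matrices (power-mean form assumed)."""
    if alpha == 1.0:                                  # limit case: geometric mean
        return np.exp(np.mean([np.log(D + 1e-12) for D in distance_mats], axis=0))
    p = (1.0 - alpha) / 2.0
    return np.mean([D ** p for D in distance_mats], axis=0) ** (1.0 / p)

def spectral_codes(D_avg, n_bits=8, sigma=1.0):
    """Binary codes from the bottom eigenvectors of the graph Laplacian."""
    W = np.exp(-D_avg ** 2 / (2 * sigma ** 2))        # affinity from distances
    L = np.diag(W.sum(1)) - W                         # unnormalized Laplacian
    _, vecs = np.linalg.eigh(L)
    Y = vecs[:, 1:n_bits + 1]                         # skip the trivial eigenvector
    return (Y > 0).astype(np.uint8)

rng = np.random.default_rng(0)
views = [rng.random((50, 8)) for _ in range(3)]       # three toy descriptors
dists = [np.linalg.norm(V[:, None] - V[None, :], axis=-1) for V in views]
print(spectral_codes(alpha_average(dists), n_bits=8).shape)   # (50, 8)
```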


International Conference on Data Mining | 2012

Deep Learning to Hash with Multiple Representations

Yoonseop Kang; Saehoon Kim; Seungjin Choi

Hashing seeks an embedding of high-dimensional objects into a similarity-preserving low-dimensional Hamming space such that similar objects are indexed by binary codes with small Hamming distances. A variety of hashing methods have been developed, but most of them resort to a single view (representation) of data. However, objects are often described by multiple representations. For instance, images are described by a few different visual descriptors (such as SIFT, GIST, and HOG), so it is desirable to incorporate multiple representations into hashing, leading to multi-view hashing. In this paper we present a deep network for multi-view hashing, referred to as deep multi-view hashing, where each layer of hidden nodes is composed of view-specific and shared hidden nodes, in order to learn individual and shared hidden spaces from multiple views of data. Numerical experiments on image datasets demonstrate the useful behavior of our deep multi-view hashing (DMVH), compared to a recently proposed multi-modal deep network as well as existing shallow hashing models.
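
Below is a forward-pass sketch of the layer structure described above, under the assumptions that hidden layers use a tanh nonlinearity and the top layer is binarized by sign; the layer sizes are arbitrary, and the training procedure is omitted entirely.

```python
# Sketch: each hidden layer concatenates view-specific and shared units.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, n_out):
    W = rng.standard_normal((x.shape[1], n_out)) * 0.1   # untrained toy weights
    return np.tanh(x @ W)

def dmvh_forward(views, n_specific=16, n_shared=16, n_bits=8):
    specific = [layer(v, n_specific) for v in views]         # view-specific units
    shared = layer(np.concatenate(views, axis=1), n_shared)  # shared units
    hidden = np.concatenate(specific + [shared], axis=1)
    return (layer(hidden, n_bits) > 0).astype(np.uint8)      # binarize top layer

sift, gist, hog = (rng.random((32, d)) for d in (128, 64, 96))
print(dmvh_forward([sift, gist, hog]).shape)                 # (32, 8)
```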


Web Search and Data Mining | 2015

Delayed-Dynamic-Selective (DDS) Prediction for Reducing Extreme Tail Latency in Web Search

Saehoon Kim; Yuxiong He; Seung-won Hwang; Sameh Elnikety; Seungjin Choi

A commercial web search engine shards its index among many servers, and therefore the response time of a search query is dominated by the slowest server that processes the query. Prior approaches target improving responsiveness by reducing the tail latency of an individual search server. They predict query execution time, and if a query is predicted to be long-running, it runs in parallel; otherwise, it runs sequentially. These approaches are, however, not accurate enough for reducing a high tail latency when responses are aggregated from many servers because this requires each server to reduce a substantially higher tail latency (e.g., the 99.99th percentile), which we call extreme tail latency. We propose a prediction framework to reduce the extreme tail latency of search servers. The framework has a unique set of characteristics to predict long-running queries with high recall and improved precision. Specifically, prediction is delayed by a short duration to allow many short-running queries to complete without parallelization, and to allow the predictor to collect a set of dynamic features using runtime information. These features estimate query execution time with high accuracy. We also use them to estimate the prediction error, overriding an uncertain prediction by selectively accelerating the query for a higher recall. We evaluate the proposed prediction framework to improve search engine performance in two scenarios using a simulation study: (1) query parallelization on a multicore processor, and (2) query scheduling on a heterogeneous processor. The results show that, for both scenarios, the proposed framework is effective in reducing the extreme tail latency compared to a state-of-the-art predictor because of its higher recall, and it improves server throughput by more than 70% because of its improved precision.
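
The following is a schematic of the delayed-dynamic-selective decision rule as described in the abstract; the delay length, thresholds, and the dynamic-feature predictor are hypothetical placeholders.

```python
# Sketch of DDS: delay, predict from dynamic features, selectively override.
DELAY_MS = 5.0         # assumed delay before predicting
LONG_MS = 50.0         # assumed long-query threshold
MAX_UNCERTAINTY = 0.3  # assumed cutoff for overriding an uncertain prediction

def predict_with_dynamic_features(query, progress):
    """Stand-in: returns (predicted total ms, estimated prediction error)."""
    est = progress["docs_scored"] * 0.01 + 10.0
    return est, 0.1

def dds_schedule(query, true_cost_ms, progress):
    if true_cost_ms <= DELAY_MS:
        return "sequential"                  # finished within the delay window
    est, err = predict_with_dynamic_features(query, progress)
    if est >= LONG_MS or err >= MAX_UNCERTAINTY:
        return "parallel"                    # long or uncertain: accelerate
    return "sequential"

print(dds_schedule("q1", 3.0, {"docs_scored": 0}))       # sequential
print(dds_schedule("q2", 80.0, {"docs_scored": 9000}))   # parallel
```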


International Conference on Acoustics, Speech, and Signal Processing | 2013

Multi-view anchor graph hashing

Saehoon Kim; Seungjin Choi

Multi-view hashing seeks compact integrated binary codes that preserve similarities averaged over multiple representations of objects. Most existing multi-view hashing methods resort to linear hash functions, where the data manifold is not considered. In this paper we present multi-view anchor graph hashing (MVAGH), where nonlinear integrated binary codes are efficiently determined by a subset of eigenvectors of an averaged similarity matrix. The efficiency behind MVAGH is due to a low-rank form of the averaged similarity matrix induced by the multi-view anchor graph, where the similarity between two points is measured by the two-step transition probability through view-specific anchor (i.e., landmark) points. In addition, we observe that MVAGH suffers from performance degradation when high recall is required. To overcome this drawback, we propose a simple heuristic that combines MVAGH with locality-sensitive hashing (LSH). Numerical experiments on the CIFAR-10 dataset confirm that MVAGH(+LSH) outperforms the existing multi- and single-view hashing methods.
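
A sketch of the anchor-graph machinery behind this kind of method: each point is linked to its few nearest anchors, giving a sparse matrix Z, and the low-rank similarity S = Z Λ⁻¹ Zᵀ lets the top eigenvectors be computed from a small m×m eigenproblem. The fixed view-averaging weights below are an assumption (the paper determines the combination), and the LSH fallback is omitted.

```python
# Sketch: averaged anchor graph -> small eigenproblem -> binary codes.
import numpy as np

def anchor_graph(X, anchors, s=3):
    """Sparse affinities from each point to its s nearest anchors (Z, n x m)."""
    d2 = ((X[:, None] - anchors[None]) ** 2).sum(-1)
    Z = np.zeros_like(d2)
    idx = np.argsort(d2, axis=1)[:, :s]
    for i, cols in enumerate(idx):
        w = np.exp(-d2[i, cols])
        Z[i, cols] = w / w.sum()                    # row-stochastic weights
    return Z

def mvagh_codes(views, anchor_sets, weights, n_bits=8):
    # Fixed-weight average of view-specific anchor graphs (an assumption).
    Z = sum(w * anchor_graph(X, A) for w, X, A in zip(weights, views, anchor_sets))
    lam = Z.sum(0)                                  # anchor degrees (Lambda)
    Zn = Z / np.sqrt(lam)                           # Z Lambda^{-1/2}
    _, vecs = np.linalg.eigh(Zn.T @ Zn)             # small m x m eigenproblem
    V = vecs[:, -n_bits - 1:-1]                     # top eigenvectors, trivial one dropped
    return (Zn @ V > 0).astype(np.uint8)

rng = np.random.default_rng(0)
views = [rng.random((200, 10)) for _ in range(2)]
anchors = [v[rng.choice(200, 20, replace=False)] for v in views]
print(mvagh_codes(views, anchors, weights=[0.5, 0.5]).shape)   # (200, 8)
```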


Computer Vision and Pattern Recognition | 2015

Bilinear random projections for locality-sensitive binary codes

Saehoon Kim; Seungjin Choi

Locality-sensitive hashing (LSH) is a popular data-independent indexing method for approximate similarity search, where random projections followed by quantization hash the points from the database so as to ensure that the probability of collision is much higher for objects that are close to each other than for those that are far apart. Most high-dimensional visual descriptors for images exhibit a natural matrix structure. When visual descriptors are represented by high-dimensional feature vectors and long binary codes are assigned, a random projection matrix incurs high costs in both space and time. In this paper we analyze a bilinear random projection method where feature matrices are transformed to binary codes by two smaller random projection matrices. We base our theoretical analysis on extending Raginsky and Lazebnik's result, where random Fourier features are composed with random binary quantizers to form locality-sensitive binary codes. To this end, we answer the following two questions: (1) whether a bilinear random projection also yields similarity-preserving binary codes; and (2) whether a bilinear random projection yields a performance gain or loss compared to a large linear projection. Regarding the first question, we present upper and lower bounds on the expected Hamming distance between binary codes produced by bilinear random projections. Regarding the second question, we analyze upper and lower bounds on the covariance between two bits of binary codes, showing that the correlation between two bits is small. Numerical experiments on the MNIST and Flickr45K datasets confirm the validity of our method.
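
A small numeric sketch contrasting the two projections the paper compares: for a d1 x d2 feature matrix and c = c1*c2 bits, the bilinear code sign(R1ᵀ X R2) stores d1*c1 + d2*c2 Gaussian parameters instead of the d1*d2*c required by a full linear projection on the vectorized matrix. The concrete sizes below are arbitrary.

```python
# Sketch: bilinear vs. full linear random projection for LSH-style codes.
import numpy as np

rng = np.random.default_rng(0)
d1, d2, c1, c2 = 64, 96, 8, 8                 # e.g., a 64x96 descriptor, 64 bits

X = rng.random((d1, d2))                      # one feature matrix
R1 = rng.standard_normal((d1, c1))
R2 = rng.standard_normal((d2, c2))
bilinear_code = np.sign(R1.T @ X @ R2).flatten()   # c1*c2 = 64 bits

R = rng.standard_normal((d1 * d2, c1 * c2))   # full linear projection
linear_code = np.sign(X.flatten() @ R)        # also 64 bits

print(bilinear_code.size, linear_code.size)   # 64 64: same code length
print(R1.size + R2.size, "vs", R.size)        # 1280 vs 393216 parameters
```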


ACM Transactions on the Web | 2016

Prediction and Predictability for Search Query Acceleration

Seung-won Hwang; Saehoon Kim; Yuxiong He; Sameh Elnikety; Seungjin Choi

A commercial web search engine shards its index among many servers, and therefore the response time of a search query is dominated by the slowest server that processes the query. Prior approaches target improving responsiveness by reducing the tail latency, or high-percentile response time, of an individual search server. They predict query execution time, and if a query is predicted to be long-running, it runs in parallel; otherwise, it runs sequentially. These approaches are, however, not accurate enough for reducing a high tail latency when responses are aggregated from many servers because this requires each server to reduce a substantially higher tail latency (e.g., the 99.99th percentile), which we call extreme tail latency. To address the tighter requirements of extreme tail latency, we propose a new design space for the problem, subsuming existing work and opening a new solution space. Existing work makes a prediction using features available at indexing time and focuses on optimizing prediction features for accelerating tail queries. In contrast, we identify "when to predict?" as another key optimization question. This opens up a new solution of delaying a prediction by a short duration to allow many short-running queries to complete without parallelization and, at the same time, to allow the predictor to collect a set of dynamic features using runtime information. This new question expands the solution space in two meaningful ways. First, we see a significant reduction of tail latency by leveraging "dynamic" features collected at runtime that estimate query execution time with higher accuracy. Second, we can ask whether to override a prediction when its "predictability" is low. We show that considering predictability accelerates queries by achieving a higher recall. With this prediction, we propose to accelerate the queries that are predicted to be long-running. In our preliminary work, we focused on parallelization as an acceleration scenario. Here, we extend it to consider heterogeneous multicore hardware for acceleration. This hardware combines processor cores with different microarchitectures, such as energy-efficient little cores and high-performance big cores, and accelerating web search using this hardware has remained an open problem. We evaluate the proposed prediction framework in two scenarios: (1) query parallelization on a multicore processor and (2) query scheduling on a heterogeneous processor. Our extensive evaluation results show that, for both scenarios of query acceleration using parallelization and heterogeneous cores, the proposed framework is effective in reducing the extreme tail latency compared to a state-of-the-art predictor because of its higher recall, and it improves server throughput by more than 70% because of its improved precision.
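
To illustrate the second acceleration scenario, here is a sketch of how the delayed prediction could drive scheduling on heterogeneous cores: predicted long-running queries, and queries whose prediction looks unreliable, go to big cores, while the rest go to little cores. The thresholds and the toy predictor are assumptions.

```python
# Sketch: predictability-aware scheduling onto big/little cores.
LONG_MS, MAX_ERR = 50.0, 0.3   # assumed long-query and uncertainty cutoffs

def schedule_on_heterogeneous_cores(queries, predictor):
    big, little = [], []
    for q in queries:
        est_ms, err = predictor(q)
        # Low predictability (high err) also routes to a big core.
        (big if est_ms >= LONG_MS or err >= MAX_ERR else little).append(q)
    return big, little

def toy_predictor(q):           # hypothetical delayed predictor
    return 10.0 * len(q.split()), 0.1

big, little = schedule_on_heterogeneous_cores(
    ["short one", "a much longer and more complex query"], toy_predictor)
print(big, little)              # long query on big cores, short on little
```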


Workshop on Applications of Computer Vision | 2015

Near Duplicate Image Discovery on One Billion Images

Saehoon Kim; Xin-Jing Wang; Lei Zhang; Seungjin Choi

Near-duplicate image discovery is the task of detecting all clusters of images that are duplicated over a significant region. Previous work generally takes a divide-and-conquer approach composed of two steps: generating cluster seeds using min-hashing, and growing the seeds by searching the entire image space with the seeds as queries. Since the computational complexity of the seed-growing step is generally O(NL), where N and L are the numbers of images and seeds respectively, existing work can hardly be scaled up to a billion-scale dataset because L is typically in the millions. In this paper, we study a feasible solution for near-duplicate image discovery on one billion images, which is easily implemented on the MapReduce framework. The major contribution of this work is a seed-growing step designed to efficiently reduce the number of false positives among cluster seeds with O(cNL) time complexity, where c is small enough for a billion-scale dataset. The basic component of the seed-growing step is a bottom-k min-hash, which generates different signatures in a sketch to remove all candidate images that share only one common visual word with a cluster seed. Our evaluations suggest that the proposed method can discover near-duplicate clusters with high precision and recall, and reveal some interesting properties of our one-billion-image dataset.
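
A sketch of the bottom-k min-hash filter at the heart of the seed-growing step: each image keeps the k smallest hash values of its visual words as a sketch, so a candidate sharing only one visual word with a seed rarely matches enough of the signature to survive. The hash function, k, and the overlap rule are assumptions; the MapReduce plumbing is omitted.

```python
# Sketch: bottom-k min-hash sketches for pruning weak near-duplicate candidates.
import hashlib

def bottom_k_sketch(visual_words, k=4, seed=0):
    hashed = sorted(
        int(hashlib.md5(f"{seed}:{w}".encode()).hexdigest(), 16)
        for w in visual_words)
    return tuple(hashed[:k])                     # k smallest hash values

def likely_near_duplicates(words_a, words_b, k=4, min_overlap=2):
    a, b = bottom_k_sketch(words_a, k), bottom_k_sketch(words_b, k)
    return len(set(a) & set(b)) >= min_overlap   # assumed overlap rule

seed_img = {101, 202, 303, 404, 505}
dup_img = {101, 202, 303, 404, 999}              # shares four visual words
noise_img = {101, 1, 2, 3, 4}                    # shares only one visual word
print(likely_near_duplicates(seed_img, dup_img))    # True (with high probability)
print(likely_near_duplicates(seed_img, noise_img))  # False: at most one shared value
```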


International Workshop on Machine Learning for Signal Processing | 2010

Local dimensionality reduction for multiple instance learning

Saehoon Kim; Seungjin Choi

Multiple instance learning involves labeling bags (sets of instances) rather than individual instances. Positive bags contain both true positive and false positive instances, leading to label ambiguity, while negative bags consist of only true negative instances. Since labels for individual instances are not known, a direct application of existing discriminant analysis or dimensionality reduction methods often yields an undesirable projection direction due to this label ambiguity in positive bags. In this paper we present citation local Fisher discriminant analysis (CLFDA), where we incorporate both citation and reference information into local Fisher discriminant analysis in order to detect false positive instances, whose labels are then corrected to negative. To the best of our knowledge, CLFDA is the first attempt at supervised dimensionality reduction for multiple instance learning. Numerical experiments on several benchmark datasets confirm that CLFDA outperforms existing methods in the task of multiple instance learning.
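
A rough sketch of the citation/reference idea for flagging false positives: an instance in a positive bag whose nearest neighbors ("references") and the points that count it among their nearest neighbors ("citers") are mostly from negative bags gets relabeled negative before dimensionality reduction. The neighborhood size and the majority rule are assumptions, and the local Fisher discriminant analysis projection itself is omitted.

```python
# Sketch: relabel likely false positives in positive bags via citers/references.
import numpy as np

def relabel_false_positives(X, bag_labels, k=5):
    """bag_labels: +1 for instances in positive bags, -1 otherwise."""
    d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    refs = np.argsort(d2, axis=1)[:, :k]               # "references" of each point
    labels = bag_labels.copy()
    for i in np.where(bag_labels == 1)[0]:
        citers = np.where((refs == i).any(axis=1))[0]  # points citing i
        votes = np.concatenate([bag_labels[refs[i]], bag_labels[citers]])
        if (votes == -1).mean() > 0.5:                 # mostly negative context
            labels[i] = -1                             # flagged false positive
    return labels

rng = np.random.default_rng(0)
neg = rng.normal(0, 1, (30, 2))
pos = rng.normal(4, 1, (10, 2))
fp = rng.normal(0, 1, (3, 2))                  # false positives hiding among negatives
X = np.vstack([neg, pos, fp])
y = np.array([-1] * 30 + [1] * 13)
print(relabel_false_positives(X, y)[-3:])      # likely [-1 -1 -1]
```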


International Conference on Data Mining | 2012

Hashing with Generalized Nyström Approximation

Jeong-Min Yun; Saehoon Kim; Seungjin Choi

Hashing, which involves learning binary codes to embed high-dimensional data into a similarity-preserving low-dimensional Hamming space, is often formulated as linear dimensionality reduction followed by binary quantization. Linear dimensionality reduction, based on the maximum-variance formulation, requires the leading eigenvectors of the data covariance or graph Laplacian matrix. Computing leading singular vectors or eigenvectors in the case of high dimension and large sample size is a main bottleneck in most data-driven hashing methods. In this paper we address the use of the generalized Nyström method, where a subset of rows and columns is used to approximately compute the leading singular vectors of the data matrix, in order to improve the scalability of hashing methods for high-dimensional data with large sample size. In particular, we validate the useful behavior of generalized Nyström approximation with uniform sampling in the case of a recently developed hashing method based on principal component analysis (PCA) followed by an iterative quantization, referred to as PCA+ITQ, developed by Gong and Lazebnik. We compare the performance of generalized Nyström approximation with uniform and non-uniform sampling to the full singular value decomposition (SVD) method, confirming that uniform sampling improves the computational and space complexities dramatically while sacrificing little performance. In addition, we present low-rank approximation error bounds for generalized Nyström approximation with uniform sampling, which is not a trivial extension of available results for the non-uniform sampling case.
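
A sketch of generalized Nyström approximation with uniform sampling: sample row and column subsets of the data matrix and approximate A ≈ C W⁺ R, from which approximate leading singular vectors can be extracted for the PCA step of a hashing pipeline. The sample sizes below are arbitrary, and the ITQ rotation that follows in PCA+ITQ is omitted.

```python
# Sketch: generalized Nystrom (CUR-style) approximation with uniform sampling.
import numpy as np

def generalized_nystrom(A, n_rows, n_cols, rng):
    rows = rng.choice(A.shape[0], n_rows, replace=False)
    cols = rng.choice(A.shape[1], n_cols, replace=False)
    C, R = A[:, cols], A[rows, :]                 # sampled columns and rows
    W = A[np.ix_(rows, cols)]                     # their intersection
    return C @ np.linalg.pinv(W) @ R              # low-rank approximation of A

rng = np.random.default_rng(0)
U = rng.standard_normal((500, 10))
A = U @ rng.standard_normal((10, 200))            # exactly rank-10 data matrix
A_hat = generalized_nystrom(A, n_rows=40, n_cols=40, rng=rng)
err = np.linalg.norm(A - A_hat) / np.linalg.norm(A)
print(f"relative error: {err:.2e}")               # near zero when rank is captured
```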

Collaboration


Dive into Saehoon Kim's collaborations.

Top Co-Authors

Seungjin Choi

Pohang University of Science and Technology

Juho Lee

University of Oxford

Hae Beom Lee

Ulsan National Institute of Science and Technology

Yoonseop Kang

Pohang University of Science and Technology

Bohyung Han

Pohang University of Science and Technology
