Publication


Featured research published by Beomseok Nam.


grid computing | 2006

Resource Discovery Techniques in Distributed Desktop Grid Environments

Jik-Soo Kim; Beomseok Nam; Peter J. Keleher; Michael A. Marsh; Bobby Bhattacharjee; Alan Sussman

Desktop grids use opportunistic sharing to exploit large collections of personal computers and workstations across the Internet, achieving tremendous computing power at low cost. Traditional desktop grid systems are typically based on a client-server architecture, which has inherent shortcomings with respect to robustness, reliability and scalability. In this paper, we propose a decentralized, robust, highly available, and scalable infrastructure to match incoming jobs to available resources. Through a comparative analysis on the experimental results obtained via simulation of three different types of matchmaking algorithms under different workload scenarios, we show the trade-offs between efficient matchmaking and good load balancing in a fully decentralized, heterogeneous computational environment.


international parallel and distributed processing symposium | 2007

Creating a Robust Desktop Grid using Peer-to-Peer Services

Jik-Soo Kim; Beomseok Nam; Michael A. Marsh; Peter J. Keleher; Bobby Bhattacharjee; Derek C. Richardson; Dennis D. Wellnitz; Alan Sussman

The goal of the work described in this paper is to design and build a scalable infrastructure for executing grid applications on a widely distributed set of resources. Such grid infrastructure must be decentralized, robust, highly available, and scalable, while efficiently mapping application instances to available resources in the system. However, current desktop grid computing platforms are typically based on a client-server architecture, which has inherent shortcomings with respect to robustness, reliability and scalability. Fortunately, these problems can be addressed through the capabilities promised by new techniques and approaches in peer-to-peer (P2P) systems. By employing P2P services, our system allows users to submit jobs to be run in the system and to run jobs submitted by other users on any resources available in the system, essentially allowing a group of users to form an ad-hoc set of shared resources. The initial target application areas for the desktop grid system are in astronomy and space science simulation and data analysis.


Journal of Parallel and Distributed Computing | 2010

Multiple query scheduling for distributed semantic caches

Beomseok Nam; Minho Shin; Henrique Andrade; Alan Sussman

In distributed query processing systems, load balancing plays an important role in maximizing system throughput. When queries can leverage cached intermediate results, improving the cache hit ratio becomes as important as load balancing in query scheduling, especially when dealing with computationally expensive queries. The scheduling policies must be designed to take into consideration the dynamic contents of the distributed caching infrastructure. In this paper, we propose and discuss several distributed query scheduling policies that directly consider the available cache contents by employing distributed multidimensional indexing structures and an exponential moving average approach to predicting cache contents. These approaches are shown to produce better query plans and faster query response times than traditional scheduling policies that do not predict dynamic contents in distributed caches. We experimentally demonstrate the utility of the scheduling policies using MQO, which is a distributed, Grid-enabled, multiple query processing middleware system we developed to optimize query processing for data analysis and visualization applications.
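
The exponential moving average idea mentioned in this abstract can be illustrated with a small sketch. This is not the paper's scheduling policy: the function names, the `alpha` weight, and the use of a single query center point per query are all assumptions made for illustration. Each server keeps a running average of the query points it has recently handled, and a new query is sent to the server whose average is closest, on the guess that its cache is most likely to hold reusable results.

```python
from math import dist

def update_ema(ema, point, alpha=0.3):
    """Blend a new query point into a running average (alpha is an assumed weight)."""
    return tuple(alpha * p + (1 - alpha) * e for p, e in zip(point, ema))

def schedule(query_center, server_emas, alpha=0.3):
    """Pick the server whose moving average is nearest to the query, then update it.

    server_emas maps a hypothetical server name to its current average point.
    """
    best = min(server_emas, key=lambda s: dist(server_emas[s], query_center))
    server_emas[best] = update_ema(server_emas[best], query_center, alpha)
    return best

servers = {"a": (0.0, 0.0), "b": (10.0, 10.0)}
chosen = schedule((1.0, 1.0), servers)
```

A larger `alpha` makes the average follow recent queries more closely, which models a cache whose contents turn over quickly.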


cluster computing and the grid | 2005

Spatial indexing of distributed multidimensional datasets

Beomseok Nam; Alan Sussman

While declustering methods for distributed multidimensional indexing of large datasets have been researched widely in the past, replication techniques for multidimensional indexes have not been investigated deeply. In general, a centralized index server may become the performance bottleneck in a wide area network rather than the data servers, since the index is likely to be accessed more often than any of the datasets in the servers. In this paper, we present two different multidimensional indexing algorithms for a distributed environment - a centralized global index and a two-level hierarchical index. Our experimental results show that the centralized scheme does not scale well for either insertion or searching the index. In order to improve the scalability of the index server, we have employed a replication protocol for both the centralized and two-level index schemes that allows some inconsistency between replicas without affecting correctness. Our experiments show that the two-level hierarchical index scheme shows better scalability for both building and searching the index than the non-replicated centralized index, but replication can make the centralized index faster than the two-level hierarchical index for searching in some cases.


statistical and scientific database management | 2004

A comparative study of spatial indexing techniques for multidimensional scientific datasets

Beomseok Nam; Alan Sussman

Scientific applications that query into very large multidimensional datasets are becoming more common. These datasets are growing in size every day, and are becoming truly enormous, making it infeasible to index individual data elements. We have instead been experimenting with chunking the datasets to index them, grouping data elements into small chunks of a fixed, but dataset-specific, size to take advantage of spatial locality. While spatial indexing structures based on R-trees perform reasonably well for the rectangular bounding boxes of such chunked datasets, other indexing structures based on KDB-trees, such as Hybrid trees, have been shown to perform very well for point data. In this paper, we investigate how all these indexing structures perform for multidimensional scientific datasets, and compare their features and performance with that of SH-trees, an extension of Hybrid trees, for indexing multidimensional rectangles. Our experimental results show that the algorithms for building and searching SH-trees outperform those for R-trees, R*-trees, and X-trees for both real application and synthetic datasets and queries. We show that the SH-tree algorithms perform well for both low and high dimensional data, and that they scale well to high dimensions both for building and searching the trees.
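
The chunking approach described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the SH-tree or any index structure from the paper: the helper names are hypothetical, and the bounding boxes are scanned linearly instead of being stored in a tree. Points are grouped into fixed-size chunks in stored order, each chunk is summarized by its rectangular bounding box, and a range query only needs to read the chunks whose boxes overlap the query rectangle.

```python
def chunk_boxes(points, chunk_size):
    """Split points (in stored order) into chunks and return each chunk's bounding box."""
    boxes = []
    for i in range(0, len(points), chunk_size):
        chunk = points[i:i + chunk_size]
        lo = tuple(min(coords) for coords in zip(*chunk))  # per-dimension minimum
        hi = tuple(max(coords) for coords in zip(*chunk))  # per-dimension maximum
        boxes.append((lo, hi))
    return boxes

def overlaps(box, q_lo, q_hi):
    """True if the box intersects the query rectangle [q_lo, q_hi] in every dimension."""
    lo, hi = box
    return all(l <= qh and h >= ql for l, h, ql, qh in zip(lo, hi, q_lo, q_hi))

points = [(0, 0), (1, 1), (5, 5), (6, 7)]
boxes = chunk_boxes(points, chunk_size=2)
hits = [i for i, b in enumerate(boxes) if overlaps(b, (4, 4), (6, 6))]
```

Indexing one rectangle per chunk instead of one entry per data element is what keeps the index small for very large datasets.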


international parallel and distributed processing symposium | 2006

DiST: fully decentralized indexing for querying distributed multidimensional datasets

Beomseok Nam; Alan Sussman

Grid computing and peer-to-peer (P2P) systems are emerging as new paradigms for managing large scale distributed resources across wide area networks. While grid computing focuses on managing heterogeneous resources and relies on centralized managers for resource and data discovery, P2P systems target scalable, decentralized methods for publishing and searching for data. In large distributed systems, a centralized resource manager is a potential performance bottleneck and decentralization can help avoid this bottleneck, as is done in P2P systems. However, the query functionality provided by most existing P2P systems is very rudimentary, and is not directly applicable to grid resource management. In this paper, we propose a fully decentralized multidimensional indexing structure, called DiST, that operates in a fully distributed environment with no centralized control. In DiST, each data server only acquires information about data on other servers from executing and routing queries. We describe the DiST algorithms for maintaining the decentralized network of data servers, including adding and deleting servers, the query routing algorithm, and failure recovery algorithms. We also evaluate the performance of the decentralized scheme against a more structured hierarchical indexing scheme that we have previously shown to perform well in distributed grid environments.


cluster computing and the grid | 2003

Improving access to multi-dimensional self-describing scientific datasets

Beomseok Nam; Alan Sussman

Applications that query into very large multidimensional datasets are becoming more common. Many self-describing scientific data file formats have also emerged, which have structural metadata to help navigate the multi-dimensional arrays that are stored in the files. The files may also contain application-specific semantic metadata. In this paper, we discuss efficient methods for performing searches for subsets of multi-dimensional data objects, using semantic information to build multidimensional indexes, and group data items into properly sized chunks to maximize disk I/O bandwidth. This work is the first step in the design and implementation of a generic indexing library that will work with various high-dimension scientific data file formats containing semantic information about the stored data. To validate the approach, we have implemented indexing structures for NASA remote sensing data stored in the HDF format with a specific schema (HDF-EOS), and show the performance improvements that are gained from indexing the datasets, compared to using the existing HDF library for accessing the data.


Future Generation Computer Systems | 2008

Trade-offs in matching jobs and balancing load for distributed desktop grids

Jik-Soo Kim; Beomseok Nam; Peter J. Keleher; Michael A. Marsh; Bobby Bhattacharjee; Alan Sussman

Desktop grids can achieve tremendous computing power at low cost through opportunistic sharing of resources. However, traditional client-server Grid architectures do not deal with all types of failures, and do not always cope well with very dynamic environments. This paper describes the design of a desktop grid implemented over a modified Peer-to-Peer (P2P) architecture. The underlying P2P system is decentralized and inherently adaptable, giving the Grid robustness, scalability, and the ability to cope with dynamic environments, while still efficiently mapping application instances to available resources throughout the system. We use simulation to compare three different types of matching algorithms under differing workloads. Overall, the P2P approach produces significantly lower wait times than prior approaches, while adapting efficiently to the dynamic environment.


conference on high performance computing (supercomputing) | 2006

Multiple range query optimization with distributed cache indexing

Beomseok Nam; Henrique Andrade; Alan Sussman

MQO is a distributed multiple query processing middleware that can use resources available on the grid to optimize query processing for data analysis and visualization applications. It does so by introducing one or more proxies that act as front-ends to a collection of backend servers. The basic idea behind this architecture is active semantic caching, whereby queries can leverage available cached results in the proxy either directly or through transformations. While this approach has been shown to speed up query evaluation under multi-client workloads, the caching infrastructure in the backend servers is not used well for query processing. Because this collective caching infrastructure scales with the number of servers, it is an important asset. In this paper, we describe a distributed multidimensional indexing scheme that enables the proxy to directly consider the cache contents available at the backend servers for query planning and scheduling. This approach is shown to produce better query plans and faster query response times, as we experimentally demonstrate.


grid computing | 2008

Integrating categorical resource types into a P2P desktop grid system

Jik-Soo Kim; Beomseok Nam; Michael A. Marsh; Peter J. Keleher; Bobby Bhattacharjee; Alan Sussman

We describe and evaluate a set of protocols that implement a distributed, decentralized desktop grid. Incoming jobs are matched with system nodes through proximity in an N-dimensional resource space. This work improves on prior work by (1) efficiently accommodating node and job characterizations that include both continuous and categorical resource types, and (2) scaling gracefully to large system sizes even with highly non-uniform distributions of job and node types. We use extensive simulation results to show that the resulting system handles both continuous and categorical constraints efficiently, and that the new scalability techniques are effective.
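
Matching by proximity in a resource space with both continuous and categorical types can be pictured with a toy, centralized sketch. The protocols in the paper are distributed and decentralized, so this is only an illustration of the matching criterion, under assumptions of my own: the field names `os`, `caps`, and `req` are hypothetical, a node qualifies only if its categorical type equals the job's and every continuous capability meets the requirement, and the nearest qualifying node is chosen so that larger nodes stay free for more demanding jobs.

```python
from math import dist

def match(job, nodes):
    """Return the qualifying node closest to the job's requirements, or None."""
    candidates = [
        n for n in nodes
        if n["os"] == job["os"]                                   # categorical constraint
        and all(c >= r for c, r in zip(n["caps"], job["req"]))    # continuous constraints
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda n: dist(n["caps"], job["req"]))

nodes = [
    {"name": "n1", "os": "linux", "caps": (2.0, 4.0)},    # (CPU GHz, memory GB), assumed units
    {"name": "n2", "os": "linux", "caps": (3.0, 16.0)},
    {"name": "n3", "os": "windows", "caps": (3.0, 8.0)},
]
job = {"os": "linux", "req": (2.5, 8.0)}
chosen = match(job, nodes)
```

Here `n1` is rejected for insufficient capability and `n3` for its categorical type, leaving `n2` as the match.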
