Jinbeom Kang | Researchain

Publication

Featured research published by Jinbeom Kang.

IEEE Transactions on Consumer Electronics | 2010

Repetition-based web page segmentation by detecting tag patterns for small-screen devices

Jinbeom Kang; Jaeyoung Yang; Joongmin Choi

Web page segmentation into logical blocks is an important preprocessing step for recognizing informative content blocks in a page that leads to efficient information extraction and convenient display on the devices with small-sized screens. Previous methods for Web page segmentation are not flexible in a dynamic Web environment because they largely relied on heuristic rules generated by exploiting structural tags and visual information inherent in a page. To resolve this problem, this paper proposes a new method of Web page segmentation by recognizing repetitive tag patterns called key patterns in the DOM tree structure of a page. We report on the Repetition-based Page Segmentation (REPS) algorithm, which detects key patterns in a page and generates virtual nodes to correctly segment nested blocks. A series of experiments performed for real Web sites showed that REPS greatly contributes to improving the correctness of Web page segmentation.
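The idea of grouping a page by repeated tag patterns can be illustrated with a toy sketch. This is not the REPS algorithm itself, whose details the abstract does not give; it only assumes that the children of one DOM node are available as a flat list of tag names. The function names `find_key_pattern` and `group_into_blocks` are hypothetical, and the nested lists merely stand in for the "virtual nodes" mentioned above.

```python
from typing import List, Optional, Tuple

Pattern = Tuple[Tuple[str, ...], int, int]  # (pattern, start index, repeat count)


def find_key_pattern(tags: List[str], min_repeats: int = 2) -> Optional[Pattern]:
    """Find the consecutively repeated tag pattern that covers the most tags.

    Shorter patterns are tried first, so on a tie the shortest pattern wins.
    """
    best: Optional[Pattern] = None
    best_cover = 0
    n = len(tags)
    for size in range(1, n // min_repeats + 1):
        for start in range(0, n - size * min_repeats + 1):
            pattern = tuple(tags[start:start + size])
            repeats = 1
            pos = start + size
            # Count how many times the pattern repeats back to back.
            while tuple(tags[pos:pos + size]) == pattern:
                repeats += 1
                pos += size
            cover = repeats * size
            if repeats >= min_repeats and cover > best_cover:
                best, best_cover = (pattern, start, repeats), cover
    return best


def group_into_blocks(tags: List[str]) -> List[List[str]]:
    """Wrap each repetition of the key pattern in its own block (a stand-in virtual node)."""
    found = find_key_pattern(tags)
    if found is None:
        return [[t] for t in tags]
    pattern, start, repeats = found
    size = len(pattern)
    blocks = [[t] for t in tags[:start]]
    for i in range(repeats):
        blocks.append(list(tags[start + i * size:start + (i + 1) * size]))
    blocks.extend([t] for t in tags[start + repeats * size:])
    return blocks


children = ["h2", "div", "p", "div", "p", "div", "p", "hr"]
blocks = group_into_blocks(children)
```

Here the repeated `div`, `p` pair is treated as the key pattern, so each pair becomes one block while the heading and the rule stay on their own.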

web intelligence | 2006

Topic-Specific Web Content Adaptation to Mobile Devices

Eunshil Lee; Jinbeom Kang; Joongmin Choi; Jaeyoung Yang

Mobile content adaptation is a technology for effectively presenting content originally built for the desktop PC on wireless mobile devices. Previous approaches to Web content adaptation are mostly device-dependent. Also, the content transformation to suit a smaller device is done manually. As a result, the user has difficulty selecting relevant information from a large volume of content, since the context information related to the content is not provided. To resolve these problems, this paper proposes an enhanced method of Web content adaptation for mobile devices. In our system, the process of Web content adaptation consists of four stages: block filtering, block title extraction, block content summarization, and personalization through learning. As a result of learning, personalization is realized by showing the information for the relevant block at the top of the content list.

international symposium on information technology convergence | 2007

Detecting Informative Web Page Blocks for Efficient Information Extraction Using Visual Block Segmentation

Jinbeom Kang; Joongmin Choi

As the structure of a Web page gets more complicated, the construction of wrapper induction rules becomes more difficult and time-consuming. The main problem in most wrapper induction methods is the difficulty of discriminating the meaningful blocks that contain the target information from the noise blocks that contain irrelevant information such as advertisements, menus, or copyright statements. To solve this problem, this paper proposes the RIPB (Recognizing Informative Page Blocks) algorithm, which detects the informative blocks in a Web page by exploiting a visual block segmentation scheme. RIPB uses the visual page segmentation algorithm to analyze and partition a Web page into a set of logical blocks, then groups related blocks with similar structures into a block cluster and recognizes the informative block clusters by applying heuristic rules to the cluster information. The results of a series of experiments indicate that RIPB contributes to improving the accuracy of information extraction by allowing the wrapper induction module to focus only on the informative block information and ignore other noise information in building extraction rules.

international conference on information science and applications | 2011

An Ontology-Based Recommendation System Using Long-Term and Short-Term Preferences

Jinbeom Kang; Joongmin Choi

Personalized information retrieval and recommendation systems have been proposed to deliver the right information to users with different interests. However, most previous systems use keyword frequencies as the main factor for personalization and, as a result, cannot analyze semantic relations between words. Also, previous methods often fail to provide documents that are semantically related to the query words. To solve these problems, we propose a recommendation system which provides relevant documents to users by identifying semantic relations between an ontology that semantically represents the documents crawled by a Web robot and user behavior history. Recommendation is mainly based on content-based similarity, semantic similarity, and preference weights.

networked computing and advanced information management | 2008

Block Classification of a Web Page by Using a Combination of Multiple Classifiers

Jinbeom Kang; Joongmin Choi

Recently, researchers have been actively studying Web mining with various data on the World Wide Web. Since Web pages are generally semi-structured, which makes it difficult to identify informative blocks, techniques for content detection that remove unnecessary data (e.g. advertisements) from Web pages have become important. Generally, a Web page consists of many blocks containing various data and structural information. In this paper, we propose a method that classifies the blocks of a Web page into appropriate categories by building a Tree Alignment model representing HTML structure and a Vector model representing the features of the blocks. Web sites normally have their own templates, and blocks may belong to different categories even though they are located in the same position in the Web browser or are structurally similar. Hence it is difficult to classify the blocks into accurate categories by building a single classifier. To solve this problem, our approach builds multiple classifiers, one for each training domain, and performs block classification by combining them.

multimedia and ubiquitous engineering | 2007

Scalable Web News Adaptation To Mobile Devices Using Visual Block Segmentation for Ubiquitous Media Services

Eunshil Lee; Jinbeom Kang; Jeahyun Park; Joongmin Choi; Jaeyoung Yang

This paper describes an enhanced method of Web content adaptation to mobile devices for providing online news articles in ubiquitous environments. Our system exploits a scheme of visual block segmentation for Web pages that filters out unnecessary blocks and extracts useful article information from content blocks. This method resolves the problems of previous approaches to Web content adaptation, in which the content transformation to suit a smaller device is device-dependent and manually driven. Our method also employs a learning module that is initiated when the user chooses to view the full content in the content summary page. As a result of learning, personalization is realized by showing the information for the relevant block at the top of the content list. A series of experiments was performed to evaluate our mobile content adaptation method on a number of well-known Web news sites, and the evaluation results are satisfactory both in block filtering accuracy and in user satisfaction with personalization.

intelligent data engineering and automated learning | 2005

A focused crawler with document segmentation

Jaeyoung Yang; Jinbeom Kang; Joongmin Choi

The focused crawler is a topic-driven document-collecting crawler that was suggested as a promising alternative for maintaining up-to-date Web document indices in search engines. A major problem inherent in previous focused crawlers is the risk of missing highly relevant documents that are linked from off-topic documents. This problem mainly originates from the lack of consideration of structural information in a document. Traditional weighting methods such as TF-IDF, employed in document classification, can lead to this problem. In order to improve the performance of focused crawlers, this paper proposes a scheme of locality-based document segmentation to determine the relevance of a document to a specific topic. We segment a document into a set of sub-documents using contextual features around the hyperlinks. This information is used to determine whether the crawler should fetch the documents that are linked from hyperlinks in an off-topic document.

networked computing and advanced information management | 2008

Detecting Collaborative Fields Using Social Networks

Dongwook Shin; Jinbeom Kang; Joongmin Choi; Jaeyoung Yang

It is generally difficult for researchers to obtain information related to their own fields and novel technologies from huge data residing in the World Wide Web. Furthermore, they often try to apply them to other particular fields which are different from theirs. The main motivation of this phenomenon is to solve existing problems or improve the performance of their systems. Hence, it is important to detect collaborative fields in which technologies of particular fields are applied to another area to find various trends. In this paper, we propose a method to detect collaborative fields by using social networks representing the relations among authors of papers, and describe some experimental results to show the effectiveness of the proposed method when collaborative fields are detected by using social networks.

international conference on intelligent computing | 2007

Extraction of user-defined data blocks using the regularity of dynamic web pages

Cheolhee Choi; Jinbeom Kang; Joongmin Choi

This paper proposes an enhanced method of Web information extraction by exploiting general phenomena that Web pages in a site tend to have common structures and dynamic Web pages contain multiple data blocks with repeating structural patterns. By considering this kind of regularity in dynamic Web pages, we develop a data block extraction system which basically adopts a supervised learning mechanism with training and extraction phases. In the training phase, the user selects and specifies a data block and the extraction rules for the block are generated. During this phase, the block is defined with the HTML DOM-tree path to the block and the tag sequence of the block. In the extraction phase, the rules are applied to the target pages to extract those blocks that have similar structure as the user-defined block. A series of experiments are performed to evaluate the user-defined data block extraction method for a number of well-known Web sites with dynamic Web pages, and the result of evaluation is satisfactory with high precision and recall measures.
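The training and extraction phases described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: it assumes the page is well-formed XML (real HTML usually needs a tolerant parser), it represents the DOM-tree path as a tuple of tag names from the root, and it requires an exact tag-sequence match, whereas the abstract speaks of "similar" structure. The names `learn_rule` and `extract` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, List, Tuple

Path = Tuple[str, ...]
Rule = Tuple[Path, Tuple[str, ...]]


def tag_sequence(elem: ET.Element) -> Tuple[str, ...]:
    """Tags of the block and its descendants in document order."""
    return tuple(node.tag for node in elem.iter())


def walk(elem: ET.Element, path: Path = ()) -> Iterator[Tuple[Path, ET.Element]]:
    """Yield (tag path from the root, element) for every element in the tree."""
    here = path + (elem.tag,)
    yield here, elem
    for child in elem:
        yield from walk(child, here)


def learn_rule(root: ET.Element, selected: ET.Element) -> Rule:
    """Training phase: record the path and tag sequence of the user-selected block."""
    for path, elem in walk(root):
        if elem is selected:
            return path, tag_sequence(elem)
    raise ValueError("selected block is not part of this page")


def extract(root: ET.Element, rule: Rule) -> List[ET.Element]:
    """Extraction phase: return blocks on the same path with the same tag sequence."""
    path, sequence = rule
    return [e for p, e in walk(root) if p == path and tag_sequence(e) == sequence]


TRAIN = "<html><body><ul><li><b>A</b><i>1</i></li><li><b>B</b><i>2</i></li></ul></body></html>"
TARGET = ("<html><body><ul><li><b>C</b><i>3</i></li><li><span>ad</span></li>"
          "<li><b>D</b><i>4</i></li></ul></body></html>")

train_root = ET.fromstring(TRAIN)
rule = learn_rule(train_root, train_root.find("./body/ul/li"))
matches = extract(ET.fromstring(TARGET), rule)
titles = [m.find("b").text for m in matches]
```

In this toy input the list item holding only an advertisement `span` is skipped because its tag sequence differs from the one learned in training.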

advanced information networking and applications | 2009

An Enhanced Clustering Method Based on Grid-Shaking

Jinbeom Kang; Joongmin Choi; Jaeyoung Yang

Clustering is an essential way to extract meaningful information from massive data without human intervention in the field of data mining. Clustering algorithms can be divided into four types: partitioning algorithms, hierarchical algorithms, grid-based algorithms, and locality-based algorithms. Each type, however, has problems that are not easily solved. K-means, for example, suffers from the problem of choosing initial centroids when the distribution of data is not hyper-ellipsoidal. The chain effect, outliers, and varying degrees of density in the data are problems that occur in other types of algorithms. Various algorithms have been proposed to solve these problems. In this paper, we propose a novel grid-based clustering algorithm that builds clusters in each cell, and we show how it solves the previously mentioned problems.
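For readers unfamiliar with the grid-based family, a generic sketch may help. It is not the grid-shaking method of this paper, whose steps the abstract does not describe; it only assumes 2-D points, a fixed `cell_size`, and the simple rule that occupied cells touching each other (including diagonally) belong to the same cluster. The function `grid_cluster` and its parameters are hypothetical.

```python
import math
from collections import defaultdict, deque
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def grid_cluster(points: List[Point], cell_size: float, min_points: int = 1) -> List[List[Point]]:
    """Cluster points by merging neighbouring grid cells that hold enough points."""
    cells: Dict[Cell, List[Point]] = defaultdict(list)
    for x, y in points:
        cells[(math.floor(x / cell_size), math.floor(y / cell_size))].append((x, y))

    # Only cells with at least min_points points take part in clustering.
    dense = {cell for cell, pts in cells.items() if len(pts) >= min_points}

    clusters: List[List[Point]] = []
    seen = set()
    for cell in sorted(dense):
        if cell in seen:
            continue
        seen.add(cell)
        queue = deque([cell])
        members: List[Point] = []
        # Breadth-first search over the 8-neighbourhood of each dense cell.
        while queue:
            cx, cy = queue.popleft()
            members.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (cx + dx, cy + dy)
                    if neighbour in dense and neighbour not in seen:
                        seen.add(neighbour)
                        queue.append(neighbour)
        clusters.append(members)
    return clusters


points = [(0.1, 0.1), (0.9, 0.4), (1.2, 1.1), (5.0, 5.0), (5.5, 5.9)]
clusters = grid_cluster(points, cell_size=1.0)
```

No initial centroids are needed in this family, but the result depends on the cell size and on where cell boundaries fall, which is the kind of sensitivity a grid-based refinement has to address.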