Heasoo Hwang
University of California, San Diego
Publication
Featured research published by Heasoo Hwang.
ACM Transactions on Database Systems | 2008
Vagelis Hristidis; Heasoo Hwang; Yannis Papakonstantinou
Our system applies authority-based ranking to keyword search in databases modeled as labeled graphs. Three ranking factors are used: the relevance to the query, the specificity and the importance of the result. All factors are handled using authority-flow techniques that exploit the link-structure of the data graph, in contrast to traditional Information Retrieval. We address the performance challenges in computing the authority flows in databases by using precomputation and exploiting the database schema if present. We conducted user surveys and performance experiments on multiple real and synthetic datasets, to assess the semantic meaningfulness and performance of our system.
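As a rough illustration of the authority-flow idea in the abstract above, the following minimal sketch runs a personalized PageRank-style iteration in which the random walk restarts only at nodes that contain the query keyword. The function name `authority_flow`, the toy graph, the damping value, and the uniform split of authority across outgoing edges are all assumptions made for this example; the published system is described as working on labeled graphs, which this sketch does not model.

```python
def authority_flow(graph, base_set, damping=0.85, iterations=50):
    """Score nodes by authority flowing out of a keyword base set.

    graph: dict mapping each node to a list of its out-neighbors
           (every neighbor must also appear as a key).
    base_set: nodes that contain the query keyword (restart targets).
    """
    nodes = list(graph)
    base = set(base_set)
    # Restart distribution: uniform over the base set, zero elsewhere.
    jump = {n: (1.0 / len(base) if n in base else 0.0) for n in nodes}
    score = dict(jump)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if not out:
                continue  # dangling node: its authority is dropped in this sketch
            share = damping * score[n] / len(out)
            for m in out:
                nxt[m] += share
        score = nxt
    return score


# Hypothetical citation-style graph: a cites b, b cites c, d cites c.
toy_graph = {
    "paper_a": ["paper_b"],
    "paper_b": ["paper_c"],
    "paper_c": [],
    "paper_d": ["paper_c"],
}
scores = authority_flow(toy_graph, base_set=["paper_a"])
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy run, nodes reachable from the keyword node receive authority even though they do not contain the keyword themselves, while `paper_d`, which is unreachable from the base set, receives none.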
international conference on management of data | 2006
Heasoo Hwang; Vagelis Hristidis; Yannis Papakonstantinou
We present the ObjectRank demo system, which performs authority-based keyword search on bibliographic databases. We also provide Inverse ObjectRank as a keyword-specific specificity metric and other calibration parameters such as Global ObjectRank. Users can specify various combinations of calibration values to control the behavior of the demo. Finally, we propose a methodology that enables us to extend query results using the ontology graph.
IEEE Transactions on Knowledge and Data Engineering | 2012
Heasoo Hwang; Hady Wirawan Lauw; Lise Getoor; Alexandros Ntoulas
Users are increasingly pursuing complex task-oriented goals on the web, such as making travel arrangements, managing finances, or planning purchases. To this end, they usually break down the tasks into a few codependent steps and issue multiple queries around these steps repeatedly over long periods of time. To better support users in their long-term information quests on the web, search engines keep track of their queries and clicks while searching online. In this paper, we study the problem of organizing a user's historical queries into groups in a dynamic and automated fashion. Automatically identifying query groups is helpful for a number of different search engine components and applications, such as query suggestions, result ranking, query alterations, sessionization, and collaborative search. We go beyond approaches that rely on textual similarity or time thresholds, and we propose a more robust approach that leverages search query logs. We experimentally study the performance of different techniques and showcase their potential, especially when combined.
international conference on data engineering | 2009
Heasoo Hwang; Andrey Balmin; Berthold Reinwald; Erik Nijkamp
Dynamic authority-based keyword search algorithms, such as ObjectRank and personalized PageRank, leverage semantic link information to provide high-quality, high-recall search in databases and the Web. Conceptually, these algorithms require a query-time PageRank-style iterative computation over the full graph. This computation is too expensive for large graphs, and not feasible at query time. Alternatively, building an index of precomputed results for some or all keywords involves very expensive preprocessing. We introduce BinRank, a system that approximates ObjectRank results by utilizing a hybrid approach inspired by materialized views in traditional query processing. We materialize a number of relatively small subsets of the data graph in such a way that any keyword query can be answered by running ObjectRank on only one of the subgraphs. BinRank generates the subgraphs by partitioning all the terms in the corpus based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive non-negligible scores. The intuition is that a subgraph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to one of these terms. We demonstrate that BinRank can achieve subsecond query execution time on the English Wikipedia data set, while producing high-quality search results that closely approximate the results of ObjectRank on the original graph. The Wikipedia link graph contains about 10^8 edges, which is at least two orders of magnitude larger than what prior state-of-the-art dynamic authority-based search systems have been able to demonstrate. Our experimental evaluation investigates the trade-off between query execution time, quality of the results, and storage requirements of BinRank.
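The subgraph materialization described in the abstract above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the BinRank implementation: the helper names `rank`, `materialize_subgraph`, and `answer`, the score threshold, the damping value, and the example data are all hypothetical, the term bin is given by hand rather than derived from co-occurrence, and a plain personalized PageRank-style iteration stands in for ObjectRank.

```python
def rank(graph, start_nodes, damping=0.85, iterations=50):
    """Personalized PageRank-style scores; random walks restart at start_nodes."""
    nodes = list(graph)
    start = set(start_nodes)
    jump = {n: (1.0 / len(start) if n in start else 0.0) for n in nodes}
    score = dict(jump)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:  # dangling nodes simply drop their authority here
                share = damping * score[n] / len(out)
                for m in out:
                    nxt[m] += share
        score = nxt
    return score


def materialize_subgraph(graph, keyword_index, term_bin, threshold=1e-3):
    """Keep only objects that score above the threshold for a bin of terms."""
    start = set()
    for term in term_bin:
        start |= set(keyword_index.get(term, ()))
    scores = rank(graph, start)
    keep = {n for n, s in scores.items() if s > threshold}
    # Retain only the links whose endpoints both survive the pruning.
    return {n: [m for m in graph[n] if m in keep] for n in keep}


def answer(subgraph, keyword_index, term):
    """Answer a single-term query by ranking on the small subgraph only."""
    start = [n for n in keyword_index.get(term, ()) if n in subgraph]
    scores = rank(subgraph, start)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: two unrelated clusters and a hand-picked term bin.
graph = {"a": ["b"], "b": ["c"], "c": [], "d": ["e"], "e": []}
keyword_index = {"rank": ["a"], "search": ["b"], "heart": ["d"]}
bin_subgraph = materialize_subgraph(graph, keyword_index, ["rank", "search"])
top = answer(bin_subgraph, keyword_index, "rank")
```

In this toy example the nodes `d` and `e` are unreachable from the bin's starting points, so they are pruned, and the query for `rank` is evaluated over three nodes instead of five.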
IEEE Transactions on Knowledge and Data Engineering | 2010
Heasoo Hwang; Andrey Balmin; Berthold Reinwald; Erik Nijkamp
Dynamic authority-based keyword search algorithms, such as ObjectRank and personalized PageRank, leverage semantic link information to provide high-quality, high-recall search in databases and the Web. Conceptually, these algorithms require a query-time PageRank-style iterative computation over the full graph. This computation is too expensive for large graphs, and not feasible at query time. Alternatively, building an index of precomputed results for some or all keywords involves very expensive preprocessing. We introduce BinRank, a system that approximates ObjectRank results by utilizing a hybrid approach inspired by materialized views in traditional query processing. We materialize a number of relatively small subsets of the data graph in such a way that any keyword query can be answered by running ObjectRank on only one of the subgraphs. BinRank generates the subgraphs by partitioning all the terms in the corpus based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive non-negligible scores. The intuition is that a subgraph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to one of these terms. We demonstrate that BinRank can achieve subsecond query execution time on the English Wikipedia data set, while producing high-quality search results that closely approximate the results of ObjectRank on the original graph. The Wikipedia link graph contains about 10^8 edges, which is at least two orders of magnitude larger than what prior state-of-the-art dynamic authority-based search systems have been able to demonstrate. Our experimental evaluation investigates the trade-off between query execution time, quality of the results, and storage requirements of BinRank.
international conference on management of data | 2007
Heasoo Hwang; Andrey Balmin; Hamid Pirahesh; Berthold Reinwald
We model heterogeneous data sources with cross references, such as those crawled on the (enterprise) web, as a labeled graph with data objects as typed nodes and references or links as edges. Given the labeled data graph, we introduce flexible and efficient querying capabilities that go beyond existing capabilities by additionally discovering meaningful relationships between objects that satisfy keyword and/or structured query filters. We introduce the relationship search operator that exploits the link structure between data objects to rank objects related to the result of a filter. We implement the search operator using the ObjectRank [1] algorithm that uses the random surfer model. We study several alternatives for constructing summary graphs for query results that consist of individual and aggregate nodes that are somehow linked to qualifying result nodes. Some of the summary graphs are useful for presenting query results to the user, while others could be used to evaluate subsequent queries efficiently without considering all the nodes and links in the original data graph.
international conference on consumer electronics | 2013
Keun-Joo Kwon; Heasoo Hwang; Hyoa Kang; Kyoung-Gu Woo; Kyuseok Shim
Remote monitoring of heart disease patients has been shown to be effective for diagnosis and detection of arrhythmias. We propose a remote cardiac monitoring system for preventive care by developing a decision support system with personalized parameters and an algorithm to predict forthcoming paroxysmal atrial fibrillations. The system consists of several physiological measuring devices, mobile gateways, point-of-care devices, and a monitoring server. The proposed prediction algorithm shows 87.5% accuracy.
Archive | 2008
Andrey Balmin; Heasoo Hwang; Mir Hamid Pirahesh; Berthold Reinwald
Archive | 2009
Andrey Balmin; Heasoo Hwang; Erik Nijkamp; Berthold Reinwald
very large data bases | 2008
Akanksha Baid; Andrey Balmin; Heasoo Hwang; Erik Nijkamp; Jun Rao; Berthold Reinwald; Alkis Simitsis; Yannis Sismanis; Frank van Ham