Is this you? Create Your Porfile

Thanh Tran

Karlsruhe Institute of Technology

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Thanh Tran is active.

Explore More

Publication

Featured researches published by Thanh Tran.

international conference on data engineering | 2009

Top-k Exploration of Query Candidates for Efficient Keyword Search on Graph-Shaped (RDF) Data

Thanh Tran; Haofen Wang; Sebastian Rudolph; Philipp Cimiano

Keyword queries enjoy widespread usage as they represent an intuitive way of specifying information needs. Recently, answering keyword queries on graph-structured data has emerged as an important research topic. The prevalent approaches build on dedicated indexing techniques as well as search algorithms aiming at finding substructures that connect the data elements matching the keywords. In this paper, we introduce a novel keyword search paradigm for graph-structured data, focusing in particular on the RDF data model. Instead of computing answers directly as in previous approaches, we first compute queries from the keywords, allowing the user to choose the appropriate query, and finally, process the query using the underlying database engine. Thereby, the full range of database optimization techniques can be leveraged for query processing. For the computation of queries, we propose a novel algorithm for the exploration of top-k matching subgraphs. While related techniques search the best answer trees, our algorithm is guaranteed to compute all k subgraphs with lowest costs, including cyclic graphs. By performing exploration only on a summary data structure derived from the data graph, we achieve promising performance improvements compared to other approaches.

international semantic web conference | 2007

Ontology-based interpretation of keywords for semantic search

Thanh Tran; Philipp Cimiano; Sebastian Rudolph; Rudi Studer

Current information retrieval (IR) approaches do not formally capture the explicit meaning of a keyword query but provide a comfortable way for the user to specify information needs on the basis of keywords. Ontology-based approaches allow for sophisticated semantic search but impose a query syntax more difficult to handle. In this paper, we present an approach for translating keyword queries to DL conjunctive queries using background knowledge available in ontologies. We present an implementation which shows that this interpretation of keywords can then be used for both exploration of asserted knowledge and for a semantics-based declarative query answering process. We also present an evaluation of our system and a discussion of the limitations of the approach with respect to our underlying assumptions which directly points to issues for future work.

international world wide web conferences | 2007

The two cultures: mashing up web 2.0 and the semantic web

Anupriya Ankolekar; Markus Krötzsch; Thanh Tran; Denny Vrandecic

A common perception is that there are two competing visions for the future evolution of the Web: the Semantic Web and Web 2.0. A closer look, though, reveals that the core technologies and concerns of these two approaches are complementary and that each field can and must draw from the others strengths. We believe that future web applications will retain the Web 2.0 focus on community and usability, while drawing on Semantic Web infrastructure to facilitate mashup-like information sharing. However, there are several open issues that must be addressed before such applications can become commonplace. In this paper, we outline a semantic weblogs scenario that illustrates the potential for combining Web 2.0 and Semantic Web technologies, while highlighting the unresolved issues that impede its realization. Nevertheless, we believe that the scenario can be realized in the short-term. We point to recent progress made in resolving each of the issues as well as future research directions for each of the communities.

Journal of Web Semantics | 2008

The two cultures: Mashing up Web 2.0 and the Semantic Web

Anupriya Ankolekar; Markus Krötzsch; Thanh Tran; Denny Vrandecic

A common perception is that there are two competing visions for the future evolution of the Web: the Semantic Web and Web 2.0. A closer look, though, reveals that the core technologies and concerns of these two approaches are complementary and that each field can and must draw from the others strengths. We believe that future Web applications will retain the Web 2.0 focus on community and usability, while drawing on Semantic Web infrastructure to facilitate mashup-like information sharing. However, there are several open issues that must be addressed before such applications can become commonplace. In this paper, we outline a semantic weblogs scenario that illustrates the potential for combining Web 2.0 and Semantic Web technologies, while highlighting the unresolved issues that impede its realization. Nevertheless, we believe that the scenario can be realized in the short-term. We point to recent progress made in resolving each of the issues as well as future research directions for each of the communities.

international semantic web conference | 2011

FedBench: a benchmark suite for federated semantic data query processing

Michael Schmidt; Olaf Görlitz; Peter Haase; Günter Ladwig; Andreas Schwarte; Thanh Tran

In this paper we present FedBench, a comprehensive benchmark suite for testing and analyzing the performance of federated query processing strategies on semantic data. The major challenge lies in the heterogeneity of semantic data use cases, where applications may face different settings at both the data and query level, such as varying data access interfaces, incomplete knowledge about data sources, availability of different statistics, and varying degrees of query expressiveness. Accounting for this heterogeneity, we present a highly flexible benchmark suite, which can be customized to accommodate a variety of use cases and compare competing approaches. We discuss design decisions, highlight the flexibility in customization, and elaborate on the choice of data and query sets. The practicability of our benchmark is demonstrated by a rigorous evaluation of various application scenarios, where we indicate both the benefits as well as limitations of the state-of-the-art federated query processing strategies for semantic data.

european semantic web conference | 2008

Q2Semantic: a lightweight keyword interface to semantic search

Haofen Wang; Kang Zhang; Qiaoling Liu; Thanh Tran; Yong Yu

The increasing amount of data on the Semantic Web offers opportunities for semantic search. However, formal query hinders the casual users in expressing their information need as they might be not familiar with the querys syntax or the underlying ontology. Because keyword interfaces are easier to handle for casual users, many approaches aim to translate keywords to formal queries. However, these approaches yet feature only very basic query ranking and do not scale to large repositories. We tackle the scalability problem by proposing a novel clustered-graph structure that corresponds to only a summary of the original ontology. The so reduced data space is then used in the exploration for the computation of top-k queries. Additionally, we adopt several mechanisms for query ranking, which can consider many factors such as the query length, the relevance of ontology elements w.r.t. the query and the importance of ontology elements. The experimental results performed against our implemented system Q2Semantic show that we achieve good performance on many datasets of different sizes.

international semantic web conference | 2010

Linked data query processing strategies

Günter Ladwig; Thanh Tran

Recently, processing of queries on linked data has gained attention. We identify and systematically discuss three main strategies: a bottom-up strategy that discovers new sources during query processing by following links between sources, a top-down strategy that relies on complete knowledge about the sources to select and process relevant sources, and a mixed strategy that assumes some incomplete knowledge and discovers new sources at run-time. To exploit knowledge discovered at run-time, we propose an additional step, explicitly scheduled during query processing, called correct source ranking. Additionally, we propose the adoption of stream-based query processing to deal with the unpredictable nature of data access in the distributed Linked Data environment. In experiments, we show that our implementation of the mixed strategy leads to early reporting of results and thus, more responsive query processing, while not requiring complete knowledge.

Journal of Web Semantics | 2009

Semplore: A scalable IR approach to search the Web of Data

Haofen Wang; Qiaoling Liu; Linyun Fu; Lei Zhang; Thanh Tran; Yong Yu; Yue Pan

The Web of Data keeps growing rapidly. However, the full exploitation of this large amount of structured data faces numerous challenges like usability, scalability, imprecise information needs and data change. We present Semplore, an IR-based system that aims at addressing these issues. Semplore supports intuitive faceted search and complex queries both on text and structured data. It combines imprecise keyword search and precise structured query in a unified ranking scheme. Scalable query processing is supported by leveraging inverted indexes traditionally used in IR systems. This is combined with a novel block-based index structure to support efficient index update when data changes. The experimental results show that Semplore is an efficient and effective system for searching the Web of Data and can be used as a basic infrastructure for Web-scale Semantic Web search engines.

extended semantic web conference | 2011

SIHJoin: querying remote and local linked data

Günter Ladwig; Thanh Tran

The amount of Linked Data is increasing steadily. Optimized top-down Linked Data query processing based on complete knowledge about all sources, bottom-up processing based on run-time discovery of sources as well as a mixed strategy that combines them have been proposed. A particular problem with Linked Data processing is that the heterogeneity of the sources and access options lead to varying input latency, rendering the application of blocking join operators infeasible. Previous work partially address this by proposing a non-blocking iterator-based operator and another one based on symmetric-hash join. Here, we propose detailed cost models for these two operators to systematically compare them, and to allow for query optimization. Further, we propose a novel operator called the Symmetric Index Hash Join to address one open problem of Linked Data query processing: to query not only remote, but also local Linked Data. We perform experiments on real-world datasets to compare our approach against the iterator-based baseline, and create a synthetic dataset to more systematically analyze the impacts of the individual components captured by the proposed cost models.

Journal of Web Semantics | 2009

Hermes: Data Web search on a pay-as-you-go integration infrastructure

Thanh Tran; Haofen Wang; Peter Haase

The Web as a global information space is developing from a Web of documents to a Web of data. This development opens new ways for addressing complex information needs. Search is no longer limited to matching keywords against documents, but instead complex information needs can be expressed in a structured way, with precise answers as results. In this paper, we present Hermes, an infrastructure for data Web search that addresses a number of challenges involved in realizing search on the data Web. To provide an end-user oriented interface, we support expressive user information needs by translating keywords into structured queries. We integrate heterogeneous Web data sources with automatically computed mappings. Schema-level mappings are exploited in constructing structured queries against the integrated schema. These structured queries are decomposed into queries against the local Web data sources, which are then processed in a distributed way. Finally, heterogeneous result sets are combined using an algorithm called map join, making use of data-level mappings. In evaluation experiments with real life data sets from the data Web, we show the practicability and scalability of the Hermes infrastructure.

Explore More