Wang-Pin Hsiung | Researchain


Publication

Featured researches published by Wang-Pin Hsiung.

very large data bases | 2002

View invalidation for dynamic content caching in multitiered architectures

K. Selçuk Candan; Divyakant Agrawal; Wen-Syan Li; Oliver Po; Wang-Pin Hsiung

In today's multitiered application architectures, clients do not access data stored in the databases directly. Instead, they use applications which in turn invoke the DBMS to generate the relevant content. Since executing application programs may require significant time and other resources, it is more advantageous to cache application results in a result cache. Various view materialization and update management techniques have been proposed to deal with updates to the underlying data. These techniques guarantee that the cached results are always consistent with the underlying data. Several applications, including e-commerce sites, on the other hand, do not require that the caches be consistent all the time. Instead, they require that all outdated pages in the caches are invalidated in a timely fashion. In this paper, we show that invalidation is inherently different from view maintenance. We develop algorithms that benefit from this difference in reducing the cost of update management in certain applications and we present an invalidation framework that benefits from these algorithms.
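To illustrate the distinction the abstract draws, here is a minimal sketch of invalidation as opposed to view maintenance: when a table changes, dependent pages are simply evicted rather than recomputed. The `ResultCache` class and its table-level dependency tracking are assumptions made for illustration; they are not the algorithms or the framework presented in the paper.

```python
from collections import defaultdict


class ResultCache:
    """Toy result cache that invalidates pages instead of maintaining them.

    Hypothetical illustration: dependencies are tracked per table name,
    which is far coarser than a real invalidation framework would be.
    """

    def __init__(self):
        self.pages = {}                    # page key -> cached content
        self.deps = {}                     # page key -> tables the page read
        self.readers = defaultdict(set)    # table name -> page keys reading it

    def put(self, key, content, tables):
        """Cache a page and record which tables its content was built from."""
        self._evict(key)                   # drop any older entry for this key
        self.pages[key] = content
        self.deps[key] = set(tables)
        for table in tables:
            self.readers[table].add(key)

    def get(self, key):
        """Return the cached page, or None if it is absent or was invalidated."""
        return self.pages.get(key)

    def on_update(self, table):
        """Evict every page that read `table`; return the evicted keys.

        Nothing is recomputed here: the page is regenerated lazily on the
        next request, which is what separates this from view maintenance.
        """
        stale = set(self.readers.get(table, ()))
        for key in stale:
            self._evict(key)
        return stale

    def _evict(self, key):
        self.pages.pop(key, None)
        for table in self.deps.pop(key, ()):
            self.readers[table].discard(key)
```

In this sketch an update to one table costs only a set lookup and a few deletions, whereas a maintained view would have to be brought up to date before the update completes.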

extending database technology | 2013

PMAX: tenant placement in multitenant databases for profit maximization

Ziyang Liu; Hakan Hacigümüs; Hyun Jin Moon; Yun Chi; Wang-Pin Hsiung

There has been a great interest in exploiting the cloud as a platform for database as a service. As with other cloud-based services, database services may enjoy cost efficiency through consolidation: hosting multiple databases within a single physical server. Aggressive consolidation, however, may hurt the service quality, leading to SLA violation penalty, which in turn reduces the total business profit, called SLA profit. In this paper, we consider the problem of tenant placement in the cloud for SLA profit maximization, which, as will be shown in the paper, is strongly NP-hard. We propose SLA profit-aware solutions for database tenant placement based on our model for expected penalty computation for multitenant servers. Specifically, we present two approximation algorithms, which have constant approximation ratios, and we further discuss improving the quality of tenant placement using a dynamic programming algorithm. Extensive experiments based on TPC-W workload verified the performance of the proposed approaches.

very large data bases | 2002

Issues and evaluations of caching solutions for web application acceleration

Wen-Syan Li; Wang-Pin Hsiung; Dmitri V. Kalashnikov; Radu Sion; Oliver Po; Divyakant Agrawal; K. Selçuk Candan

Response time is a key differentiation among electronic commerce (e-commerce) applications. For many e-commerce applications, Web pages are created dynamically based on the current state of a business stored in database systems. Recently, the topic of Web acceleration for database-driven Web applications has drawn a lot of attention in both the research community and commercial arena. In this paper, we analyze the factors that have impacts on the performance and scalability of Web applications. We discuss system architecture issues and describe approaches to deploying caching solutions for accelerating Web applications. We give the performance matrix measurement for network latency and various system architectures. The paper is summarized with a road map for creating high performance Web applications.

world congress on services | 2010

CloudDB: One Size Fits All Revived

Hakan Hacigümüs; Junichi Tatemura; Wang-Pin Hsiung; Hyun Jin Moon; Oliver Po; Arsany Sawires; Yun Chi; Hojjat Jafarpour

We present a data management platform in the cloud, CloudDB. The guiding principle of CloudDB’s design is establishing data independence for the applications that need to use diverse underlying data stores that are optimized for varying workload needs and characteristics. The applications should not have to be aware of the physical organization of the data and how the data is accessed. Ideally, an application only needs a logical specification of the data access layer and the data access requests are handled in a declarative way. CloudDB hosts a variety of specialized databases that deliver high performance, scalability, and cost efficiency for varying application needs. CloudDB’s API layer is designed to give data independence to the higher-level applications. The goal is to let the clients use just a simple, standard, and uniform language API to access data management functions as a service.

IEEE Transactions on Knowledge and Data Engineering | 2008

Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams

Songting Chen; Hua-Gang Li; Junichi Tatemura; Wang-Pin Hsiung; Divyakant Agrawal; Kasim Selcuk Candan

An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can (1) achieve polynomial time and space complexity for post processing, (2) avoid redundant predicate evaluations, (3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and (4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.

extending database technology | 2013

SWAT: a lightweight load balancing method for multitenant databases

Hyun Jin Moon; Hakan Hacigümüs; Yun Chi; Wang-Pin Hsiung

Multitenant databases achieve cost efficiency through the consolidation of multiple small tenants. However, performance isolation is an inherent problem in multitenant databases due to resource sharing among the tenants. That is, a bursty workload from a co-located tenant, i.e., a noisy neighbor, may affect the performance of the other tenants sharing the same system resources. We address this issue by using a load balancing method that is based on database replica swap. Unlike the traditional data migration-based load balancing, replica swap-based load balancing does not incur data movement, which makes it highly resource- and time-efficient. We propose a novel method of choosing which tenants should be subject to swaps. Our experimental results show that swap-based load balancing effectively reduces the number of SLA violations, which is the main performance metric we choose.
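The idea of balancing load by swapping replica roles, rather than moving data, can be sketched as follows. This is a toy model under stated assumptions: each tenant has one primary and one secondary replica, only the primary contributes load, and the tenant to swap is chosen by a simple greedy rule. The `Tenant` class, `server_loads`, and `swap_once` are hypothetical names, and the greedy rule stands in for the paper's own selection method, which the abstract does not describe.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    load: float       # load placed on whichever server hosts the primary
    primary: str      # server currently hosting the primary replica
    secondary: str    # server currently hosting the secondary replica


def server_loads(tenants):
    """Sum tenant load per server (assumption: secondaries add no load)."""
    loads = {}
    for t in tenants:
        loads[t.primary] = loads.get(t.primary, 0.0) + t.load
        loads.setdefault(t.secondary, 0.0)
    return loads


def swap_once(tenants):
    """Swap replica roles for the tenant whose swap most lowers the peak load.

    Returns the swapped tenant, or None if no single swap helps.
    No data is copied: only the primary/secondary roles change.
    """
    if not tenants:
        return None
    best, best_peak = None, max(server_loads(tenants).values())
    for t in tenants:
        t.primary, t.secondary = t.secondary, t.primary    # try the swap
        peak = max(server_loads(tenants).values())
        t.primary, t.secondary = t.secondary, t.primary    # undo it
        if peak < best_peak:
            best, best_peak = t, peak
    if best is not None:
        best.primary, best.secondary = best.secondary, best.primary
    return best
```

Calling `swap_once` repeatedly until it returns `None` gives a simple hill-climbing balancer; each step is cheap because both replicas already exist on their servers.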

very large data bases | 2003

CachePortal II: acceleration of very large scale data center-hosted database-driven web applications

Wen-Syan Li; Oliver Po; Wang-Pin Hsiung; K. Selçuk Candan; Divyakant Agrawal; Yusuf Akca; Kunihiro Taniguchi

Wide-area database replication technologies and the availability of data centers allow database copies to be distributed across the network. This requires a complete e-commerce website suite (i.e. edge caches, Web servers, application servers, and DBMS) to be distributed along with the database replicas. A major advantage of this approach is, like the caches, the possibility of serving dynamic content from a location close to the users, reducing network latency. However, this is achieved at the expense of additional overhead, caused by the need of invalidating dynamic content cached in the edge caches and synchronization of the database replicas in the data center. A typical data center architecture for hosting Web applications requires a complete e-commerce Web site suite (i.e. Web server, application server, and DBMS) to be distributed along with the database replicas. Typically, the WS/AS/DBMS suite is installed in the network to serve non-transaction requests which require accesses to read-only database replicas of the master database at the origin site. In order to distinguish between the asymmetric functionality of master and slave DBMSs, we refer the mirror database in the data center as data cache or DB Cache. DBCache can be a lightweight DBMS without the transaction management system and it may cache only a subset of the tables in the master database. Updates to the database are handled using a master/slave database configuration: all updates and transactions are processed at the master database at the origin site. This architecture has two drawbacks: (1) all requests

international conference on management of data | 2012

Partiqle: an elastic SQL engine over key-value stores

Junichi Tatemura; Oliver Po; Wang-Pin Hsiung; Hakan Hacigümüs

The demo features Partiqle, a SQL engine over key-value stores as a relational alternative for the recent procedural approaches to support OLTP workloads elastically. Based on our microsharding framework [12], it employs a declarative specification, called transaction classes, of constraints applied on the transactions in a workload. We demonstrate use of a transaction class in design and analysis of OLTP workloads. We then demonstrate live-scaling of our fully functioning system on a server cluster.

international world wide web conferences | 2004

Challenges and practices in deploying web acceleration solutions for distributed enterprise systems

Wen-Syan Li; Wang-Pin Hsiung; Oliver Po; Koji Hino; Kasim Selcuk Candan; Divyakant Agrawal

For most Web-based applications, contents are created dynamically based on the current state of a business, such as product prices and inventory, stored in database systems. These applications demand personalized content and track user behavior while maintaining application integrity. Many such practices are not compatible with Web acceleration solutions. Consequently, although many Web acceleration solutions have shown promising performance improvement and scalability, architecting and engineering distributed enterprise Web applications to utilize available content delivery networks remains a challenge. In this paper, we examine the challenge of accelerating J2EE-based enterprise Web applications. We list obstacles and recommend some practices to transform typical database-driven J2EE applications into cache-friendly Web applications where Web acceleration solutions can be applied. Furthermore, such transformation should be done without modification to the underlying application business logic and without sacrificing functions that are essential to e-commerce. We take the J2EE reference software, the Java PetStore, as a case study. By using the proposed guideline, we are able to cache more than 90% of the content in the PetStore and scale up the Web site more than 20 times.

international conference on data engineering | 2014

Automatic entity-grouping for OLTP workloads

Bin Liu; Junichi Tatemura; Oliver Po; Wang-Pin Hsiung; Hakan Hacigümüs

Supporting an online transaction processing (OLTP) workload in a scalable and elastic fashion is a challenging task. Recently, a new breed of scalable systems has shown significant throughput gains by limiting consistency to small units of data called “entity-groups” (e.g., a user's account information stored together with all her emails in an online email service). Transactions that access the data from only one entity-group are guaranteed full ACID, but those that access multiple entity-groups are not. Defining entity-groups has direct impact on workload consistency and performance, and doing so for data with a complex schema is very challenging. It is prone to go to extremes: groups that are too fine-grained cause an excessive number of expensive distributed transactions, while those that are too coarse lead to excessive serialization and performance degradation. It is also difficult to balance conflicting requirements from different transactions. In commercially available entity-group systems, creating entity-groups is usually a manual process, which severely limits the usability of those systems. This paper is the first systematic effort on automating the entity-group design process. Our goal is to build a user-friendly design tool for automatically creating entity-groups based on a given workload and to help users trade consistency for performance in a principled manner. For advanced users, we allow them to provide feedback to the entity-group design and iteratively improve the final output. We demonstrate the effectiveness of our approach with widely used benchmarks. We also present the user experience of a prototype we built.