
Publication


Featured research published by Benjamin W. Wah.


IEEE Transactions on Knowledge and Data Engineering | 1989

Knowledge and data engineering

C. V. Ramamoorthy; Benjamin W. Wah

The authors provide an overview of the current research and development directions in knowledge and data engineering. They classify research problems and approaches in this area and discuss future trends. Research on knowledge and data engineering is examined with respect to programmability and representation, design tradeoffs, algorithms and control, and emerging technologies. Future challenges are considered with respect to software and hardware architecture and system design. The paper serves as an introduction to this first issue of a new quarterly.


Very Large Data Bases | 2002

Multi-dimensional regression analysis of time-series data streams

Yixin Chen; Guozhu Dong; Jiawei Han; Benjamin W. Wah; Jianyong Wang

Real-time production systems and other dynamic environments often generate tremendous (potentially infinite) amounts of stream data; the volume of data is too huge to be stored on disks or scanned multiple times. Can we perform on-line, multi-dimensional analysis and data mining of such data to alert people about dramatic changes of situations and to initiate timely, high-quality responses? This is a challenging task. In this paper, we investigate methods for on-line, multi-dimensional regression analysis of time-series stream data, with the following contributions: (1) our analysis shows that only a small number of compressed regression measures, instead of the complete stream of data, need to be registered for multi-dimensional linear regression analysis, (2) to facilitate on-line stream data analysis, a partially materialized data cube model, with regression as measure and a tilt time frame as its time dimension, is proposed to minimize the amount of data to be retained in memory or stored on disks, and (3) an exception-guided drilling approach is developed for on-line, multi-dimensional exception-based regression analysis. Based on this design, algorithms are proposed for efficient analysis of time-series data streams. Our performance study compares the proposed algorithms and identifies the most memory- and time-efficient one for multi-dimensional stream data analysis.
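The key observation in contribution (1), that compressed regression measures suffice, can be illustrated for simple linear regression: the sufficient statistics (n, Σx, Σy, Σx², Σxy) are additive, so per-segment summaries can be merged without retaining the raw stream. The sketch below is an illustrative reconstruction, not the paper's algorithm; the helper names are invented.

```python
# Sketch: additive sufficient statistics for simple linear regression
# y = a + b*x over a stream. Segment summaries merge without raw points.

def summarize(points):
    """Compress a list of (x, y) points into additive statistics."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return (n, sx, sy, sxx, sxy)

def merge(s1, s2):
    """Combine two summaries; equivalent to summarizing the union."""
    return tuple(a + b for a, b in zip(s1, s2))

def fit(s):
    """Recover intercept a and slope b from a summary."""
    n, sx, sy, sxx, sxy = s
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b
```

Merging the summaries of two stream segments and fitting yields the same line as regressing over all points at once, which is why only the compressed measures need to be registered.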


Big Data Research | 2015

Significance and Challenges of Big Data Research

Xiaolong Jin; Benjamin W. Wah; Xueqi Cheng; Yuanzhuo Wang

In recent years, the rapid development of the Internet, the Internet of Things, and Cloud Computing has led to the explosive growth of data in almost every industry and business area. Big data has rapidly developed into a hot topic that attracts extensive attention from academia, industry, and governments around the world. In this position paper, we first briefly introduce the concept of big data, including its definition, features, and value. We then identify from different perspectives the significance and opportunities that big data brings to us. Next, we present representative big data initiatives all over the world. We describe the grand challenges (namely, data complexity, computational complexity, and system complexity), as well as possible solutions to address these challenges. Finally, we conclude the paper by presenting several suggestions on carrying out big data projects.


International Symposium on Multimedia | 2000

A survey of error-concealment schemes for real-time audio and video transmissions over the Internet

Benjamin W. Wah; Xiao Su; Dong Lin

Real-time audio and video data streamed over unreliable IP networks, such as the Internet, may encounter losses due to dropped packets or late arrivals. This paper reviews error-concealment schemes developed for streaming real-time audio and video data over the Internet. Based on their interactions with (video or audio) source coders, we classify existing techniques into source coder-independent schemes that treat underlying source coders as black boxes, and source coder-dependent schemes that exploit coder-specific characteristics to perform reconstruction. Last, we identify possible future research directions.
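A source-coder-independent scheme of the kind the survey classifies can be sketched as receiver-side interpolation: a lost audio frame is estimated from its received neighbors, with no knowledge of the underlying coder. This is an illustrative example of the category, not one of the survey's specific algorithms, and it assumes isolated losses with received frames on both sides.

```python
# Sketch: source-coder-independent concealment of a lost audio frame
# by sample-wise interpolation of its neighbors (illustrative only).

def conceal(prev_frame, next_frame):
    """Estimate a lost frame as the sample-wise average of its neighbors."""
    return [(p + n) / 2.0 for p, n in zip(prev_frame, next_frame)]

def reconstruct(frames):
    """Replace None (lost) frames in a sequence of sample lists.
    Assumes isolated losses with received neighbors on both sides."""
    out = list(frames)
    for i, f in enumerate(out):
        if f is None and 0 < i < len(out) - 1:
            out[i] = conceal(out[i - 1], out[i + 1])
    return out
```

Because the scheme treats the source coder as a black box, it applies unchanged to any codec, at the cost of ignoring coder-specific redundancy that coder-dependent schemes exploit.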


Computational Science and Engineering | 1996

Global optimization for neural network training

Yi Shang; Benjamin W. Wah

We propose a novel global minimization method, called NOVEL (Nonlinear Optimization via External Lead), and demonstrate its superior performance on neural network learning problems. The goal is improved learning of application problems that achieves either smaller networks or less error-prone networks of the same size. This training method combines global and local searches to find a good local minimum. In benchmark comparisons against the best global optimization algorithms, it demonstrates superior performance.


Journal of Global Optimization | 1998

A Discrete Lagrangian-Based Global-Search Method for Solving Satisfiability Problems

Yi Shang; Benjamin W. Wah

Satisfiability is a class of NP-complete problems that model a wide range of real-world applications. These problems are difficult to solve because they have many local minima in their search space, often trapping greedy search methods that utilize some form of descent. In this paper, we propose a new discrete Lagrange-multiplier-based global-search method (DLM) for solving satisfiability problems. We derive new approaches for applying Lagrangian methods in discrete space, we show that an equilibrium is reached when a feasible assignment to the original problem is found, and we present heuristic algorithms to look for equilibrium points. Our method and analysis provide a theoretical foundation and generalization of local search schemes that optimize the objective alone and penalty-based schemes that optimize the constraints alone. In contrast to local search methods that restart from a new starting point when a search reaches a local trap, the Lagrange multipliers in DLM provide a force to lead the search out of a local minimum and move it in the direction provided by the Lagrange multipliers. In contrast to penalty-based schemes that rely only on the weights of violated constraints to escape from local minima, DLM also uses the value of an objective function (in this case the number of violated constraints) to provide further guidance. The dynamic shift in emphasis between the objective and the constraints, depending on their relative values, is the key to Lagrangian methods. One of the major advantages of DLM is that it has very few algorithmic parameters to be tuned by users. Besides, the search procedure can be made deterministic and the results reproducible. We demonstrate our method by applying it to solve an extensive set of benchmark problems archived in DIMACS of Rutgers University. DLM often performs better than the best existing methods and can achieve an order-of-magnitude speed-up for some problems.
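The search loop described above can be sketched as follows: minimize the number of violated clauses plus a multiplier-weighted penalty, flip the variable that most decreases this discrete Lagrangian, and, at a local minimum, raise the multipliers of the violated clauses instead of restarting. This is a minimal, deterministic sketch of the idea with invented parameter names, not the DLM implementation benchmarked in the paper.

```python
# Sketch of a discrete Lagrangian search for SAT. Clauses are lists of
# DIMACS-style literals (positive/negative ints); x maps var -> bool.

def unsat(clauses, x):
    """Indices of clauses violated under assignment x."""
    return [i for i, c in enumerate(clauses)
            if not any((lit > 0) == x[abs(lit)] for lit in c)]

def dlm(clauses, n_vars, max_steps=1000):
    x = {v: False for v in range(1, n_vars + 1)}
    lam = [1.0] * len(clauses)              # one multiplier per clause

    def L(a):
        bad = unsat(clauses, a)
        # objective (number of violated clauses) + Lagrangian penalty
        return len(bad) + sum(lam[i] for i in bad)

    for _ in range(max_steps):
        if not unsat(clauses, x):
            return x                         # feasible => equilibrium
        best_v, best_val = None, L(x)
        for v in x:                          # greedy descent over flips
            x[v] = not x[v]
            val = L(x)
            x[v] = not x[v]
            if val < best_val:
                best_v, best_val = v, val
        if best_v is None:                   # local minimum: raise the
            for i in unsat(clauses, x):      # multipliers of violated
                lam[i] += 1.0                # clauses to escape
        else:
            x[best_v] = not x[best_v]
    return None
```

Note how the multiplier update supplies the escape force the abstract describes, while the objective term still steers the search toward assignments with fewer violated clauses; the loop is fully deterministic, matching the reproducibility claim.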


Journal of Artificial Intelligence Research | 2006

Temporal planning using subgoal partitioning and resolution in SGPlan

Yixin Chen; Benjamin W. Wah; Chih-Wei Hsu

In this paper, we present the partitioning of mutual-exclusion (mutex) constraints in temporal planning problems and its implementation in the SGPlan4 planner. Based on the strong locality of mutex constraints observed in many benchmarks of the Fourth International Planning Competition (IPC4), we propose to partition the constraints of a planning problem into groups based on their subgoals. Constraint partitioning leads to significantly easier subproblems that are similar to the original problem and that can be efficiently solved by the same planner with some modifications to its objective function. We present a partition-and-resolve strategy that looks for locally optimal subplans in constraint-partitioned temporal planning subproblems and that resolves those inconsistent global constraints across the subproblems. We also discuss some implementation details of SGPlan4, which include the resolution of violated global constraints, techniques for handling producible resources, landmark analysis, path finding and optimization, search-space reduction, and modifications of Metric-FF when used as a basic planner in SGPlan4. Last, we show results on the sensitivity of each of these techniques in quality-time trade-offs and experimentally demonstrate that SGPlan4 is effective for solving the IPC3 and IPC4 benchmarks.


Very Large Data Bases | 2003

Star-cubing: computing iceberg cubes by top-down and bottom-up integration

Dong Xin; Jiawei Han; Xiaolei Li; Benjamin W. Wah

Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches, top-down vs. bottom-up. The former, represented by the Multi-Way Array Cube (called MultiWay) algorithm [25], aggregates simultaneously on multiple dimensions; however, it cannot take advantage of Apriori pruning [2] when computing iceberg cubes (cubes that contain only aggregate cells whose measure value satisfies a threshold, called iceberg condition). The latter, represented by two algorithms: BUC [6] and H-Cubing[11], computes the iceberg cube bottom-up and facilitates Apriori pruning. BUC explores fast sorting and partitioning techniques; whereas H-Cubing explores a data structure, H-Tree, for shared computation. However, none of them fully explores multi-dimensional simultaneous aggregation. In this paper, we present a new method, Star-Cubing, that integrates the strengths of the previous three algorithms and performs aggregations on multiple dimensions simultaneously. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of the group-bys that do not satisfy the iceberg condition. Our performance study shows that Star-Cubing is highly efficient and outperforms all the previous methods in almost all kinds of data distributions.
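The Apriori pruning that the bottom-up approaches exploit can be made concrete with COUNT as the measure: if a partition's count falls below the iceberg threshold, no more specific group-by of that partition can satisfy the condition, so the whole subtree is skipped. The sketch below is a minimal BUC-style recursion illustrating that pruning, not the Star-Cubing algorithm itself.

```python
# Sketch: iceberg cube with COUNT measure and Apriori pruning.
# A cell is a tuple of (dimension_index, value) pairs; pruning skips
# every descendant of a partition whose count is below min_count.

def iceberg_cube(rows, dims, min_count, prefix=(), start=0, out=None):
    if out is None:
        out = {}
    if len(rows) >= min_count:               # iceberg condition
        out[prefix] = len(rows)
        for d in range(start, dims):
            parts = {}
            for r in rows:                   # partition on dimension d
                parts.setdefault(r[d], []).append(r)
            for val, part in parts.items():
                # recursion returns immediately for pruned partitions
                iceberg_cube(part, dims, min_count,
                             prefix + ((d, val),), d + 1, out)
    return out
```

Star-Cubing keeps this pruning while also aggregating several dimensions simultaneously via its star-tree, which the purely bottom-up recursion above does not attempt.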


Distributed and Parallel Databases | 2005

Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams

Jiawei Han; Yixin Chen; Guozhu Dong; Jian Pei; Benjamin W. Wah; Jianyong Wang; Y. Dora Cai

Real-time surveillance systems, telecommunication systems, and other dynamic environments often generate tremendous (potentially infinite) volumes of stream data: the volume is too huge to be scanned multiple times. Much of such data resides at a rather low level of abstraction, whereas most analysts are interested in relatively high-level dynamic changes (such as trends and outliers). To discover such high-level characteristics, one may need to perform on-line multi-level, multi-dimensional analytical processing of stream data. In this paper, we propose an architecture, called stream_cube, to facilitate on-line, multi-dimensional, multi-level analysis of stream data. For fast online multi-dimensional analysis of stream data, three important techniques are proposed for efficient and effective computation of stream cubes. First, a tilted time frame model is proposed as a multi-resolution model to register time-related data: the more recent data are registered at finer resolution, whereas the more distant data are registered at coarser resolution. This design reduces the overall storage of time-related data and adapts nicely to the data analysis tasks commonly encountered in practice. Second, instead of materializing cuboids at all levels, we propose to maintain a small number of critical layers. Flexible analysis can be efficiently performed based on the concepts of observation layer and minimal interesting layer. Third, an efficient stream data cubing algorithm is developed which computes only the layers (cuboids) along a popular path and leaves the other cuboids for query-driven, on-line computation. Based on this design methodology, the stream data cube can be constructed and maintained incrementally with a reasonable amount of memory, computation cost, and query response time. This is verified by our substantial performance study.
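The first technique, the tilted time frame, can be sketched with a logarithmic tilt: each new measurement enters at unit resolution, and as buckets age, neighbors of equal span are merged, so older history is held at coarser resolution while total coverage and the aggregate are preserved. This is an illustrative scheme in the spirit of the model, not the paper's exact frame design; the class and field names are invented.

```python
# Sketch: logarithmically tilted time frame. Buckets are (span, sum)
# pairs, newest first; equal-span neighbors merge as data ages, so
# storage grows only logarithmically in the number of measurements.

class TiltedFrame:
    def __init__(self):
        self.buckets = []            # (span, aggregated_sum), newest first

    def add(self, value):
        self.buckets.insert(0, (1, value))
        i = 0
        # whenever three buckets share a span, merge the two oldest
        while i + 2 < len(self.buckets):
            if self.buckets[i][0] == self.buckets[i + 2][0]:
                (s1, v1), (s2, v2) = self.buckets[i + 1], self.buckets[i + 2]
                self.buckets[i + 1:i + 3] = [(s1 + s2, v1 + v2)]
            else:
                i += 1
        return self
```

After any number of insertions the spans still sum to the number of measurements and the bucket sums to the running total, so aggregate queries remain exact while recent time is resolved most finely.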


Electronic Commerce | 1994

Scheduling of genetic algorithms in a noisy environment

Akiko Aizawa; Benjamin W. Wah

In this paper, we develop new methods for adjusting configuration parameters of genetic algorithms operating in a noisy environment. Such methods are related to the scheduling of resources for tests performed in genetic algorithms. Assuming that the population size is given, we address two problems related to the design of efficient scheduling algorithms specifically important in noisy environments. First, we study the duration-scheduling problem that is related to setting dynamically the duration of each generation. Second, we study the sample-allocation problem that entails the adaptive determination of the number of evaluations taken from each candidate in a generation. In our approach, we model the search process as a statistical selection process and derive equations useful for these problems. Our results show that our adaptive procedures improve the performance of genetic algorithms over that of commonly used static ones.
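The flavor of the sample-allocation problem can be shown with a toy policy: give every candidate a few base evaluations of the noisy fitness, then spend the remaining budget re-sampling the contenders at the top of the current ranking, where ranking errors matter most. This is a simplified illustration of adaptive allocation, not the paper's statistical-selection equations; `base` and `budget` are invented parameters.

```python
import statistics

# Sketch: adaptive sample allocation under evaluation noise. Each budget
# step re-samples the two leading candidates, whose relative order is
# hardest to establish from noisy means (minimization assumed).

def allocate(candidates, noisy_eval, base=3, budget=9):
    samples = {c: [noisy_eval(c) for _ in range(base)] for c in candidates}
    for _ in range(budget):
        means = {c: statistics.mean(s) for c, s in samples.items()}
        ranked = sorted(candidates, key=means.get)
        for c in ranked[:2]:                 # refine the closest contenders
            samples[c].append(noisy_eval(c))
    means = {c: statistics.mean(s) for c, s in samples.items()}
    return min(candidates, key=means.get), samples
```

A static policy would spread the same budget uniformly; the adaptive one concentrates evaluations where the noise actually threatens the selection decision, which is the effect the paper's procedures formalize.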

Collaboration


Top co-authors of Benjamin W. Wah:

Yixin Chen (Washington University in St. Louis)
Yi Shang (University of Missouri)
Xiao Su (San Jose State University)
Akiko Aizawa (National Institute of Informatics)
Rynson W. H. Lau (City University of Hong Kong)
Guozhu Dong (Wright State University)
Jingxi Xu (The Chinese University of Hong Kong)