Leon Wu | Researchain

Publication

Featured research published by Leon Wu.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2012

Machine Learning for the New York City Power Grid

Cynthia Rudin; David L. Waltz; Roger N. Anderson; Albert Boulanger; Ansaf Salleb-Aouissi; Maggie Chow; Haimonti Dutta; Philip Gross; Bert Huang; Steve Ierome; Delfina Isaac; Arthur Kressner; Rebecca J. Passonneau; Axinia Radeva; Leon Wu

Power companies can benefit from the use of knowledge discovery methods and statistical machine learning for preventive maintenance. We introduce a general process for transforming historical electrical grid data into models that aim to predict the risk of failures for components and systems. These models can be used directly by power companies to assist with prioritization of maintenance and repair work. Specialized versions of this process are used to produce (1) feeder failure rankings, (2) cable, joint, terminator, and transformer rankings, (3) feeder Mean Time Between Failure (MTBF) estimates, and (4) manhole events vulnerability rankings. The process in its most general form can handle diverse, noisy sources that are historical (static), semi-real-time, or real-time, incorporates state-of-the-art machine learning algorithms for prioritization (supervised ranking or MTBF), and includes an evaluation of results via cross-validation and blind test. Above and beyond the ranked lists and MTBF estimates are business management interfaces that allow the prediction capability to be integrated directly into corporate planning and decision support; such interfaces rely on several important properties of our general modeling approach: that machine learning features are meaningful to domain experts, that the processing of data is transparent, and that prediction results are accurate enough to support sound decision making. We discuss the challenges in working with historical electrical grid data that were not designed for predictive purposes. The “rawness” of these data contrasts with the accuracy of the statistical models that can be obtained from the process; these models are sufficiently accurate to assist in maintaining New York City's electrical grid.

Software Engineering and Knowledge Engineering | 2011

BUGMINER: Software Reliability Analysis Via Data Mining of Bug Reports

Leon Wu; Boyi Xie; Gail E. Kaiser; Rebecca J. Passonneau

Software bugs reported by human users and automatic error reporting software are often stored in bug tracking tools (e.g., Bugzilla and Debbugs). These accumulated bug reports may contain valuable information that could be used to improve the quality of bug reporting, reduce quality assurance effort and cost, analyze software reliability, and predict future bug report trends. In this paper, we present BUGMINER, a tool that is able to derive useful information from a historic bug report database using data mining, use this information to perform a completion check and a redundancy check on a new or given bug report, and estimate the bug report trend using statistical analysis. Our empirical studies of the tool using several real-world bug report repositories show that it is effective, easy to implement, and has relatively high accuracy despite low-quality data.
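The abstract does not say how the redundancy check is computed. As a rough illustration only, the sketch below compares a new report against stored ones using Jaccard similarity over word sets; the similarity measure, the `likely_duplicates` name, the report IDs, and the 0.5 threshold are all assumptions, not BUGMINER's actual method.

```python
import re
from typing import Dict, List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def likely_duplicates(
    new_report: str, existing: Dict[str, str], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return (report_id, similarity) pairs at or above the threshold, best first."""
    new_tokens = tokens(new_report)
    scored = [(rid, jaccard(new_tokens, tokens(text))) for rid, text in existing.items()]
    hits = [(rid, score) for rid, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical report database, for illustration only.
existing_reports = {
    "BUG-1": "Crash when saving file with unicode name",
    "BUG-2": "Toolbar icons are blurry on high DPI screens",
}
matches = likely_duplicates("crash when saving a file with a unicode name", existing_reports)
print(matches)
```

A set-based measure ignores word order and frequency, which keeps the sketch short; a weighted scheme such as TF-IDF would be a natural refinement.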

Archive | 2011

Forecasting Energy Demand in Large Commercial Buildings Using Support Vector Machine Regression

David Solomon; Rebecca Lynn Winter; Albert Boulanger; Roger N. Anderson; Leon Wu

As our society gains a better understanding of how humans have negatively impacted the environment, research related to reducing carbon emissions and overall energy consumption has become increasingly important. One of the simplest ways to reduce energy usage is by making current buildings less wasteful. By improving energy efficiency, this method of lowering our carbon footprint is particularly worthwhile because it reduces energy costs of operating the building, unlike many environmental initiatives that require large monetary investments. In order to improve the efficiency of the heating, ventilation, and air conditioning (HVAC) system of a Manhattan skyscraper, 345 Park Avenue, a predictive computer model was designed to forecast the amount of energy the building will consume. This model uses Support Vector Machine Regression (SVMR), a method that builds a regression based purely on historical data of the building, requiring no knowledge of its size, heating and cooling methods, or any other physical properties. SVMR employs time-delay coordinates as a representation of the past to create the feature vectors for SVM training. This pure dependence on historical data makes the model very easily applicable to different types of buildings with few model adjustments. The SVM regression model was built to predict a week of future energy usage based on past energy, temperature, and dew point temperature data.
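The time-delay representation described above can be sketched as a sliding window over the historical series. This is a minimal illustration, assuming a single univariate load series; the function name `time_delay_features`, the `lags` and `horizon` parameters, and the sample values are hypothetical, and the SVM training step itself is omitted.

```python
from typing import List, Sequence, Tuple


def time_delay_features(
    series: Sequence[float], lags: int, horizon: int = 1
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, targets) pairs from a univariate series.

    Each feature vector holds `lags` consecutive past observations (oldest
    first); its target is the value `horizon` steps after the last of them.
    """
    if lags < 1 or horizon < 1:
        raise ValueError("lags and horizon must be positive")
    features: List[List[float]] = []
    targets: List[float] = []
    for t in range(lags, len(series) - horizon + 1):
        features.append(list(series[t - lags:t]))
        targets.append(series[t + horizon - 1])
    return features, targets


# Made-up hourly load readings, for illustration only.
load = [10.0, 12.0, 15.0, 14.0, 13.0, 16.0]
X, y = time_delay_features(load, lags=3)
for row, target in zip(X, y):
    print(row, "->", target)
```

The resulting rows and targets could then be passed to any regression learner; additional inputs such as temperature and dew point would be appended to each row in the same windowed fashion.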

Software Engineering and Knowledge Engineering | 2011

Constructing Subtle Concurrency Bugs Using Synchronization-Centric Second-Order Mutation Operators

Leon Wu; Gail E. Kaiser

Mutation testing applies mutation operators to modify program source code or byte code in small ways, and then runs these modified programs (i.e., mutants) against a test suite in order to evaluate the quality of the test suite. In this paper, we first describe a general fault model for concurrent programs and some limitations of previously developed sets of first-order concurrency mutation operators. We then present our new mutation testing approach, which employs synchronization-centric second-order mutation operators that are able to generate subtle concurrency bugs not represented by the first-order mutation. These operators are used in addition to the synchronization-centric first-order mutation operators to form a small set of effective concurrency mutation operators for mutant generation. Our empirical study shows that our set of operators is effective in mutant generation with limited cost and demonstrates that this new approach is easy to implement.

Archive | 2011

Evaluating Machine Learning for Improving Power Grid Reliability

Leon Wu; Gail E. Kaiser; Cynthia Rudin; David L. Waltz; Roger N. Anderson; Albert Boulanger; Ansaf Salleb-Aouissi; Haimonti Dutta; Manoj Pooleery

Ensuring reliability as the electrical grid morphs into the “smart grid” will require innovations in how we assess the state of the grid, for the purpose of proactive maintenance, rather than reactive maintenance – in the future, we will not only react to failures, but also try to anticipate and avoid them using predictive modeling (machine learning) techniques. To help in meeting this challenge, we present the Neutral Online Visualization-aided Autonomic evaluation framework (NOVA) for evaluating machine learning algorithms for preventive maintenance on the electrical grid. NOVA has three stages provided through a unified user interface: evaluation of input data quality, evaluation of machine learning results, and evaluation of the reliability improvement of the power grid. A prototype version of NOVA has been deployed for the power grid in New York City, and it is able to evaluate machine learning systems effectively and efficiently. Appearing in the ICML 2011 Workshop on Machine Learning for Global Challenges, Bellevue, WA, USA, 2011.

IEEE Transactions on Smart Grid | 2013

A Robust Solution to the Load Curtailment Problem

Hugo P. Simão; H. B. Jeong; Boris Defourny; Warren B. Powell; Albert Boulanger; Ashish Gagneja; Leon Wu; Roger N. Anderson

Operations planning in smart grids is likely to become a more complex and demanding task in the next decades. In this paper we show how to formulate the problem of planning short-term load curtailment in a dense urban area, in the presence of uncertainty in electricity demand and in the state of the distribution grid, as a stochastic mixed-integer optimization problem. We propose three rolling-horizon look-ahead policies to approximately solve the optimization problem: a deterministic one and two based on approximate dynamic programming (ADP) techniques. We demonstrate through numerical experiments that the ADP-based policies yield curtailment plans that are more robust on average than the deterministic policy, but at the expense of the additional computational burden needed to calibrate the ADP-based policies. We also show how the worst case performance of the three approximation policies compares with a baseline policy where all curtailable loads are curtailed to the maximum amount possible.

High Assurance Systems Engineering | 2012

An Autonomic Reliability Improvement System for Cyber-Physical Systems

Leon Wu; Gail E. Kaiser

System reliability is a fundamental requirement of cyber-physical systems. Unreliable systems can lead to disruption of service, financial cost and even loss of human life. Typical cyber-physical systems are designed to process large amounts of data, employ software as a system component, run online continuously and retain an operator-in-the-loop because of human judgment and accountability requirements for safety-critical systems. This paper describes a data-centric runtime monitoring system named ARIS (Autonomic Reliability Improvement System) for improving the reliability of these types of cyber-physical systems. ARIS employs automated online evaluation, working in parallel with the cyber-physical system to continuously conduct automated evaluation at multiple stages in the system workflow and provide real-time feedback for reliability improvement. This approach enables effective evaluation of data from cyber-physical systems. For example, abnormal input and output data can be detected and flagged through data quality analysis. As a result, alerts can be sent to the operator-in-the-loop, who can then take actions and make changes to the system based on these alerts in order to achieve minimal system downtime and higher system reliability. We have implemented ARIS in a large commercial building cyber-physical system in New York City, and our experiment has shown that it is effective and efficient in improving building system reliability.
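The kind of data quality analysis mentioned above can be pictured as a simple screening pass over incoming readings. The sketch below is an assumption-laden illustration, not ARIS itself: the `flag_abnormal` function, the alert labels, and the temperature bounds are hypothetical, and a real system would likely use richer checks.

```python
from typing import List, Optional, Sequence, Tuple


def flag_abnormal(
    readings: Sequence[Optional[float]], low: float, high: float
) -> List[Tuple[int, str]]:
    """Return (index, reason) alerts for missing or out-of-range readings."""
    alerts: List[Tuple[int, str]] = []
    for i, value in enumerate(readings):
        if value is None:
            alerts.append((i, "missing"))
        elif not (low <= value <= high):
            # Also catches NaN, since every comparison with NaN is False.
            alerts.append((i, "out_of_range"))
    return alerts


# Made-up zone temperatures in degrees Celsius, for illustration only.
temps = [21.5, 22.0, None, 85.0, 21.8]
alerts = flag_abnormal(temps, low=10.0, high=35.0)
print(alerts)
```

Each alert carries the index of the offending reading so that an operator-facing layer could point back to the exact sample.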

Archive | 2010

Empirical Study of Concurrency Mutation Operators for Java

Leon Wu; Gail E. Kaiser

Mutation testing is a white-box fault-based software testing technique that applies mutation operators to modify program source code or byte code in small ways and then runs these modified programs (i.e., mutants) against a test suite in order to measure its effectiveness and locate the weaknesses either in the test data or in the program that are seldom or never exposed during normal execution. In this paper, we describe our implementation of a generic mutation testing framework and the results of applying three sets of concurrency mutation operators on four example Java programs through empirical study and analysis. Keywords: software testing; mutation testing; mutation operators; Java; concurrent programs; synchronization.
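The basic mutation-testing loop described above can be illustrated with a toy example. This is a simplified sketch, not the paper's framework: instead of rewriting source or byte code, it models each mutant as the same function with one comparison operator swapped, and it uses a sequential Python function rather than a concurrent Java program. The names `make_is_adult`, `MUTANTS`, and `mutation_score` are hypothetical.

```python
import operator
from typing import Callable, Dict, List, Tuple


def make_is_adult(compare: Callable[[int, int], bool] = operator.ge) -> Callable[[int], bool]:
    """Build the program under test; passing another operator yields a mutant."""
    def is_adult(age: int) -> bool:
        return compare(age, 18)
    return is_adult


# Three first-order mutants, each replacing the original `>=` operator.
MUTANTS: Dict[str, Callable[[int, int], bool]] = {
    "ge->gt": operator.gt,
    "ge->lt": operator.lt,
    "ge->eq": operator.eq,
}


def mutation_score(tests: List[Tuple[int, bool]]) -> float:
    """Fraction of mutants killed, i.e. failing at least one test case."""
    killed = 0
    for compare in MUTANTS.values():
        mutant = make_is_adult(compare)
        if any(mutant(age) != expected for age, expected in tests):
            killed += 1
    return killed / len(MUTANTS)


weak_suite = [(30, True), (5, False)]
strong_suite = weak_suite + [(18, True)]  # adds the boundary case
print(mutation_score(weak_suite), mutation_score(strong_suite))
```

The weak suite lets the `ge->gt` mutant survive because it never exercises the boundary value; adding that one case kills it, which is the sense in which the score measures test-suite quality.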

IEEE Systems Conference | 2014

Cost-optimal, robust charging of electrically-fueled commercial vehicle fleets via machine learning

Jigar Shah; Matthew Christian Nielsen; Andrew Reid; Conner B. Shane; Kirk Mathews; David Henry Doerge; Richard Piel; Roger N. Anderson; Albert Boulanger; Leon Wu; Vaibhav Bhandari; Ashish Gagneja; Arthur Kressner; Xiaohu Li; Somnath Sarkar

Electrification of commercial vehicle fleets presents an opportunity to cut emissions, reduce fuel costs, and improve operational metrics. However, infrastructure limitations in urban areas often inhibit the ability to charge a significant number of electric vehicles, especially under one roof. This paper highlights a novel controls approach developed at GE Global Research in conjunction with Columbia University to fulfill the stated needs for intelligent charging of a commercial fleet of electric vehicles. This novel approach combines traditional control techniques with machine learning algorithms to adapt to customer behavior over time. The stated controls system is designed to regulate the charging rate of multiple electric vehicle supply equipment devices (EVSEs) to facilitate cost-optimal charging subject to past and predicted building load, vehicle energy requirements, and current conditions. In this embodiment, the system is primarily designed to mitigate electric demand charges that may otherwise occur due to charging at inopportune times. The system will be deployed at a New York City FedEx Express delivery depot in partnership with the local utility, Consolidated Edison Company of New York.

Archive | 2008

Distributed eXplode: A High-Performance Model Checking Engine to Scale Up State-Space Coverage

Nageswar Keetha; Leon Wu; Gail E. Kaiser; Junfeng Yang

Model checking the state space (all possible behaviors) of software systems is a promising technique for verification and validation. Bugs such as security vulnerabilities, file storage issues, deadlocks and data races can occur anywhere in the state space and are often triggered by corner cases; therefore, it becomes important to explore and model check all runtime choices. However, large and complex software systems generate huge numbers of behaviors, leading to ‘state explosion’. eXplode is a lightweight, deterministic and depth-bound model checker that explores all dynamic choices at runtime. Given an application-specific test harness, eXplode performs state search in a serialized fashion, which limits its scalability and performance. This paper proposes a distributed eXplode engine that uses multiple host machines concurrently in order to achieve more state space coverage in less time, which helps scale up the software verification and validation effort. Test results show that Distributed eXplode runs several times faster and covers more state space than the standalone eXplode.
