William Groves | Researchain

Publication

Featured research published by William Groves.

Machine Learning | 2016

Multi-target regression via input space expansion: treating targets as inputs

Eleftherios Spyromitros-Xioufis; Grigorios Tsoumakas; William Groves; Ioannis P. Vlahavas

Real-world prediction problems often involve the simultaneous prediction of multiple target variables using the same set of predictive variables. When the target variables are binary, the prediction task is called multi-label classification, while when the target variables are real-valued the task is called multi-target regression. Although multi-target regression attracted the attention of the research community prior to multi-label classification, the recent advances in this field motivate a study of whether newer state-of-the-art algorithms developed for multi-label classification are applicable and equally successful in the domain of multi-target regression. In this paper we introduce two new multi-target regression algorithms: multi-target stacking (MTS) and ensemble of regressor chains (ERC), inspired by two popular multi-label classification approaches that are based on a single-target decomposition of the multi-target problem and the idea of treating the other prediction targets as additional input variables that augment the input space. Furthermore, we detect an important shortcoming in both methods related to the methodology used to create the additional input variables and develop modified versions of the algorithms (MTSC and ERCC) to tackle it. All methods are empirically evaluated on 12 real-world multi-target regression datasets, 8 of which are first introduced in this paper and are made publicly available for future benchmarks. The experimental results show that ERCC performs significantly better than both a strong baseline that learns a single model for each target using bagging of regression trees and the state-of-the-art multi-objective random forest approach. Also, the proposed modification results in significant performance gains for both MTS and ERC.

In many practical applications of supervised learning the task involves the prediction of multiple target variables from a common set of input variables. When the prediction targets are binary the task is called multi-label classification, while when the targets are continuous the task is called multi-target regression. In both tasks, target variables often exhibit statistical dependencies and exploiting them in order to improve predictive accuracy is a core challenge. A family of multi-label classification methods addresses this challenge by building a separate model for each target on an expanded input space where other targets are treated as additional input variables. Despite the success of these methods in the multi-label classification domain, their applicability and effectiveness in multi-target regression has not been studied until now. In this paper, we introduce two new methods for multi-target regression, called stacked single-target and ensemble of regressor chains, by adapting two popular multi-label classification methods of this family. Furthermore, we highlight an inherent problem of these methods, a discrepancy of the values of the additional input variables between training and prediction, and develop extensions that use out-of-sample estimates of the target variables during training in order to tackle this problem. The results of an extensive experimental evaluation carried out on a large and diverse collection of datasets show that, when the discrepancy is appropriately mitigated, the proposed methods attain consistent improvements over the independent regressions baseline. Moreover, two versions of ensemble of regressor chains perform significantly better than four state-of-the-art methods including regularization-based multi-task learning methods and a multi-objective random forest approach.
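The idea of treating earlier targets as extra inputs, and of training on out-of-sample estimates so that training and prediction see the same kind of values, can be illustrated with a minimal sketch. It assumes NumPy and a ridge-regularised linear base learner (the paper's experiments use other base learners); every function name, the fold count, and the ridge strength are illustrative choices, not the authors' implementation.

```python
import numpy as np

def fit_linear(X, y, ridge=1.0):
    """Ridge least squares with an unpenalised intercept (assumed base learner)."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    penalty = ridge * np.eye(A.shape[1])
    penalty[-1, -1] = 0.0  # do not shrink the intercept
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)

def predict_linear(w, X):
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ w

def out_of_sample_estimates(X, y, n_folds=5):
    """Estimate y for each training row with a model that never saw that row."""
    est = np.empty(len(y))
    all_rows = np.arange(len(y))
    for held_out in np.array_split(all_rows, n_folds):
        train = np.setdiff1d(all_rows, held_out)
        est[held_out] = predict_linear(fit_linear(X[train], y[train]), X[held_out])
    return est

def fit_chain(X, Y, order, n_folds=5):
    """One regressor chain: each target's model sees estimates of earlier targets."""
    models, X_aug = [], X
    for j in order:
        models.append(fit_linear(X_aug, Y[:, j]))
        # Augment with cross-validated estimates, not the true target values,
        # so the extra inputs resemble what is available at prediction time.
        est = out_of_sample_estimates(X_aug, Y[:, j], n_folds)
        X_aug = np.hstack([X_aug, est[:, None]])
    return models

def predict_chain(models, X, order, n_targets):
    Y_hat = np.empty((X.shape[0], n_targets))
    X_aug = X
    for w, j in zip(models, order):
        Y_hat[:, j] = predict_linear(w, X_aug)
        X_aug = np.hstack([X_aug, Y_hat[:, [j]]])
    return Y_hat

def fit_ensemble(X, Y, n_chains=3, seed=0):
    """Several chains with random target orders (a simplified ensemble)."""
    rng = np.random.default_rng(seed)
    orders = [rng.permutation(Y.shape[1]) for _ in range(n_chains)]
    return [(order, fit_chain(X, Y, order)) for order in orders]

def predict_ensemble(ensemble, X, n_targets):
    """Average the predictions of all chains."""
    preds = [predict_chain(models, X, order, n_targets) for order, models in ensemble]
    return np.mean(preds, axis=0)
```

With a linear base learner the chain adds little modelling power; the sketch is meant only to show where the out-of-sample estimates enter the training loop.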

ACM Transactions on Intelligent Systems and Technology | 2015

On Optimizing Airline Ticket Purchase Timing

William Groves; Maria L. Gini

Proper timing of the purchase of airline tickets is difficult even when historical ticket prices and some domain knowledge are available. To address this problem, we introduce an algorithm that optimizes purchase timing on behalf of customers and provides performance estimates of its computed action policy. Given a desired flight route and travel date, the algorithm uses machine-learning methods on recent ticket price quotes from many competing airlines to predict the future expected minimum price of all available flights. The main novelty of our algorithm lies in using a systematic feature-selection technique, which captures time dependencies in the data by using time-delayed features, and reduces the number of features by imposing a class hierarchy among the raw features and pruning the features based on in-situ performance. Our algorithm comes much closer to the optimal purchase policy than other existing decision-theoretic approaches for this domain, and meets or exceeds the performance of existing feature-selection methods from the literature. Applications of our feature-selection process to other domains are also discussed.
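As a rough illustration of time-delayed features paired with a "future minimum price" target, the sketch below turns a single daily price series into a supervised dataset. The function name, the single-series input, and the fixed look-ahead horizon are assumptions made for this example; the paper's feature hierarchy and pruning steps are not reproduced.

```python
def build_lagged_dataset(prices, n_lags, horizon):
    """Build (features, target) pairs from a chronological list of price quotes.

    Each feature row holds the current price followed by the n_lags previous
    prices; the target is the minimum price over the next `horizon` quotes.
    """
    X, y = [], []
    for t in range(n_lags, len(prices) - horizon):
        X.append([prices[t - k] for k in range(n_lags + 1)])
        y.append(min(prices[t + 1 : t + 1 + horizon]))
    return X, y


# Example: seven daily quotes, two lags, two-day look-ahead.
X, y = build_lagged_dataset([300, 310, 290, 280, 305, 320, 315], n_lags=2, horizon=2)
# X == [[290, 310, 300], [280, 290, 310], [305, 280, 290]]
# y == [280, 305, 315]
```

A regression model trained on such pairs could then be compared against the current quote to decide whether to buy now or wait.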

AMEC/TADA | 2011

Improving Prediction in TAC SCM by Integrating Multivariate and Temporal Aspects via PLS Regression

William Groves; Maria L. Gini

We address the construction of a prediction model from data available in a complex environment. We first present a data extraction method that is able to leverage information contained in the movements of all variables in recent observations. This improved data extraction is then used with a common multivariate regression technique: Partial Least Squares (PLS) regression. We experimentally validate this combined data extraction and modeling with data from a competitive multi-agent supply chain setting, the Trading Agent Competition for Supply Chain Management (TAC SCM). Our method achieves competitive (and often superior) performance compared to the state-of-the-art domain-specific prediction techniques used in the 2008 Prediction Challenge competition.
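For readers unfamiliar with PLS regression, here is a minimal single-response (PLS1) sketch in the NIPALS style, assuming NumPy. The names `fit_pls1` and `predict_pls1` are hypothetical, and the paper's data extraction step for TAC SCM is not shown.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """Fit PLS regression for one response; returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                  # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # latent score vector
        tt = t @ t
        p = Xk.T @ t / tt              # X loading
        q_k = (yk @ t) / tt            # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - q_k * t              # deflate y
        W.append(w)
        P.append(p)
        q.append(q_k)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # Regression coefficients in the original (centred) feature space.
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def predict_pls1(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean
```

Using fewer components than features gives a lower-dimensional, regularised fit; with as many components as features (and full-rank inputs) the result coincides with ordinary least squares.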

Workshop on E-Business | 2009

Analyzing Market Interactions in a Multi-agent Supply Chain Environment

William Groves; John Collins; Wolfgang Ketter; Maria L. Gini

Enterprises continuously seek decision support tools that can help automate and codify business decisions. This is particularly true in the business of consumer electronics manufacturing where components are often interchangeable and several manufacturers can supply the same component over the life of a product. In this kind of dynamic environment, businesses are faced with the choice of signing long-term (possibly quite risky) contracts or of waiting to procure necessary components on the spot market (where availability may be uncertain). Having analytical tools to analyze previous and forecast future market conditions is invaluable. We analyze a supply chain scenario from an economic perspective that involves both component procurement and sales uncertainties. The data we analyze comes from a multi-agent supply chain management simulation environment (TAC SCM) which simulates a one-year product life-cycle. The availability of simulation logs allows us access to a rich set of data which includes the requests and actions taken by all participants in the market. This rich informational access enables us to calculate supply and demand curves, examine market efficiency, and see how specific strategic behaviors of the competing agents are reflected in market dynamics.

Computing Frontiers | 2014

A framework for predicting trajectories using global and local information

William Groves; Ernesto Nunes; Maria L. Gini

We propose a novel framework for predicting the paths of vehicles that move on a road network. The framework leverages global and local patterns in spatio-temporal data. From a large corpus of GPS trajectories, we predict the subsequent path of an in-progress vehicle trajectory using only spatio-temporal features from the data. Our framework consists of three components: (1) a component that abstracts GPS location data into a graph at the neighborhood or street level, (2) a component that generates policies obtained from the graph data, and (3) a component that predicts the subsequent path of an in-progress trajectory. Hierarchical clustering is used to construct the city graph, where the clusters facilitate a compact representation of the trajectory data to make processing large data sets tractable and efficient. We propose four alternative policy generation algorithms: a frequency-based algorithm (FreqCount), a correlation-based algorithm (EigenStrat), a spectral clustering-based algorithm (LapStrat), and a Markov Chain-based algorithm (MCStrat). The algorithms explore either global patterns (FreqCount and EigenStrat) or local patterns (MCStrat) in the data, with the exception of LapStrat which explores both. We present an analysis of the performance of the alternative prediction algorithms using a large real-world taxi data set.
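To give a feel for a local, Markov-chain-style policy on a city graph, the sketch below counts observed node-to-node transitions and follows the most frequent one. It is a generic first-order Markov chain illustration, not the paper's MCStrat algorithm; the node labels and function names are made up for the example.

```python
from collections import Counter, defaultdict

def fit_transition_counts(trajectories):
    """Count how often each graph node is followed by each other node."""
    counts = defaultdict(Counter)
    for path in trajectories:
        for here, nxt in zip(path, path[1:]):
            counts[here][nxt] += 1
    return counts

def predict_path(counts, current, n_steps):
    """Greedily follow the most frequent observed transition from each node."""
    path = []
    for _ in range(n_steps):
        if current not in counts:
            break  # no outgoing transition was ever observed from this node
        current = counts[current].most_common(1)[0][0]
        path.append(current)
    return path


# Example with four short trajectories over nodes A to E.
trips = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["E", "B", "C"]]
counts = fit_transition_counts(trips)
print(predict_path(counts, "A", 2))  # ['B', 'C']
```

A global policy, by contrast, would condition on the whole trajectory so far rather than only the current node.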

International Joint Conference on Artificial Intelligence | 2011

Combining spatial and temporal aspects of prediction problems to improve prediction performance

William Groves

Quantitative prediction problems involving both spatial and temporal components have appeared prominently in several disparate research areas including finance, supply chain management, and civil engineering. Unfortunately, either the spatial or temporal aspect tends to dominate the other in many prediction formulations. We briefly examine the underlying formulations used in spatial and temporal prediction. Then, we outline a method that combines these approaches and improves prediction results in high-dimensional economic domains by integrating multivariate and time series techniques which require minimal tuning but achieve superior performance compared to previous methods. We present preliminary results in the context of the Trading Agent Competition for Supply Chain Management.

Decision Support Systems | 2014