IEEE Access | 2019

Time-Series Representation and Clustering Approaches for Sharing Bike Usage Mining

 
 
 

Abstract


Massive bike-sharing system (BSS) usage and performance data have been collected for years across various locations. Nevertheless, researchers have encountered several challenges in dealing with such data. Two challenges that previous studies leave room to improve on are 1) reducing the high dimensionality and noise of BSS time series data and 2) extracting informative usage patterns from massive BSS data. This paper extracts patterns and reduces the dimensionality of BSS usage data by exploring time series representation and clustering. A reduced dimension allows us to efficiently approximate BSS usage with reasonable accuracy, and the approximation can then be used for bike usage clustering, classification and prediction. We employ a non-data-adaptive representation technique, the Discrete Wavelet Transform (DWT), to reduce dimensionality and filter out random errors in the raw time series. The time series are clustered using k-means, with similarities measured by Dynamic Time Warping (DTW) and prototypes computed using DTW barycenter averaging (DBA). The proposed approaches are applied to a 3-month bike usage dataset acquired from the BSS of Chicago. The analysis results show that DWT can effectively reduce dimensionality, filter out random errors and reveal the main characteristics of the raw time series. The clustering approach makes it possible to differentiate and discover bike usage patterns across different stations.
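The two building blocks named in the abstract, a DWT-based reduced representation and a DTW similarity measure, can be illustrated with a minimal sketch. This is not the paper's implementation: the Haar wavelet, the number of decomposition levels, the squared pointwise cost, and the function names `haar_approximation` and `dtw_distance` are all assumptions made for illustration, and the k-means step with DBA prototypes is not shown.

```python
import numpy as np


def haar_approximation(series, levels):
    """Return the Haar DWT approximation coefficients after `levels` steps.

    Each step halves the length, so the result is a coarse, denoised
    summary of the input. The Haar wavelet is an assumption here; the
    abstract does not say which wavelet was used.
    """
    coeffs = np.asarray(series, dtype=float)
    for _ in range(levels):
        if len(coeffs) % 2:
            # Pad odd-length input by repeating the last value.
            coeffs = np.append(coeffs, coeffs[-1])
        # Orthonormal Haar low-pass step: scaled sums of adjacent pairs.
        coeffs = (coeffs[0::2] + coeffs[1::2]) / np.sqrt(2.0)
    return coeffs


def dtw_distance(a, b):
    """Dynamic Time Warping distance with a squared pointwise cost.

    Classic O(len(a) * len(b)) dynamic program without a warping window.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))


# Hypothetical usage on synthetic hourly counts for two stations (one week).
rng = np.random.default_rng(0)
hours = np.arange(168)
station_a = 10 + 5 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1, 168)
station_b = 10 + 5 * np.sin(2 * np.pi * (hours - 2) / 24) + rng.normal(0, 1, 168)

reduced_a = haar_approximation(station_a, levels=3)  # 168 points -> 21 coefficients
reduced_b = haar_approximation(station_b, levels=3)
distance = dtw_distance(reduced_a, reduced_b)
```

In this sketch, DTW is computed on the reduced series rather than the raw one, so the quadratic cost of the distance is paid on far fewer points. Whether the paper orders the steps exactly this way is not stated in the abstract.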

Volume 7
Pages 177856-177863
DOI 10.1109/ACCESS.2019.2958378
Language English
Journal IEEE Access
