Balancing bike sharing systems (BBSS): instance generation from the CitiBike NYC data
Technical Report
Tommaso Urli ∗
[email protected]
August 20, 2018
∗ Scheduling and Time-Tabling Group, DIEGM - University of Udine, Via delle Scienze 206, 33100 – Udine (UD), Italy

Bike sharing systems are a very popular means to provide bikes to citizens in a simple and cheap way. The idea is to install bike stations at various points in the city, from which a registered user can easily loan a bike by removing it from a specialized rack. After the ride, the user may return the bike at any station (if there is a free rack). Services of this kind are mainly public or semi-public, often aimed at increasing the attractiveness of non-motorized means of transportation, and are usually free, or almost free, of charge for the users.

Depending on their location, bike stations have specific patterns regarding when they are empty or full. For instance, in cities where most jobs are located near the city centre, the commuters cause certain peaks in the morning: the central bike stations are filled, while the stations in the outskirts are emptied. Furthermore, stations located on top of a hill are more likely to be empty, since users are less keen on cycling uphill to return the bike, and often leave their bike at a more reachable station. These issues result in substantial user dissatisfaction which may eventually cause the users to abandon the service. This is why nowadays most bike sharing system providers take measures to rebalance them. Balancing a bike sharing system is typically done by employing a fleet of trucks that move bikes overnight between unbalanced stations. More specifically, each truck starts from a depot and travels from station to station in a tour, executing loading instructions (adding or removing bikes) at each stop. After servicing the last station, each truck returns to the depot.

Over the last few years, balancing bike sharing systems (BBSS) has become increasingly studied in optimization [1, 3, 7, 2, 8, 5, 4, 6].
As such, generating meaningful instances to serve as a benchmark for the proposed approaches is an important task. In this technical report we describe the procedure we used to generate BBSS problem instances from data of the CitiBike NYC bike sharing system.

We employ the instance format defined and popularized by the ADS group at the Technische Universität Wien (TU Wien) and the mobility department at the Austrian Institute of Technology (AIT). Note, however, that our scope is limited to the static variant of the problem, so we only consider a valid subset of the instance format.

The format for the static BBSS specifies, for each station $s \in S$,
• the current number of bikes $b_s$,
• the target number of bikes $\hat{b}_s$,
• the distance from the depot $d_{s,d}$, and
• the distance $d_{s,k}$ from each other station $k$.

Note that this format only describes a state of the bike sharing system; therefore, in a sense, it describes a family of instances. Specific instances can be generated by specifying
• the station capacities $C_s$,
• the vehicle capacities $c_v$, $v \in V$,
• the number of vehicles $|V|$ available at the depot, and
• the vehicle time budgets $\hat{t}_v$.

Data collection
The first step of instance generation is gathering a sufficient amount of usage data about a bike sharing system. Our system of choice is CitiBike NYC, since they provide full access, through a web service (http://citibikenyc.com/stations/json), to the state of the network in JSON format at any time. This is an example of output from the web service:

    {
      "executionTime": "2013-11-04 12:09:01 AM",
      "stationBeanList": [
        {
          "availableDocks": 21,
          "totalDocks": 39,
          "longitude": -73.99392888,
          "testStation": false,
          "stAddress1": "W 52 St & 11 Ave",
          "stationName": "W 52 St & 11 Ave",
          "landMark": "",
          "latitude": 40.76727216,
          "statusKey": 1,
          "availableBikes": 17,
          ...
          "id": 72
        },
        ...
      ]
    }

From the output, it can be noted that three pieces of information about capacity are reported: availableDocks, totalDocks, and availableBikes. One interesting thing is that

$$\mathit{totalDocks} \neq \mathit{availableDocks} + \mathit{availableBikes} \tag{1}$$

i.e., there is a displacement of one bike, which is, however, constant across the stations. In the example above, 21 + 17 = 38, while totalDocks is 39. Moreover, a field statusKey encodes the status of the station, e.g., operational or non-operational.

We have stored a snapshot of the network every 10 minutes for 6 months (from May 2013 to November 2013). A snapshot, among other information, contains the current and the maximum number of bikes of each station, and its address. This data was necessary in order to provide realistic $b_s$ and $\hat{b}_s$ for every station. Moreover, we employed the address data to query the Google Maps API (https://developers.google.com/maps) about the distance, both in minutes and meters, between each pair of stations $s, k \in S$. Note that this also includes the distances from a depot, as we consider one of the central stations as a depot.

We considered the ≈ ′000 snapshots, and, for each station, we computed the distributions of stored bikes at every hour of the day. Weekends were not considered because the bike usage is much noisier than during working days. From the analysis it is clear that, at certain times of the day, there are stations acting as sources (see Figure 1) and others acting as sinks (see Figure 2). The box plots in these figures, which show the distributions of bikes on a single station throughout the day, illustrate this behavior.

Figure 1: Example of source station

From some of the distributions, it was also clear that there was some artificial rebalancing happening overnight between 00:00 and 06:00 AM (mildly visible in Figure 3 at 04:00 AM).

For each station $s \in S$, we first computed the 1st and 3rd quartiles for all the hours of the day. Then we found the minimum first quartile and the maximum third quartile across the day, which we denote, respectively, by $\mathit{min}_s$ and $\mathit{max}_s$. Ideally, these are values which we would like to be as far as possible, respectively, from 0 (empty station) and from $C_s$ (full station). We thus computed a displacement value for each station $s$, as

$$\mathit{disp}_s = \left\lfloor C_s - \frac{C_s - (\mathit{max}_s - \mathit{min}_s)}{2} \right\rfloor - \mathit{max}_s \tag{2}$$

Adding $\mathit{disp}_s$ brings $\mathit{min}_s$ and $\mathit{max}_s$ as far as possible from 0 and $C_s$, so that the probability of finding an empty or a full station is minimized.

Once the displacement of each station is known, generating an instance from a snapshot is rather easy. But there are some aspects which one should take into account.

Selection of the snapshot.
In order for the generated instances to be realistic, one should consider when the rebalancing is likely to be done. A good guess is that the rebalancing happens overnight. This is also supported by the rebalancing step which is visible on some stations (e.g., Figure 3). For our instances, we have chosen midnight as the expected time for the start of the rebalancing, thus we have used the 30 midnight snapshots from September 2013 as starting points. For each station $s \in S$, the initial number of bikes $b_s$ is thus the actual number of bikes in the station at midnight.

Selection of the depot.
The information released by CitiBike NYC does not contain any data about depots. We therefore use one of the stations as the depot; of course, this station must be excluded from the stations that can be included in the instance. We selected the station with CitiBike ID 294, as it is quite central with respect to all the other stations.
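The steps described so far (reading $b_s$ from a midnight snapshot, leaving out the depot, and computing the displacement of Eq. (2)) can be illustrated with a short Python sketch. This is not the code of our generator: the function names `initial_bikes` and `displacement` are hypothetical, the snapshot is assumed to be the already parsed JSON document in the format of the example above, and the displacement is read as centring the interquartile range, i.e., with a divisor of 2 in Eq. (2).

```python
import math

DEPOT_ID = 294  # CitiBike station ID used as the depot in the text


def initial_bikes(snapshot, depot_id=DEPOT_ID):
    """Map station id -> b_s (bikes stored in the snapshot), leaving the depot out."""
    return {
        station["id"]: station["availableBikes"]
        for station in snapshot["stationBeanList"]
        if station["id"] != depot_id
    }


def displacement(capacity, q_min, q_max):
    """Eq. (2), assuming a divisor of 2: shift that centres [q_min, q_max] in [0, capacity]."""
    return math.floor(capacity - (capacity - (q_max - q_min)) / 2) - q_max


# Toy snapshot with two stations; the second one is the depot and is skipped.
snapshot = {
    "stationBeanList": [
        {"id": 72, "availableBikes": 17, "availableDocks": 21, "totalDocks": 39},
        {"id": 294, "availableBikes": 5, "availableDocks": 30, "totalDocks": 36},
    ]
}
bikes = initial_bikes(snapshot)
shift = displacement(capacity=39, q_min=5, q_max=20)
```

With a capacity of 39 and quartile bounds 5 and 20 (made-up values), the shift is 7: the range [5, 20] moves to [12, 27], leaving 12 free positions on either side.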
Selection of stations.
Our generator accepts a size parameter that controls the number of stations that are included in the instance. The stations are then partitioned into sinks and sources, and a station drawn uniformly at random from each set is added to the instance. This strategy tries to balance sinks and sources, so that the objective function range is broader (and the generated instances are more interesting). Note that, because of the randomness in the generation process, the generated instances use, in principle, different sets of stations. However, since the number of stations in the CitiBike NYC system is limited (≈

We have built an instance generator for BBSS based on the ideas described in the previous section. The generator is available under the permissive MIT license at the address https://bitbucket.org/tunnuz/citibike-nyc-generator. Moreover, we have generated 180 instances of increasing size ∈ { , , . . . , }, which are publicly available as well, under the same license, at the address https://bitbucket.org/tunnuz/citibike-nyc-sept-13. Note that, unlike the instances available from the ADS and AIT websites, the ones generated by our software do not already include the bicycle loading / unloading times in the traveling times. The distances are thus real distances, and the loading and unloading times must be added to the cost function. This allows one to implement various policies, e.g., fixed loading / unloading times, or loading / unloading times dependent on the number of transferred bicycles.

References

[1] Mike Benchimol, Pascal Benchimol, Benoît Chappert, Arnaud De la Taille, Fabien Laroche, Frédéric Meunier, and Ludovic Robinet. Balancing the stations of a self service bike hire system. RAIRO – Operations Research, 45(1):37–61, 2011.
[2] Daniel Chemla, Frédéric Meunier, and Roberto Wolfler Calvo. Bike sharing systems: Solving the static rebalancing problem.
To appear in Discrete Optimization, 2012.
[3] Claudio Contardo, Catherine Morency, and Louis-Martin Rousseau. Balancing a Dynamic Public Bike-Sharing System. Technical Report CIRRELT-2012-09, CIRRELT, Montreal, Canada, 2012.
[4] Luca Di Gaspero, Andrea Rendl, and Tommaso Urli. Constraint-based approaches for balancing bike sharing systems. In CP'13 - The 19th International Conference on Principles and Practice of Constraint Programming, pages 758–773. Springer Berlin Heidelberg, 2013.
[5] Luca Di Gaspero, Andrea Rendl, and Tommaso Urli. A hybrid ACO+CP for balancing bicycle sharing systems. In HM'13 - The 8th International Workshop on Hybrid Metaheuristics, pages 198–212. Springer Berlin Heidelberg, 2013.
[6] Marian Rainer-Harbach, Petrina Papazek, Bin Hu, and Günther R. Raidl. Balancing bicycle sharing systems: A variable neighborhood search approach. In Martin Middendorf and Christian Blum, editors, Evolutionary Computation in Combinatorial Optimization, volume 7832 of Lecture Notes in Computer Science, pages 121–132. Springer Berlin Heidelberg, 2013.
[7] Tal Raviv, Michal Tzur, and Iris A. Forma. Static repositioning in a bike-sharing system: Models and solution approaches.