An In-Depth Analysis of Ride-Hailing Travel Using a Large-scale Trip-Based Dataset
Jianhe Du, Ph.D., P.E.
Virginia Tech Transportation Institute 3500 Transportation Research Plaza Blacksburg, VA 24061 Phone: (540) 231-2673 Fax: (540) 231-1555 [email protected]
Hesham A. Rakha, Ph.D., P.Eng. (Corresponding author)
Charles E. Via, Jr. Department of Civil and Environmental Engineering Virginia Tech Transportation Institute Virginia Polytechnic Institute and State University 3500 Transportation Research Plaza Blacksburg, VA 24061 Phone: (540) 231-1505 Fax: (540) 231-1555 [email protected]
Helena Breuer
Virginia Tech Transportation Institute 3500 Transportation Research Plaza Blacksburg, VA 24061 [email protected]
ABSTRACT
With the rapid increase in ride-hailing use, a need to better understand and regulate the industry arises. Conflicting results have been published by researchers and policy makers regarding ride-hailing’s impact on congestion, public transit, and other aspects of traffic systems. One of the obstacles to investigating ride-hailing is the lack of granular operational data over a relatively long period of time. Building an efficient system that can coordinate ride-hailing and other travel modes to better serve travel needs and minimize negative impacts on the transportation system requires sufficient trip-by-trip ride-hailing data and an understanding of ride-hailing trip patterns. Such patterns include the times that travelers use ride-hailing, where they are traveling from and to, how weekend and weekday ride-hailing trips differ, etc. This paper analyzes a year’s worth of ride-hailing trip data from the Greater Chicago Area, which includes detailed time, date, trip length, origin, and destination information, to study ride-hailing trip patterns. More than 104 million trips were analyzed. For trip rates, the results show that the total number of trips remained stable over the year, with the share of pooled trips steadily decreasing from 20% to 9%. People tend to use ride-hailing (RH) more on weekends than on weekdays. Specifically, weekend RH trip counts (per day) are, on average, 20% higher than weekday trip counts. The results of this work will help policy makers and transportation administrators better understand the nature of ride-hailing trips, which in turn allows for the design of a better regulation and guidance system for the ride-hailing industry.
INTRODUCTION
Ride-hailing (RH) is defined as an on-demand, app-based, real-time service that provides customers with door-to-door transportation. First introduced to the market by Uber in 2008, RH has been a transportation option in the U.S. for more than a decade now. RH ridership at least tripled in the 3 years from 2016 to 2019, with active Uber users reaching 111 million during the last quarter of 2019 [1]. As of January 2019, 36% of U.S. adults had either used or were currently using RH services [2]. With such a dramatic increase in RH trips, the impacts of this relatively new industry on the transportation network need to be carefully studied, and corresponding policies should be designed to regulate its operation. Multiple studies have explored RH trip patterns, impacts, correlation with other modes, optimization of ridesharing pairing methodologies, etc. For example, one consideration is the overall change in vehicle miles traveled (VMT) that RH may bring to the transportation system. Some research has concluded that pooling multiple travelers together will help decrease the overall number of cars and the resulting VMT through a pairing and optimization process [3-16]. Other studies, however, have concluded that ridesharing increases the overall mileage traveled by personal vehicles. This increase has been attributed to multiple causes: detour driving for picking up and dropping off passengers, induced travel by individuals who would not be able to drive if no ridesharing service were available, diverted travelers from public transit, etc. [17-25]. Another conflicting argument concerns the relationship of RH with public transit [13]. Some studies conclude that RH diverts travelers away from public transit, finding that after the introduction of RH services into the market, transit ridership began to decrease rapidly [15, 17, 23].
Others have found that RH contributes to a better use of the public transit system because it offers first- and last-mile connectivity with public transit or complements public transit when transit runs less frequently or is unavailable [16, 26]. There are also many studies concentrating on optimization methods or pairing models that help RH maximize its service coverage with minimum negative impact on congestion [8, 27-32]. Finally, a number of studies have concentrated on RH pricing strategies and revenues [33, 34]. The compound effect of RH on the transportation system is not a simple question and should be investigated from multiple perspectives. For example, one of the key elements in determining RH’s impact on VMT is establishing which factor dominates: the number of passengers each RH trip can serve or the extra mileage incurred in an RH trip to pick up all passengers. To decide whether RH benefits or harms public transit, an in-depth examination is needed of the time, origin, and destination of RH trips and of the corresponding public transit trips that could replace them. As shown in Figure 1, RH interacts with mode shift, traveler matching, and routing, and also induces traffic demand. To evaluate the impacts of RH and optimize the efficiency of this travel mode, a systematic modeling framework that incorporates all these factors is needed. Therefore, as a prerequisite to designing an effective system to coordinate RH with the existing transportation system and enhance RH’s efficiency, we need to start by examining the patterns of RH trips and investigating their temporal and spatial distribution. Doing so requires data at a high granularity over a long enough time period. Desirable detailed RH trip information for a comprehensive and systematic analysis would include trip purpose, departure time, origin, and destination.
Other factors that need to be considered include trip length, tolerance for (absolute) pick-up/drop-off delay or distance, tolerance for pick-up/drop-off delay with respect to the trip length or time, and tolerance for deviation from preferred departure time. However, since RH is a relatively new traffic mode and its market share has increased so rapidly, few studies have collected such large-scale data. In addition, RH companies are typically reluctant to share their data due to privacy issues. The existing RH literature typically uses the following major data sources: stated or revealed RH survey data, National Household Travel Survey data, large-scale taxi data (references), local household travel data, census data, and/or simulation [29, 31, 35-46].
Figure 1 Ridesharing interacting with other factors
It is our belief that any effort to better regulate and improve the operation of the RH industry should be based on a comprehensive understanding of the patterns and variation of RH trips. Until there is collective concurrence about the effect of RH, designers, planners, and politicians cannot make sound decisions. The goal of this study, therefore, was to obtain a large-scale RH database, examine the temporal fluctuation of trips, investigate the patterns of single-passenger RH trips (STs) as well as pooled RH trips (PTs), and make recommendations for better administration and regulation policies for the RH industry.
DATASET ACQUISITION
An ideal dataset for this study would include detailed trip-by-trip information, including pick-up/drop-off locations, departure time, number of passengers (to determine whether the trip is a PT or an ST), trip purpose, trip distance and time, passenger demographic data, etc. However, to our knowledge, no such complete dataset is available. First, RH trip datasets are scarce due to privacy issues and proprietary information concerns at RH companies. Second, RH is a relatively new industry. Finally, collecting and cleaning such a complete dataset requires some time. Despite these obstacles, we were able to find a close, if not perfectly matched, candidate dataset for our research. In November 2018, the City of Chicago began collecting trip data from transportation network companies such as Uber and Lyft. The data are published to the city’s open data portal (Chicago Data Portal [47]) and are available to the public. Three datasets are available: (1) trips, (2) drivers, and (3) vehicles. The trips database has 21 fields. Note: To protect traveler privacy, the pick-up and drop-off locations for all trips are set to the centroid of the census tract where the trip origin or destination was located.
• Trip ID
• Pick-up: time, date, location, community area
• Drop-off: time, date, location, community area
• Trip duration (seconds) and distance
• Fare (rounded to the nearest $0.50)
• Tip (rounded to the nearest dollar)
• Additional charges
• Total trip cost
• Shared trip (true/false)
• Trips pooled
• Centroid pick-up: latitude, longitude
• Centroid drop-off: latitude, longitude
The trips dataset from Chicago is of particular interest because of the granularity of the data. It is the most detailed RH trip database that we are aware of and provides trip information detailed enough to serve the research purposes of this paper.
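As an illustration of how records with the fields listed above might be prepared for analysis, the following minimal sketch parses a few rows into typed records and labels each one as pooled or single and as weekend or weekday. The column names, the sample rows, and the rule that a trip counts as pooled when its "trips pooled" value exceeds one are all assumptions made for this example; the portal's actual export and the labeling rule used in this study may differ.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

# Hypothetical column names and made-up rows, for illustration only;
# the real export may label and format these fields differently.
SAMPLE_CSV = """trip_id,pickup_time,trip_seconds,trip_miles,fare,shared_trip,trips_pooled
a1,2019-05-04 18:30:00,780,3.2,10.0,false,1
b2,2019-05-06 08:15:00,1500,6.1,7.5,true,2
c3,2019-05-06 23:45:00,600,2.4,5.0,true,1
"""


@dataclass
class Trip:
    trip_id: str
    pickup: datetime
    seconds: int
    miles: float
    fare: float
    pooled: bool   # assumption: pooled means the ride was actually shared
    weekend: bool  # pick-up fell on a Saturday or Sunday


def parse_trip(row):
    """Convert one CSV row (a dict of strings) into a typed Trip record."""
    pickup = datetime.strptime(row["pickup_time"], "%Y-%m-%d %H:%M:%S")
    return Trip(
        trip_id=row["trip_id"],
        pickup=pickup,
        seconds=int(row["trip_seconds"]),
        miles=float(row["trip_miles"]),
        fare=float(row["fare"]),
        pooled=int(row["trips_pooled"]) > 1,
        weekend=pickup.weekday() >= 5,  # Monday is 0, so 5 and 6 are the weekend
    )


def load_trips(text):
    """Parse CSV text into a list of Trip records."""
    return [parse_trip(row) for row in csv.DictReader(io.StringIO(text))]


trips = load_trips(SAMPLE_CSV)
```

Under this assumed rule, a trip for which sharing was authorized but no other rider was matched (row `c3`) is treated as a single trip.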
DATA ANALYSIS
The data from January 1, 2019 to December 31, 2019 were downloaded for use in this paper. Table 1 summarizes the RH trips. The total number of RH trips fluctuated slightly over the year; more trips occurred during the spring, with a slight dip in the summer and winter. There was a sharp decreasing trend in the percentage of RH PTs, from 20% in January to 9% in December.
TABLE 1 Total Number of Ridesharing Trips by Month (millions)
Month     Total     Pooled (%)
Jan.
Feb.
March
April
May
June
July
Aug.
Sept.
Oct.
Nov.
Dec.
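The month-by-month totals and pooled percentages summarized in Table 1 amount to a simple grouped count. A minimal sketch of that aggregation is shown below; the function name and the input records are hypothetical and made up for illustration, and they are not values from the dataset.

```python
from collections import defaultdict


def monthly_pooled_share(trips):
    """Summarize (month, is_pooled) pairs as {month: (total trips, pooled %)}."""
    totals = defaultdict(int)
    pooled = defaultdict(int)
    for month, is_pooled in trips:
        totals[month] += 1
        if is_pooled:
            pooled[month] += 1
    return {m: (totals[m], 100.0 * pooled[m] / totals[m]) for m in sorted(totals)}


# Made-up records: 5 trips in month 1 (1 pooled), 10 trips in month 12 (1 pooled).
sample = [(1, True)] + [(1, False)] * 4 + [(12, True)] + [(12, False)] * 9
summary = monthly_pooled_share(sample)
```

With the full dataset, the same function would be applied to the month of each pick-up timestamp and the pooled label of each trip.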
More RH was used on weekends than on weekdays. On average, there were 24% more RH trips per day on weekends than on weekdays. The rate of PTs, however, was higher on weekdays. The percentage of PTs ranged from 18.7% to 8.4% for weekend trips, while the percentage of PTs for weekdays ranged from 21.4% to 8.9%. The average percentage of PTs on weekdays was 4.5% higher than on weekends (Table 2). This pattern repeated itself across different months. The only exceptions occurred during holidays, especially when the holiday was connected with a weekend. We found that travelers used RH much less during holiday seasons. Figure 2 and Figure 3 show the number of trips in December and May.
TABLE 2 Trips by Weekend and Weekdays (per day, thousands)
Month        Trips/Weekend Day (% Pooled)    Trips/Weekday (% Pooled)
January      299 (18.3%)                     242 (21.6%)
February     336 (18.7%)                     269 (21.3%)
March        349 (17.4%)                     276 (20.7%)
April        324 (16.2%)                     264 (18.8%)
May          326 (15.3%)                     270 (16.7%)
June         332 (12.9%)                     266 (14.5%)
July         315 (10.9%)                     248 (11.8%)
August       323 (10%)                       252 (11%)
September    315 (9.5%)                      245 (10.5%)
October      333 (8.5%)                      256 (9.1%)
November     315 (8.4%)                      265 (9.4%)
December     298 (8.4%)                      262 (9.2%)
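The weekend-versus-weekday comparison can be checked directly against the per-day counts in Table 2. The sketch below averages the twelve monthly values without weighting by the number of days in each month, which is an assumption of this example and may differ from the averaging used in the study.

```python
# Table 2 values: (trips per weekend day, trips per weekday), in thousands.
TABLE_2 = {
    "January": (299, 242), "February": (336, 269), "March": (349, 276),
    "April": (324, 264), "May": (326, 270), "June": (332, 266),
    "July": (315, 248), "August": (323, 252), "September": (315, 245),
    "October": (333, 256), "November": (315, 265), "December": (298, 262),
}


def weekend_excess_pct(table):
    """Percent by which the mean weekend-day count exceeds the mean weekday count."""
    weekend_mean = sum(w for w, _ in table.values()) / len(table)
    weekday_mean = sum(d for _, d in table.values()) / len(table)
    return 100.0 * (weekend_mean / weekday_mean - 1.0)


excess = weekend_excess_pct(TABLE_2)
print(f"Weekend days carry {excess:.1f}% more trips than weekdays")
```

Under this unweighted averaging, the Table 2 values give a weekend excess of roughly 24%.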
Figure 2 Typical day-to-day variation in December
Figure 3 Typical day-to-day variation in May
Hourly variations were different for weekdays and weekends. Weekday trips had obvious morning and afternoon peaks, coinciding with the typical morning and afternoon commuting peaks. Weekend RH trips peaked at about 6:00–7:00 p.m. and continued at a high level of demand until after midnight. The percentage of PTs for weekends was relatively higher during the day than at night. Meanwhile, the percentage of PTs for weekday trips peaked along with the peak hours. Figure 4 shows the hourly distribution of weekend versus weekday trips for the month of May. The labels in Figure 4 are the percentages of PTs. Hourly variations and PT percentages followed a similar pattern across different months. Weekday trips had a much higher PT percentage during the evening peak hour through midnight, while weekend trips had a relatively stable PT percentage (Figure 5). As the data show, the percentage of PTs decreased consistently over the months.
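An hourly profile of this kind can be built by counting pick-ups per hour separately for weekends and weekdays. The following is a minimal sketch with made-up timestamps; it treats Saturday and Sunday as the weekend and does not handle holidays separately, which is a simplification assumed for this example.

```python
from collections import Counter
from datetime import datetime


def hourly_profile(pickups):
    """Count pick-ups per (day type, hour); day type is 'weekend' or 'weekday'."""
    counts = Counter()
    for ts in pickups:
        day_type = "weekend" if ts.weekday() >= 5 else "weekday"
        counts[(day_type, ts.hour)] += 1
    return counts


def peak_hour(counts, day_type):
    """Return the hour of day with the most pick-ups for the given day type."""
    hours = {h: n for (d, h), n in counts.items() if d == day_type}
    return max(hours, key=hours.get)


# Made-up pick-up times: Monday, May 6 and Saturday, May 4, 2019.
sample = [
    datetime(2019, 5, 6, 8, 5), datetime(2019, 5, 6, 8, 40),
    datetime(2019, 5, 6, 17, 30),
    datetime(2019, 5, 4, 18, 10), datetime(2019, 5, 4, 18, 50),
    datetime(2019, 5, 4, 23, 15),
]
profile = hourly_profile(sample)
```

Dividing each hourly count by the number of weekend days or weekdays in the month would put the two profiles on a comparable per-day basis.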
Figure 4 Hourly distribution of trips in May
Figure 5 PT percentages by weekends and weekdays
For the lengths and travel times, as Figure 6 shows, more than 50% of the RH trips were shorter than 15 minutes and more than 60% were less than 5 miles. Travel time and distance distributions all follow a log-normal distribution, as shown by the Q-Q plot in Figure 7. Therefore, the means and standard deviations (SDs) for each group (PT versus ST, weekend versus weekday) can be compared directly. Table 3 lists the means and SDs for each trip group. Figure 8 illustrates the cost, travel distance, and time over 12 months. As expected, PTs were, on average, longer both time- and distance-wise than STs, while they cost much less for each passenger. In terms of travel time for STs, both weekday and weekend trips were slightly longer during spring and summer (March to July) than during the rest of the year. PTs, however, were longer in winter months (October, November, December) for both weekdays and weekends. As the data show, overall, the means and SDs for all groups were relatively larger from spring to autumn. Winter typically had the smallest means in travel times, distances, and costs. The only exceptions were the PTs from October to December, when trips increased both in means and SDs. STs fluctuated the least in distances for both weekends and weekdays, though they did vary more in the spring in terms of cost and travel times. PTs had a larger mean in travel time and SDs during summer.
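A Q-Q comparison against the log-normal distribution can be sketched as follows: take the logarithm of each observation, fit a normal distribution to the logs, and pair the sorted logs with the fitted quantiles. The plotting positions (i + 0.5)/n, the function names, and the synthetic sample are assumptions of this example and are not taken from the study.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def lognormal_qq(values):
    """Return (theoretical, observed) quantile pairs on the log scale."""
    logs = sorted(math.log(v) for v in values)
    n = len(logs)
    fitted = NormalDist(fmean(logs), stdev(logs))
    # Plotting positions (i + 0.5) / n are one common convention, assumed here.
    theoretical = [fitted.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, logs))


def qq_correlation(pairs):
    """Pearson correlation of the Q-Q pairs; values near 1 suggest a good fit."""
    xs, ys = zip(*pairs)
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic travel times in minutes, drawn from a log-normal for illustration.
rng = random.Random(42)
times = [rng.lognormvariate(2.5, 0.6) for _ in range(500)]
pairs = lognormal_qq(times)
r = qq_correlation(pairs)
```

Points that fall close to a straight line, or equivalently a correlation close to 1, are consistent with a log-normal distribution; this is a visual and descriptive check rather than a formal goodness-of-fit test.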
Figure 6 Travel time and distance in May
Figure 7 Q-Q plot for travel time distribution against the log-normal distribution (axes: X quantiles, Y quantiles)
TABLE 3 Means and Standard Deviation
                    Jan  Feb  March  April  May  June  July  Aug  Sept  Oct  Nov  Dec
WD-Pool Cost
WD-Pool Dist
WD-Pool Time
WD-Single Cost
WD-Single Dist
WD-Single Time
Wknd-Pool Cost
Wknd-Pool Dist
Wknd-Pool Time
Wknd-Single Cost
Wknd-Single Dist
Wknd-Single Time
Figure 8 Travel cost (per person), time, and distance distribution
Table 4 lists the results of two-sample t-tests for weekday vs. weekend trips. As the table shows, PTs were not significantly different in cost and travel distance for the two groups. However, STs were significantly different between the two groups, with weekday trips being longer in travel distance or time and more expensive than weekend trips.
TABLE 4 Two-Sample T-Test Weekday Vs. Weekend
               T (P-value)       Difference in Means
Pooled Cost    -1.417 (0.17)     [-0.749, 0.141]
Pooled Dist    -0.721 (0.478)    [-1.321, 0.639]
Pooled Time
Single Cost
Single Dist
Single Time
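For readers who want to reproduce a comparison of this kind, the sketch below computes a two-sample t statistic. The text does not state which variant of the test was used or what the sampling unit was, so the pooled-variance (Student) form shown here is an assumption, and the input numbers are made up for illustration.

```python
import math
from statistics import fmean, variance


def two_sample_t(a, b):
    """Student's two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (fmean(a) - fmean(b)) / se, df


# Made-up group values, e.g. mean trip cost per month for each day type.
weekday = [10.0, 12.0, 14.0]
weekend = [7.0, 9.0, 11.0]
t_stat, df = two_sample_t(weekday, weekend)
```

Turning the statistic into a p-value requires the cumulative distribution of Student's t with `df` degrees of freedom, which the Python standard library does not provide; a statistics package would be needed for that step.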
CONCLUSION AND DISCUSSION
This paper analyzed a large trip-based ride-hailing dataset collected in Chicago covering the whole year of 2019. The goal of this research was to conduct an in-depth data exploration of RH trips in terms of trip attributes and temporal variations. The results of the study and discussions are summarized as follows. For trip rates, the results show that the total number of trips remained stable over the year, with the share of pooled trips steadily decreasing from 20% to 9%. This appears to indicate that either travelers prefer to travel alone after getting familiar with the RH travel mode or that travelers were not satisfied with the pooled trip experience. One positive contribution of RH to the transportation system is its capability of combining multiple trips such that the total mileage traveled can be decreased. Our results showed that unless effective policies are enforced in the future to encourage this positive contribution by urging passengers to travel together, RH will generate extra mileage within the traffic network. People tend to use RH more on weekends than on weekdays. Weekend RH trip counts (per day) are, on average, 20% higher than weekday trip counts. If a holiday is connected with a weekend, however, that trend no longer holds. For the hourly distribution, weekday trips align with the typical morning and evening commuting peaks. Weekend trips are concentrated from 5:00 p.m. through midnight. Among these trips, the rate of pooled trips is consistently higher on weekdays, with three obvious peaks: the morning and evening commuting peaks and midnight. Weekend PT percentages, by contrast, fluctuate relatively little over the 24 hours. Since no trip purpose information is available in the database, we can only infer from these observations, based on the trip departure times, that travelers tend to use RH more for leisure trips during weekends.
Meanwhile, when travelers use RH on weekdays, they are more likely to pool with others for commuting and nighttime leisure trips. The time and distance distributions are similar across weekdays and weekends and across PTs and STs. RH trips are concentrated in the range of shorter than 15 minutes and less than 5 miles, indicating that travelers use RH for short trips. Comparing weekend and weekday trips, we observe that travel time and distance for STs are significantly different for weekdays and weekends. Specifically, weekday trips are longer than weekend trips in both distance and time, and are therefore more expensive. Our conclusions are compatible with previous studies using stated-preference surveys in that travelers tend to use RH for short leisure trips during weekends and for work trips during weekdays. PTs only account for a small percentage of all RH trips. Further, weekday and weekend trips are statistically different from each other. These conclusions indicate that RH may not be a solution for decreasing traffic congestion and VMT if no other effective regulations are imposed to encourage and reward pooled trips. Meanwhile, policies should be formulated to help the RH industry coordinate with other travel modes to better serve travel needs and minimize the negative impacts of this newly developed travel mode.
ACKNOWLEDGMENTS
This paper is sponsored by The Urban Mobility & Equity Center (UMEC).
AUTHOR CONTRIBUTIONS
The authors confirm contribution to the paper as follows: study conception and design, Jianhe Du and Hesham Rakha; data collection, Jianhe Du and Helena Breuer; data analysis and interpretation of results, Jianhe Du and Hesham Rakha; draft manuscript preparation, Jianhe Du, Hesham Rakha, and Helena Breuer. All authors reviewed the results and approved the final version of the manuscript.
REFERENCES
1. STATISTA. Monthly number of Uber's active users worldwide from 2017 to 2020, by quarter (in millions). Social Science Research Network, 2016. 2002: p. 1-29.
4. Alexander, L. and M. Gonzalez. Assessing the impact of real-time ridesharing on urban traffic using mobile phone data. Urban Computing, 2015. August: p. 1-9.
5. Blerim Cici, A.M., Enrique Frias-Martinez, Nikolaos Laoutaris. Assessing the Potential of Ride-Sharing Using Mobile and Social Data -- A Tale of Four Cities. UBICOMP. Seattle, WA, 2014.
6. Paolo Santi, G.R., Michael Szell, Stanislav Sobolevsky, Steven Strogatz, Carlo Ratti. Quantifying the benefits of vehicle pooling with shareability networks. PNAS, 2014. 111 (37): p. 13290-13294.
7. Shuo Ma, Y.Z., Ouri Wolfson. T-Share: A Large-Scale Dynamic Taxi Ridesharing Services. ICDE Conference. Brisbane, Australia, 2013.
8. Sheldon Jacobson, D.K. Fuel saving and ridesharing in the US: Motivations, limitations and opportunities. Transportation Research Part D: Transport and Environment, 2009. 14: p. 14-21.
9. Alonso-Mora, J., et al. On-demand high-capacity ride-sharing via dynamic trip-vehicle assignment. PNAS, 2017. 114 (3): p. 462-467.
10. Amey, A.M. Real-Time Ridesharing: Exploring the Opportunities and Challenges of Designing a Technology-based Rideshare Trial for the MIT Community, 2010, Massachusetts Institute of Technology: Boston, Massachusetts.
11. Niels Agatz, A.E., Martin Savelsbergh, Xing Wang. Dynamic Ride-Sharing: a Simulation Study in Metro Atlanta. Berkeley, California, 2011.
12. Niels A.H. Agatz, A.L.E., Martin W.P. Savelsbergh, Xing Wang. Dynamic ride-sharing: A simulation study in metro Atlanta. Transportation Research Part B: Methodological, 2011. 45 (9): p. 1450-1464.
13. Contreras, S.D. and A. Paz. The effects of ride-hailing companies on the taxicab industry in Las Vegas, Nevada. Transportation Research Part A: Policy and Practice, 2018. 115: p. 63-70.
14. Murphy, C. and S. Feigon. Shared Mobility and the Transformation of Public Transit, 2016, Shared-Use Mobility Center (SUMC).
15. Rayle, L., et al. Just a better taxi? A survey-based comparison of taxis, transit, and ridesourcing services in San Francisco. Transport Policy, 2016. 45: p. 168-178.
16. Stiglic, M., et al. Enhancing urban mobility: Integrating ride-sharing and public transit. Computers & Operations Research, 2018. 90: p. 12-21.
17. Graehler, M., R. Mucci, and G. Erhardt. Understanding the Recent Transit Ridership Decline in Major US Cities: Service Cuts or Emerging Modes? TRB 2019 Annual Meeting. Washington, D.C., 2018.
18. Rayle, L., et al. App-Based, On-Demand Rider Services: Comparing Taxi and Ridesourcing Trips and User Characteristics in San Francisco, 2017, University of California Transportation Center (UCTC).
19. Regina Clewlow, G.S.M. Disruptive Transportation: The Adoption, Utilization, and Impacts of Ride-Hailing in the United States, 2017, Institute of Transportation Studies, University of California Davis, California.
20. Schaller, B. The New Automobility: Lyft, Uber and the Future of American Cities, 2018, Schaller Consulting: New York. p. 1-41.
21. Schaller, B. Unsustainable? The growth of app-based ride services and traffic, travel and the future of New York City, 2017, Schaller Consulting: New York. p. 1-38.
22. Henao, A. Impacts of Ridesourcing – Lyft and Uber – on Transportation Including VMT, Mode Replacement, Parking, and Travel Behavior, 2017, University of Colorado: Denver, Colorado.
23. Henao, A. and W.E. Marshall. The impact of ride-hailing on vehicle miles traveled. Transportation, 2018.
24. H.-S. Jacob Tsao, D.-J.L. Spatial and Temporal Factors in Estimating the Potential of Ride-sharing for Demand Reduction, 1999, California PATH Program, Institute of Transportation Studies: Berkeley, CA. p. 1-62.
25. Circella, G., et al. The Adoption of Shared Mobility in California and Its Relationship with Other Components of Travel Behavior, 2018, National Center for Sustainable Transportation: Davis, California.
26. Murphy, C. Shared Mobility and the Transformation of Public Transit, 2016, Shared-Use Mobility Center (SUMC).
27. Altshuler, T., et al. Modeling and Prediction of Ride-Sharing Utilization Dynamics. Journal of Advanced Transportation, 2019. 2019: p. 1018.
28. Furuhata, M., et al. Ridesharing: The state-of-the-art and future directions. Transportation Research Part B: Methodological, 2013. 57: p. 28-46.
29. Agatz, N.A.H., et al. Dynamic ride-sharing: A simulation study in metro Atlanta. Transportation Research Part B: Methodological, 2011. 45 (9): p. 1450-1464.
30. Santi, P., et al. Quantifying the benefits of vehicle pooling with shareability networks. Proc Natl Acad Sci U S A, 2014. 111 (37): p. 13290-4.
31. Alonso-Mora, J., et al. On-demand high-capacity ride-sharing via dynamic trip-vehicle assignment. Proceedings of the National Academy of Sciences, 2017. 114 (3): p. 462-467.
32. Berbeglia, G., et al. European Journal of Operational Research. European Journal of Operational Research, 2010. 202: p. 8-15.
33. Bimpikis, K., O. Candogan, and D. Saban. Spatial Pricing in Ride-Sharing Networks. Operations Research. 67 (3): p. 1-72.
34. Banerjee, S., C. Riquelme, and R. Johari. Pricing in Ride-Share Platforms: A Queueing-Theoretic Approach. SSRN, 2015. February.
35. Clewlow, R.R. and G.S. Mishra. Disruptive Transportation: The Adoption, Utilization, and Impacts of Ride-Hailing in the United States, 2017, Institute of Transportation Studies: Davis, California.
36. Babar, Y. and G. Burtch. Examining the Impact of Ridehailing Services on Public Transit Use. SSRN, 2017.
37. Zhang, Y. and Y. Zhang. Exploring the Relationship between Ridesharing and Public Transit Use in the United States. International Journal of Environmental Research and Public Health, 2018. 15 (8): p. 1-23.
38. Barann, B., D. Beverungen, and O. Müller. An open-data approach for quantifying the potential of taxi ridesharing. Decision Support Systems, 2017.
39. Lo, J. and S. Morseman. The Perfect uberPOOL: A Case Study on Trade-Offs. Ethnographic Praxis in Industry Conference Proceedings, 2018. 2018 (1): p. 195-223.
40. Hu, H.-H., et al. Exploring the Methods of Estimating Vehicle Miles of Travel. Western Regional Science Association 51st Annual Meeting. 2012.
41. Clewlow, R. Shared-Use Mobility in the United States: Current Adoption and Potential Impacts on Travel Behavior. TRB Annual Meeting. Washington, D.C., 2015.
42. Clewlow, R.R. Carsharing and sustainable travel behavior: Results from the San Francisco Bay Area. Transport Policy, 2016. 51: p. 158-164.
43. Wang, Y., R. Kutadinata, and S. Winter. Activity-based ridesharing: Increasing flexibility by time geography. San Francisco, CA, 2016.
44. Hoffman, K. and P.G. Ipeirotis. Ridesharing and the Use of Public Transportation. Thirty Seventh International Conference on Information Systems. Dublin, Ireland, 2016.
45. Dopplet, L. Need a Ride? Uber Can Take You (Away from Public Transportation), 2018, Georgetown University: Washington, D.C.
46. Leard, B., J. Linn, and C. Munnings. Explaining the Evolution of Passenger Vehicle Miles Traveled in the United States, 2016: Resources for the Future.
47. Chicago. Chicago Data Portal. 2020; Available from: https://data.cityofchicago.org/.