Analyzing and comparing door-to-door travel times for air transportation using aggregated Uber data

Abstract

Improving the passenger air travel experience is one of the explicit goals set by the Next Generation Air Transportation System in the United States and by the Advisory Council for Aeronautics Research in Europe FlightPath 2050. Both suggest door-to-door travel times as a potential metric for these objectives. In this paper, we propose a data-driven model to estimate door-to-door travel times and compare the reach and performance of different access modes to a city, as well as conduct segment analysis of full door-to-door trips. This model can also be used to compare cities with respect to the integration of their airport within their road structure. We showcase multiple applications of this full door-to-door travel time model to demonstrate how the model can be used to locate where progress can be made.



Philippe Monmousseau, Aude Marzuoli, Eric Feron and Daniel Delahaye


I. INTRODUCTION

Both in Europe and in the United States, national or supra-national agencies promote the need for seamless door-to-door travel and data sharing. Both were deemed necessary by the European Commission’s 2011 White Paper [1] and were reconfirmed by the Federal Aviation Administration (FAA) in 2017 [2]. Data sharing was already a main focus in the early 2000s; in response, Europe created and adopted SWIM (System Wide Information Management) [3], and the FAA followed suit. The Next Generation Air Transportation System (NextGen) [4] in the United States and the Advisory Council for Aeronautics Research in Europe (ACARE) Flightpath 2050 [1] both aim to have a more passenger-centric approach. To this end, ACARE Flightpath 2050 sets some ambitious goals, which are not all measurable yet due to lack of available data. For example, it aims at having 90% of travelers within Europe being able to complete their door-to-door journey within 4 hours. In the US, the Joint Planning and Development Office has proposed and tested metrics regarding NextGen’s goals [5], but the passenger-centric metrics, especially regarding door-to-door travel times, are still missing.

Cook et al. [6] first explored the shift from flight-centric to passenger-centric metrics in the project POEM (Passenger Oriented Enhanced Metrics), where they propose propagation-centric and passenger-oriented performance metrics and compare them with existing flight-centric metrics. Later, Laplace et al. [7] introduce the concept of Multimodal, Efficient Transportation in Airports and Collaborative Decision Making (META-CDM); they propose to link both airside CDM and landside CDM, thus taking passenger perspective into account. In this perspective, Kim et al. [8] propose an airport gate scheduling model for improved efficiency with a balance between aircraft, operator and passenger objectives.
Dray et al. [9] illustrate the importance of multimodality by considering ground transportation as well during major disturbances of the air transportation system in order to offer better solutions to passengers.

The estimation of door-to-door travel time for multi-modal trips has been previously studied, but for trips contained within the same metropolitan area. Peer et al. [10] focus on commutes within a Dutch city by studying door-to-door travel times and schedule delays for daily commuters, and show that, for the estimation of the overall travel time, it is important to consider the correlation of travel times across different road links. Salonen and Toivonen [11] investigate the need for comparable models and measures for trips by car or public transport with focus on the city of Helsinki. Their multi-modal approach takes into account the walking and waiting times necessary to reach a station or a parking place. Duran-Hormazabal and Tirachini [12] analyze travel time variability for multi-modal trips within the city of Santiago, Chile, using human surveyors and GPS data to estimate the time spent in the different transportation modes, namely walking, driving a car, riding a bus and taking the subway. Pels et al. [13] analyze the relative importance of access time to airports in the passengers’ choice within the San Francisco Bay Area based on a passenger survey, offering perspective from air transportation. These works emphasize the importance of considering all relevant modes when estimating door-to-door travel times, but are limited in scope with respect to the size of the area considered and the amount of data available.

Thanks to the increasing use of mobile phones as data sources, larger scale studies with a focus on air transportation have been possible. In the United States, Marzuoli et al. [14] implement and validate a method to detect domestic air passengers using mobile phone data available on a nationwide scale.
Though the main focus of this work is the passenger behavior at airports, the granularity of the data facilitates analysis of each phase within the full door-to-door trip. Marzuoli et al. [15] then combine mobile phone data with social media data to analyze passenger experiences in airports during a major disruption. In Europe, within the BigData4ATM project, Garcia-Albertos et al. [16] also present a methodology for measuring door-to-door travel times using mobile phone data, illustrated through trips between Madrid and Barcelona. Mobile phone data are, however, proprietary data, which are difficult to access for research.

Grimme and Martens [17] propose a model analyzing the feasibility of the 4-hour goal proposed by FlightPath 2050 based on airport-to-airport flight times and a simplified model of access to and from airports. Sun et al. [18] use open source maps and datasets to calculate door-to-door minimum travel time estimations in order to study the possible competitiveness of air taxis.

In the upcoming sections, the model and analysis presented are also based on already available online data but with a post-operation approach. The aim of this model is to create a method to measure the average door-to-door travel times once trips are completed, in order to analyze and compare available modes of transportation. We have applied the first version of this method to two intra-European multi-modal trips, thus comparing air transportation and rail transportation [19]. We then used an improved version of the same method, leveraging four different data sources (ride-sharing, flight, phone, and census data) and adapted to the conditions in the United States, to compare trips using direct flights between five US cities, three of them on the West Coast and the other two on the East Coast [20].

In this paper, we offer a data-driven model for the computation of door-to-door travel times that harnesses recently available data along with public data.
The data-driven methods developed can be applied for most multi-modal trips between two cities where relevant data are available and are not limited to the air transportation system. The range of new analyses available using this model is illustrated with multiple modal analysis of an intra-European trip, a per-leg analysis of multiple intra-USA trips and an analysis of the impact of severe weather disruptions. These analyses have direct applications for passengers, urban planners and decision makers and highlight the difference between taking a flight-centric approach to the air transportation system and taking a passenger-centric approach.

Section II of this paper presents the data-driven, full door-to-door travel time model; Section III showcases a first set of analyses and applications facilitated by this model for trips between Amsterdam and Paris; Section IV focuses on a set of analyses for trips within the United States where more data are available; finally, Section V concludes this paper and proposes future research directions.

II. THE FULL DOOR-TO-DOOR DATA-DRIVEN MODEL

Similarly to [16] and [18], we can deconstruct the travel time $T$ for trips with direct flights or direct train links into five different trip phases, represented in Figure 1 and summarized in equation (1),

$$T = t_{to} + t_{dep} + t_{in} + t_{arr} + t_{from}, \qquad (1)$$

where
• $t_{to}$ is the time spent traveling from the start of the journey to the departure station (e.g. train station or airport),
• $t_{dep}$ is the time spent waiting and going through security processes (if any) at the departure station,
• $t_{in}$ is the time actually spent in flight or on rails,
• $t_{arr}$ is the time spent at the arrival station (e.g. going through security processes),
• $t_{from}$ is the time spent traveling from the arrival station to the final destination.

Figure 1: Model of the full door-to-door travel time.

The full model for door-to-door travel time proposed in this paper is established by data-driven methods used to calculate the values of the different times contained in equation (1). These data-driven methods are described in Sections II-A through II-D.

This study focuses on air and rail transportation as main transportation modes, which give the value of $t_{in}$, though the process can also be applied to inter-city bus trips. Furthermore, it is assumed that passengers travel by road when arriving at or leaving the main station (airport or train station) for the calculation of $t_{to}$ and $t_{from}$. In response to data availability, the case studies only consider six major US cities (Atlanta, Boston, Los Angeles, Seattle, San Francisco and Washington D.C.) and two European capitals (Amsterdam and Paris).

A. Travel time from the origin location to the departure station and from the arrival station to the final destination

We can estimate the road transit times from origin location to departure station ($t_{to}$) and from arrival station to final destination ($t_{from}$) by using aggregated and publicly available data from taxi or ride-sharing services. Uber [21] is a ride-sharing service launched in 2010 and located in major urban areas on six continents; it has recently released anonymized and aggregated travel time data for some of the urban areas where it operates. The available data consist of the average, minimum and maximum travel times between different zones (e.g. census tracts in the case of US cities) within the serviced area from all Uber rides, aggregated over five different periods for each considered day. The five considered periods, used throughout this study, are defined as follows:
• Early Morning: from midnight to 7am
• AM: from 7am to 10am
• Midday: from 10am to 4pm
• PM: from 4pm to 7pm
• Late Evening: from 7pm to midnight

There are days when the travel times between some zones are only aggregated at a daily level. Travel times are associated with their mean starting door time, i.e. the mean of all the time stamps from the trip contained in the zone of departure.

Since Uber was initially introduced in the US, the impact of Uber in US urban transit has already been the focus of several studies prior to this data release. Li et al. [22] conclude that, at an aggregated level, Uber tends to decrease congestion in the US urban areas where it was introduced. Later, Erhardt et al. [23] build a model showing that ride-sharing companies do increase congestion, using the example of San Francisco. Hall et al. [24] focus on whether Uber complemented or substituted public transit by studying the use of public transit systems before and after Uber’s entry date in different US cities.
Wang and Mu [25] study Uber’s accessibility in Atlanta, GA (US) by using the average wait time for a ride as a proxy and conclude that Uber use is not associated with a specific social category. Following the release of Uber data, Pearson et al. [26] propose a traffic flow model based on this aggregated Uber data and use it to analyze traffic patterns for seven cities world-wide.

Assuming Uber rides are part of the road traffic flow, this study considers that Uber’s travel times are an acceptable proxy of the actual travel times by road. In cities where buses do not have specific road lanes, these travel times are a valid proxy for both car and bus trips. This paper limits its scope to road access to and egress from the considered stations. The analysis of subway alternatives is not considered in this paper.

Each US city is divided into its census tracts, Paris into the IRIS zones used by INSEE [27] for census, and Amsterdam into its official districts, called wijk.

B. Dwell time at stations

The dwell time at a station, either $t_{dep}$ or $t_{arr}$, is defined as the time spent at the station, whether going through security processes, walking through the station, or waiting. The time spent at each station depends on the mode considered, the specific trip, and whether the passenger is departing or arriving. The dwell time at departure can be split into two components,

$$t_{dep} = t_{sec} + t_{wait}, \qquad (2)$$

a processing time, $t_{sec}$, necessary to get through security (if any) and through the station to the desired gate or track, and an extra wait time, $t_{wait}$, due to unanticipated delays.

Processing times at US airports are based on the average wait times at airports extracted from the study of Marzuoli et al. [14]. The six US airports under study in this paper are: Hartsfield-Jackson Atlanta International Airport (ATL), Boston’s Logan International Airport (BOS) and Ronald Reagan Washington National Airport (DCA) for the East Coast, Los Angeles International Airport (LAX), Seattle-Tacoma International Airport (SEA) and San Francisco International Airport (SFO) for the West Coast. Processing times at European airports are assumed invariant between airports and determined using most airline recommendations. The three European airports under study are: Paris Charles de Gaulle Airport (CDG), Paris Orly Airport (ORY) and Amsterdam Airport Schiphol (AMS).

The average dwell times at these airports are summarized in Table I for US airports and in Table II for European airports.

Table I: Average dwell time spent at US airports in minutes.

                    ATL  BOS  DCA  LAX  SEA  SFO
Time at departure   110  105  100  125  105  105
Time at arrival      60   40   35   65   50   45

Table II: Average dwell time spent at European airports in minutes.

                    AMS  CDG  ORY
Time at departure    90   90   90
Time at arrival      45   45   45

With regard to processing times at train stations, based on the recommendation of the train station websites, the departure dwell time is set at 15 minutes and the arrival dwell time is set at 10 minutes for all train stations. We can improve these estimates by gathering data from GPS or mobile phone sources as well as WiFi beacons within airports and train stations, and by using a method similar to Nikoue et al. [28].

We can calculate the extra wait times when the scheduled and actual departure or arrival times are available. For US airports, these wait times are calculated only for departure using the publicly available data from the Bureau of Transportation Statistics (BTS) [29]. They were obtained by subtracting the scheduled departure time from the actual flight departure time.
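As a minimal sketch of equation (2) and of the extra wait-time calculation described above, the following Python snippet computes $t_{wait}$ and $t_{dep}$ for one flight. The function names and the sample flight are hypothetical, and using the ATL departure value of Table I as $t_{sec}$ is an assumption made only for illustration.

```python
from datetime import datetime


def extra_wait_minutes(scheduled: datetime, actual: datetime) -> float:
    """t_wait: actual departure time minus scheduled departure time, in minutes."""
    return (actual - scheduled).total_seconds() / 60.0


def departure_dwell_minutes(t_sec: float, t_wait: float) -> float:
    """Equation (2): t_dep = t_sec + t_wait."""
    return t_sec + t_wait


# Hypothetical flight: scheduled at 08:00, actually departed at 08:25.
scheduled = datetime(2018, 1, 15, 8, 0)
actual = datetime(2018, 1, 15, 8, 25)

t_wait = extra_wait_minutes(scheduled, actual)
# 110 minutes is the ATL departure value from Table I, used here as t_sec (assumption).
t_dep = departure_dwell_minutes(110, t_wait)
print(t_wait, t_dep)  # 25.0 135.0
```

Note that a flight leaving ahead of schedule yields a negative $t_{wait}$ in this sketch, since the text describes a plain subtraction.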

C. Time in flight or on rail

1) US flights: The actual flight time was calculated based on the data from BTS using the actual departure/arrival times of all direct flights between each city pair from January 1st, 2018 to March 31st, 2018.

2) European trips: In Europe, we assume that flights and trains are on time and follow a weekly schedule, due to a lack of publicly available centralized flight schedule data. The weekly schedules are extracted from actual train and flight schedules gathered over a period of several months and are assumed applicable over the full period under study.
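The weekly-schedule assumption can be illustrated with a short Python sketch that expands a weekly timetable into dated trips over a study period. The data layout, the function name `expand_schedule` and the two sample rows are hypothetical and are not taken from the schedules used in the paper.

```python
from datetime import date, datetime, time, timedelta

# Hypothetical weekly schedule: (weekdays served, departure, arrival), Monday = 0.
WEEKLY = [
    ({0, 1, 2, 3, 4, 5, 6}, time(6, 50), time(8, 10)),  # every day
    ({4}, time(14, 45), time(16, 5)),                   # Fridays only
]


def expand_schedule(weekly, start: date, end: date):
    """Return (departure, arrival) datetimes for every day in [start, end]."""
    trips = []
    day = start
    while day <= end:
        for weekdays, dep, arr in weekly:
            if day.weekday() in weekdays:
                trips.append((datetime.combine(day, dep), datetime.combine(day, arr)))
        day += timedelta(days=1)
    return trips


# One full week, Monday January 1st to Sunday January 7th, 2018.
trips = expand_schedule(WEEKLY, date(2018, 1, 1), date(2018, 1, 7))
print(len(trips))  # 8: seven daily departures plus one Friday departure
```

Because trains and flights are assumed to be on time, the scheduled arrival time returned here doubles as the actual arrival time for European trips.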

D. Full door-to-door travel time

Our model assumes that travelers plan their departure time to arrive at the departure station exactly $t_{sec}$ minutes (eq. (2)) before the scheduled departure time of their flight or train. We use this assumption to determine the value of $t_{to}$ since it defines the period of the day to consider when extracting the Uber average time from the origin location to the departure station. We extract the value of $t_{from}$ by using the actual arrival time of the flight or train. When only daily aggregated times are available in the Uber data, these times are used as a proxy for each period of the day.

III. FLIGHTS VERSUS TRAINS: A COMPARISON OF DIFFERENT ACCESS MODES TO PARIS

Let us consider a traveler leaving from Amsterdam city center to reach the Paris area. We have chosen the city center of Amsterdam as it covers both tourists and business travelers, but the proposed door-to-door travel time model and subsequent analysis can be applied from any zone. Three possible means of transportation are under study for this trip: plane from Amsterdam Airport Schiphol to Paris Charles de Gaulle airport (CDG) or to Paris Orly (ORY), or train from Amsterdam Centraal to Paris Gare du Nord (GDN).

A. Flight and train schedules

As in Section II-C2, the flight and train schedules used for this study are weekly schedules. The weekly flight schedules between Amsterdam Airport Schiphol (AMS) and CDG or ORY were extracted from the actual flight schedules from December 2019 to January 2020, and it was assumed that the obtained weekly schedules would run from January 1st through the end of the period under study.

Table III: Extracted weekly schedule from Amsterdam to Paris via ORY.

Mo Tu We Th Fr Sa Su  Ams.  Paris
x x x x  10:25  11:45
x  14:45  16:05
x x x x x  18:50  20:10
x  19:40  21:00

Table IV: Extracted weekly schedule from Amsterdam to Paris via CDG.

Mo Tu We Th Fr Sa Su  Ams.  Paris
x x x x x x x  06:50  08:10
x x x x x x x  07:20  08:45
x x x x x x x  08:10  09:35
x x x x x x x  09:30  10:55
x x x x x x x  10:20  11:45
x x x x x x x  12:25  13:40
x x x x x x x  13:55  15:15
x x x x x x x  14:50  16:10
x x x x x x x  16:35  17:50
x x x x x x x  17:45  19:00
x x x x x x  19:10  20:30
x x x x x x x  20:25  21:45

The weekly train schedule between Amsterdam Centraal station and Paris Gare du Nord (GDN) is similarly extracted from the actual train schedule of the year 2019 and applied to the year 2018. It is summarized in Table V. Night trains are not considered for this study.

Table V: Extracted weekly schedule from Amsterdam to Paris via GDN.

Mo Tu We Th Fr Sa Su  Ams.  Paris
x x x x x x  06:15  09:35
x x x x x x x  07:15  10:38
x x x x x x x  08:15  11:35
x x x x x x  09:15  12:35
x x  10:15  13:38
x x x x x x x  11:15  14:35
x x x x x x x  13:15  16:38
x x x x x x  14:15  17:35
x x x x x x x  15:15  18:35
x x x x x x  16:15  19:35
x x x x x x  17:15  20:35
x x x x x x x  18:15  21:38
x x x x x x  19:15  22:35
x  20:15  23:38

These schedules already highlight the major differences between the three considered modes. The flight schedule through ORY contains the fewest possibilities, limited to two flights daily, whereas the other two modes offer hourly scheduled transport. Another notable difference can be seen with respect to the station-to-station travel times: flights between Amsterdam and Paris (both CDG and ORY) take about 1h20 (1h15 to 1h25 in Tables III and IV), whereas the trains listed in Table V take 3h20 to 3h23.

B. Average total travel time mode comparison

The proposed data-driven door-to-door model can be used to evaluate and compare the range of each considered mode, which helps to better understand the urban structure and behavior from a transportation point of view.

For each trip $\tau$ (flight or rail) over the period $D$, which starts on January 1st, the average full door-to-door travel time $\bar{T}(\tau)$ is calculated as

$$\bar{T}(\tau) = \bar{t}_{to}(\tau) + t_{dep}(\tau) + t_{in}(\tau) + t_{arr}(\tau) + \bar{t}_{from}(\tau), \qquad (3)$$

where $\bar{t}_{to}(\tau)$ is the average ride time between Amsterdam city center and the departure station (AMS or Amsterdam Centraal) for the trip $\tau$, and $\bar{t}_{from}(\tau)$ is the average ride time from the arrival station (CDG, ORY or GDN) to the final arrival zone for the trip $\tau$.

The same daily periods as those used in the Uber data (see Section II-A) are considered here to categorize the trips into five groups depending on the time of arrival at the final destination. For each day $d$ and each period $p$, the mean per arrival zone $z$ of the average door-to-door travel times is calculated for each mode $m$,

$$E^{d,p}_z(m) = \frac{1}{|\mathcal{T}^{d,p}_z|} \sum_{\tau \in \mathcal{T}^{d,p}_z} \bar{T}(\tau), \qquad (4)$$

where $\mathcal{T}^{d,p}_z$ is the set of flight and rail trips that end at zone $z$ on day $d$ and period $p$.

For each day $d$, each period $p$ and each arrival zone $z$, the mode with the shortest mean travel time $E^{d,p}_z(m)$ is kept. The number of times $N^p_z(m)$ a mode $m$ has had the shortest mean travel time is counted for each zone $z$ and for each period $p$ over the twenty-one month period $D$,

$$N^p_z(m) = \left| \{ d \in D \mid m = \arg\min_{m'} E^{d,p}_z(m') \} \right|. \qquad (5)$$

This distribution of modes over the different zones can help travelers choose the mode of transport that is best suited depending on the desired arrival zone and on the desired time of arrival. It can also help urban planners to better understand the road network linking the different stations to the city.

Figure 2 shows the fastest mode to reach the different zones in the Paris dataset for the five different periods of the day used by the Uber dataset.
For each zone $z$ and each period $p$, the fastest mode associated is the mode $m$ having the highest $N^p_z(m)$, i.e. the highest number of days with the lowest average total travel time over the considered date range. The zones best reached through CDG are indicated in blue, ORY in red and GDN in green.

(a) Morning (AM) (b) Midday
Figure 2: Comparison of the average total travel times to the Paris area between the three considered arrival stations (CDG: blue, ORY: red, GDN: green) for a trip starting from Amsterdam city center for different trip termination periods.

We can draw several conclusions from these maps. The absence of zones best reached by plane via ORY (in red) is particularly noticeable in the morning period: an important area of south-west Paris is reached by Uber rides neither from GDN nor from CDG. These maps would advocate for an increase in frequency for the AMS-ORY flights from a traveler’s perspective.

From a structural perspective, the highway linking Paris to CDG is visible on all three maps since it enables travelers through GDN to reach zones close to CDG faster than if they flew to CDG directly. The perimeter highway circling Paris is also a major aid to GDN and is visible on the maps where there is an important competition between GDN and CDG. The section of the perimeter highway farthest from GDN (i.e. in the south-west of Paris) is, however, overtaken by either airport depending on the period of the day. The rest of GDN’s influence zone is fairly invariant from one period to another.

Using a similar map representation for trips ending in the afternoon, not shown here for space considerations, ORY’s range is limited during the afternoon, with CDG taking over some zones close to ORY. This is essentially due to the limited number of flights landing in the afternoon (one per week, every Friday) compared to the daily arrival of CDG flights.
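Equations (4) and (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(day, arrival_hour, zone, mode, total_minutes)`, the function names and the sample values are hypothetical, and the period boundaries follow the five periods listed in Section II-A.

```python
from collections import Counter, defaultdict
from statistics import mean


def period_of(hour: int) -> str:
    """Map an hour of day (0-23) to one of the five periods of Section II-A."""
    if hour < 7:
        return "Early Morning"
    if hour < 10:
        return "AM"
    if hour < 16:
        return "Midday"
    if hour < 19:
        return "PM"
    return "Late Evening"


def mean_times(trips):
    """Equation (4): mean door-to-door time per (day, period, zone, mode)."""
    groups = defaultdict(list)
    for day, hour, zone, mode, total in trips:
        groups[(day, period_of(hour), zone, mode)].append(total)
    return {key: mean(values) for key, values in groups.items()}


def fastest_mode_counts(E):
    """Equation (5): number of days each mode is the fastest, per (period, zone)."""
    best = {}  # (day, period, zone) -> (mean time, mode)
    for (day, period, zone, mode), value in E.items():
        key = (day, period, zone)
        if key not in best or value < best[key][0]:
            best[key] = (value, mode)
    counts = defaultdict(Counter)
    for (day, period, zone), (_, mode) in best.items():
        counts[(period, zone)][mode] += 1
    return counts


# Hypothetical trips to a single zone over three days, all ending at midday.
trips = [
    ("d1", 12, "z1", "CDG", 270), ("d1", 13, "z1", "CDG", 280),
    ("d1", 12, "z1", "GDN", 250),
    ("d2", 11, "z1", "CDG", 240), ("d2", 12, "z1", "GDN", 255),
    ("d3", 14, "z1", "CDG", 265), ("d3", 14, "z1", "GDN", 245),
]
N = fastest_mode_counts(mean_times(trips))
print(N[("Midday", "z1")].most_common(1))  # [('GDN', 2)]
```

The mode reported on the maps for a zone and period is then the one with the highest count, here GDN.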

C. Average total travel time distribution analysis

Once the fastest mode to reach each zone is determined, it is possible to analyze the fastest full door-to-door travel time for each zone. This approach gives an overview of the level of integration of airports, train stations, and road structure and can indicate zones that are less reachable than others and would thus require more attention from urban planners.

The fastest full door-to-door travel time associated with a zone $z$ at a period $p$ is calculated as the average fastest travel time to reach zone $z$ at period $p$ across all modes and over the period $D$,

$$\bar{E}^p_z = \frac{1}{|D|} \sum_{d \in D} \min_m E^{d,p}_z(m). \qquad (6)$$

Figure 3 displays the fastest full door-to-door travel times $\{\bar{E}^p_z\}_{p,z}$ to reach the different zones in the Paris dataset for trips finishing in the morning or at midday. The color scale that indicates the fastest full door-to-door travel times is identical in all subfigures. The contour of each zone indicates the fastest mode to reach it according to the results from Section III-B, using the same color code as Figure 2, i.e. the zones reached faster through CDG are surrounded in blue, ORY in red and GDN in green.

For a better comparison, the distribution of the number of zones per period reached within four time intervals (less than 4 hours, between 4 hours and 4 hours and 30 minutes, between 4 hours and 30 minutes and 5 hours, and more than 5 hours) is presented in Table VI.

Table VI: Number of zones per mode and period of the day grouped by full door-to-door travel time intervals. The original dataset is the same as that used to generate Figure 3.

Mode  Time interval     Early   AM  Midday   PM  Late
CDG   t ≤ 4h                0    4       6    5    11
      4h < t ≤ 4h30         …    …       …    …     …
      4h30 < t ≤ 5h         0  653     433  498   384
      t > 5h              187    0      15   13    11
GDN   t ≤ 4h              398  247     247  290   314
      4h < t ≤ 4h30         …    …       …    …     …
      4h30 < t ≤ 5h         0   14       8    6     8
      t > 5h                0    0       0    0     0
ORY   t ≤ 4h                0    0       0    0     0
      4h < t ≤ 4h30         …    …       …    …     …
      4h30 < t ≤ 5h         0    0     397  259   425
      t > 5h                0    0       0    0     1
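As an illustration of equation (6) and of the interval grouping used in Table VI, the sketch below averages the daily fastest times for a zone and assigns the result to one of the four intervals. The input layout, the function names and all numerical values are hypothetical.

```python
from collections import Counter
from statistics import mean


def fastest_time(E_by_day):
    """Equation (6): average over days of the fastest per-mode mean travel time.

    `E_by_day` maps each day d to a dict {mode: E(m)} for one zone and one period.
    """
    return mean(min(per_mode.values()) for per_mode in E_by_day.values())


def interval(minutes: float) -> str:
    """Assign a travel time in minutes to one of the four intervals of Table VI."""
    if minutes <= 240:
        return "t <= 4h"
    if minutes <= 270:
        return "4h < t <= 4h30"
    if minutes <= 300:
        return "4h30 < t <= 5h"
    return "t > 5h"


# Hypothetical daily mean times (minutes) for two zones in one period.
zones = {
    "z1": {"d1": {"CDG": 275, "GDN": 235}, "d2": {"CDG": 260, "GDN": 241}},
    "z2": {"d1": {"CDG": 280, "ORY": 290}, "d2": {"CDG": 284, "ORY": 276}},
}
fastest = {zone: fastest_time(days) for zone, days in zones.items()}
histogram = Counter(interval(t) for t in fastest.values())
print(fastest)    # {'z1': 238, 'z2': 278}
print(histogram)  # one zone in 't <= 4h', one in '4h30 < t <= 5h'
```

Dividing an interval count by the total number of zones gives shares of the kind used to assess the 4-hour goal.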

We see an asymmetry between the north and the south of Paris by looking only at the time color scale in Figure 3. The number of green zones, i.e. zones reachable in less than 4 hours and 20 minutes, and the surface covered by these green zones are larger in the northern part of Paris than in the southern part of Paris, including in areas close to Paris Orly airport. Combining this observation with the contour color of each zone suggests that Paris Orly airport is not as well integrated in the Parisian road structure as Paris Charles de Gaulle airport, which can make it less attractive for travelers desiring to travel by air and thus less competitive.

We can complete a more quantitative analysis from Table VI, with some of the main findings listed here:
• Only 10% of the arrival zones are reached in less than 4 hours from Amsterdam city center.
• Zones that can be reached in less than 4 hours from Amsterdam city center are overwhelmingly reached by train through Paris Gare du Nord (98%).
• A trip from Amsterdam city center to Paris going through ORY always takes more than 4 hours.
• 78% of the arrival zones are reached in less than 4 hours and 30 minutes from Amsterdam city center when combining all three possible modes.

(a) Morning (AM) (b) Midday
Figure 3: Comparison of the fastest full door-to-door travel times to the Paris area between the three considered arrival stations (CDG: blue, ORY: red, GDN: green) for a trip starting from Amsterdam city center for different trip termination periods. The contour color of each zone indicates the fastest mode to reach it.

Therefore, we can use these results to assess how well the 4-hour goal from ACARE FlightPath 2050 is being met.

D. Reliability issues

The proposed full door-to-door travel time model assumes that passengers choose their departure time in order to arrive exactly $t_{sec}$ before the scheduled departure of their plane or train and that they also know how long it takes to reach the departure station. However, in reality, there is an uncertainty in the time the traveler will spend reaching the airport and in airport processing times. This uncertainty often leads to an additional buffer time, implying an earlier departure time for the traveler. Using the presented model with the available data, we can find the most reliable mode to use per arrival zone. The most reliable mode for a given arrival zone is defined as the mode with the lowest variability in travel time, i.e. the mode where the difference between the maximum travel time and the minimum travel time to reach that zone is the lowest. This comparison is useful for passengers or trips that require an accurate arrival time rather than a minimum travel time.

Figure 4 shows the most reliable mode on average to reach the different zones in the Paris dataset for trips finishing in the morning or at midday. As for the previous analysis, the period was determined using the departure time of the full door-to-door trip and uses the same color code, i.e. the zones reached most reliably through CDG are indicated in blue, ORY in red and GDN in green. For each zone and each period of the day, the most reliable mode associated is the mode having the highest number of days with the lowest average travel time variability over the considered date range.

Though Figure 4 and Figure 2 are similar, there are some major differences between average efficiency and average reliability. For example, though it is on average faster to reach the zones close to the highway leading to CDG by train, after 10:00 it is safer from a time variability perspective to reach them via CDG.
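The reliability criterion defined above, the smallest spread between maximum and minimum travel time, can be sketched as follows. The function name and the sample values are hypothetical; the snippet handles a single zone, day and period, whereas the maps additionally count over days which mode is most often the most reliable.

```python
def most_reliable_mode(ranges):
    """Return the mode whose (max - min) travel-time spread is the smallest.

    `ranges` maps mode -> (min_minutes, max_minutes) for one zone, day and period.
    """
    return min(ranges, key=lambda mode: ranges[mode][1] - ranges[mode][0])


# Hypothetical minimum and maximum door-to-door times (minutes) to one zone.
ranges = {"CDG": (255, 275), "GDN": (240, 290), "ORY": (270, 300)}
print(most_reliable_mode(ranges))  # CDG (spread of 20 minutes)
```

In this made-up example GDN offers the shortest possible trip (240 minutes) but CDG is the most reliable, which mirrors the distinction between average efficiency and average reliability.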
From a reliability perspective, CDG has claimed almost all of the zones surrounding it, except in the early morning, when trips through GDN are still better. When we compare all three modes using this metric, it appears that GDN has the greatest decrease in competitiveness, with its range smaller than when considering average travel times.

E. Impact of faster processing times

The major difference between air and rail travel is the necessary processing time both at departure and at arrival. In this particular study, with a flight time of about 80 minutes, the current assumption of a departure processing time of 90 minutes implies that travelers spend more time at their departure airport than in flight, which greatly impacts the rapidity of air travel. With the presented model, one can modify these assumed processing times in order to study the impact of improving these times both from an airport perspective and a passenger perspective. Let’s assume that the processing time at airports is improved from 90 to 60 minutes at departure and from 45 to 30 minutes at arrival. These modifications could be achieved in reality considering that this is an intra-Schengen trip and that there isn’t any border control.

Figures 5 and 6 show which is the fastest mode on average to reach the different zones in the Paris dataset for trips arriving at destination early in the morning or at midday.

The first major difference with this processing time improvement can be seen for trips arriving in the early morning (Figure 5): all zones previously reached through CDG are no longer accessed at this period since they were associated with the 21:45 flight of the previous day. This indicates that all trips through CDG start and end on the same day, with no trips finishing after midnight.

Looking at trips arriving at midday (Figure 6), trips through CDG are greatly advantaged by this time improvement, with CDG taking over more than half of GDN’s previous influence zone. This range increase from CDG can be explained both by faster door-to-door travel times and by the increase of trips through CDG arriving at midday (rather than in the afternoon). In fact, except in the early morning, GDN loses its competitiveness against both airports, with its range greatly shrinking in size. The competition between CDG and ORY remains unchanged, which is understandable since they both received the same processing time improvement.

(a) Morning (AM) (b) Midday
Figure 4: Comparison of the average variability of travel times to the Paris area between the three considered arrival stations (CDG: blue, ORY: red, GDN: green) for a trip starting from Amsterdam city center for different trip termination periods.

A quantitative analysis similar to the analysis presented in Table VI concludes that all trips are now conducted in less than five hours and that 99.8% of the reachable zones are reached in less than 4h30. ORY sees some major improvements, with 97.5% of the zones best reached through it reached in less than four hours (compared to no trips in less than 4h in the initial model), while increasing the number of zones it reaches the fastest.

Using a map representation similar to Section III-C, but not presented here due to space considerations, it is possible to notice a 20 to 30 minute shift in the time distribution for every period except for early morning trips, since train processing times were unchanged. The upper bound travel time is also unchanged for trips arriving in the morning, which would indicate that for some zones, the processing time improvement resulted in no improvement or even a worsening of the full trip travel time. Besides that exception, in this case a 45-minute improvement in airport processing time leads only to a maximum of 30 minutes of average total travel time improvement due to the influence of train trips through GDN.

(a) Early morning with faster processing times (b) Early morning with normal processing times
Figure 5: Comparison of the average total travel times to the Paris area assuming faster airport processing times between the three considered arrival stations (CDG: blue, ORY: red, GDN: green) for a trip starting from Amsterdam city center for trips arriving at destination early in the morning. The corresponding map with normal processing times is also reproduced.

IV. A MULTI-MODAL ANALYSIS OF THE US AIR TRANSPORTATION SYSTEM

Additional insights are gained from this full door-to-doortravel time model thanks to the availability of complementarydata. The United States is a federal state the size of a continent,therefore various aggregated and centralized datasets are more (a) Midday with faster processing times (b) Midday with normal processing times

Figure 6: Comparison of the average total travel times to the Paris area assuming faster airport processing times between thethree considered arrival stations (CDG: blue, ORY: red, GDN: green) for a trip starting from Amsterdam city center for tripsarriving at destination at midday. The corresponding maps with normal processing times are also reproduced.easily available to all. Several of these datasets are used in thissection to add applications to the presented full door-to-doormodel. This US study is limited to the period from January1st 2018 to March 31st 2018.

A. Flight schedule

As presented in the model definition in Section II-C1, both the scheduled flight times and the actual flight schedules of most domestic flights can be obtained via the Department of Transportation Bureau of Transportation Statistics (BTS) [29]. This study considers only the six US airports presented in Section II-B: three East-coast airports, Hartsfield-Jackson Atlanta International Airport (ATL), Boston's Logan International Airport (BOS) and Ronald Reagan Washington National Airport (DCA), and three West-coast airports, Los Angeles International Airport (LAX), Seattle-Tacoma International Airport (SEA) and San Francisco International Airport (SFO). During this three-month period, 38,826 flights were considered, which corresponds to 3,523 early flights, 8,170 morning flights, 13,451 midday flights, 6,695 afternoon flights and 6,987 evening flights. The full door-to-door travel times were then calculated for each scheduled flight from January 1st, 2018 to March 31st, 2018, using the model presented in Section II.

B. Leg analysis
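As a minimal sketch of the leg analysis developed in this subsection, the share of each phase can be computed per trip and then averaged per city pair. The phase names, the record layout and the numerical values below are hypothetical placeholders chosen for illustration; they are not taken from the datasets used in this study.

```python
from collections import defaultdict

# Hypothetical phase names for one door-to-door trip (durations in minutes).
PHASES = (
    "ride_to_airport",
    "departure_airport",
    "flight",
    "arrival_airport",
    "ride_from_airport",
)


def phase_shares(trip):
    """Return the percentage of the full trip time spent in each phase."""
    total = sum(trip[p] for p in PHASES)
    return {p: 100.0 * trip[p] / total for p in PHASES}


def average_shares_by_city_pair(trips):
    """Average the per-trip percentages for each (origin, destination) pair."""
    sums = defaultdict(lambda: dict.fromkeys(PHASES, 0.0))
    counts = defaultdict(int)
    for trip in trips:
        pair = (trip["origin"], trip["destination"])
        shares = phase_shares(trip)
        for p in PHASES:
            sums[pair][p] += shares[p]
        counts[pair] += 1
    return {
        pair: {p: sums[pair][p] / counts[pair] for p in PHASES}
        for pair in sums
    }


# Illustrative values only (not measured data).
example_trips = [
    {"origin": "SFO", "destination": "LAX", "ride_to_airport": 30,
     "departure_airport": 90, "flight": 80, "arrival_airport": 45,
     "ride_from_airport": 25},
    {"origin": "SFO", "destination": "LAX", "ride_to_airport": 40,
     "departure_airport": 90, "flight": 85, "arrival_airport": 45,
     "ride_from_airport": 40},
]
averages = average_shares_by_city_pair(example_trips)
```

Averaging percentages per trip, rather than dividing summed durations, gives every trip the same weight regardless of its length, which matches the description of an average percentage per city pair.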

We can use the full door-to-door model to better understand the time spent in each leg relative to the time spent on the overall trip. For each trip, we calculate the percentage of time spent in each phase based on the full door-to-door travel time. We then calculate the average percentage of time spent in each phase for each city pair. Figure 7 shows the bar plot of these average percentages for the thirty considered city pairs. The city pairs are sorted according to the percentage of time spent in the actual flight phase. With the proposed full door-to-door model, for all considered trips, passengers spend on average more time at the departure airport than riding to and from the airports. This figure also shows that, with this model, for some short-haul flights, such as between SFO and LAX or between BOS and DCA, passengers spend on average more time at the departure airport than in the plane. Refining the full door-to-door model by considering tailored airport processing times t_sec at departure depending on the city pair, and not only on the departure airport, could lead to a different conclusion. However, this modification of the model would require more access to passenger data.

Figure 7: Bar plot of the average proportion of the time spent within each phase of the full door-to-door journey for all thirty considered trips.

C. Airport integration comparison
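The comparison in this subsection regresses the average ride time on the distance to the airport. A minimal sketch of the two computations involved is given below, under stated assumptions: the distance is approximated by the great-circle (haversine) formula on a spherical Earth rather than by a true ellipsoidal geodesic, the regression is an ordinary least-squares fit, and the function names and sample values are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Illustrative values only: distance to the airport (km) and ride time (min).
distances_km = [0.0, 10.0, 20.0]
ride_times_min = [5.0, 25.0, 45.0]
slope, intercept = fit_line(distances_km, ride_times_min)
```

In this reading, the slope is expressed in minutes per kilometre, so a larger slope for one airport than for another means that each additional kilometre costs more ride time.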

The proposed full door-to-door model allows us to compare each airport's integration within its metropolitan area. Each census tract is associated with an internal point within its boundaries, and this internal point can be used to automatically calculate the distance between airports and each census tract of their metropolitan area. The internal points were defined using an online database based on the US government 2010 census. Figure 8 shows the scatter plot of the average daily ride time to each airport versus the geodesic distance to the airport for the six considered airports. The geodesic distance is the shortest distance between two points along the surface of the Earth. Additionally, the plot shows a linear regression of these average times with respect to the distance to the airport. A steeper slope for the linear regression indicates that it takes longer to reach the airport from a given distance.

Figure 8: Scatter plot of the average ride time to the airport t_to versus the distance to the airport from January 1st [...]

D. Impact of severe weather analysis

Using the same door-to-door travel time visualization process and applying it to different days can be a tool to better analyze the effects of severe weather perturbations on the full door-to-door journey. As an example, the winter storm previously studied in [15] is analyzed for trips between Washington D.C. and Boston using Figure 9. This winter storm hit the East Coast of the United States on January 4th [...]

Figure 9: Average door-to-door travel times from Washington D.C. city hall to Boston over a single day, before and after the Bomb Cyclone of January 2018. Panels: (a) Before landfall, on January 2nd, 2018; (b) After landfall, on January 5th, 2018.

[...] from Washington D.C. city hall on January 2nd [...] by the weather to prevent rides from the airport to reach them.

E. On the importance of a passenger-centric approach to delays

A final application of the full door-to-door model presented in this paper emphasizes the difference between flight delay and passenger delay. Since Uber splits the day into five different periods, each with its own traffic idiosyncrasies with respect to peak times, we can calculate how much extra travel time is required for a passenger when a flight does not arrive in the scheduled period. For example, a flight expected to arrive in the early morning that lands after 10:00 AM could result in the passenger getting stuck in traffic when trying to leave the airport. Though airlines are not responsible for road traffic, passengers can choose flights based on their arrival time to avoid peak time traffic.

To calculate this extra travel time at an aggregated level, we calculate, for each arrival zone, the difference in average travel time between the two periods concerned by a flight not arriving according to schedule. These travel time differences are then aggregated into one travel time difference per city pair by weighting the travel time associated with each census tract by the proportion of passengers starting or finishing their trip there. The number of passengers originating from or finishing within a census tract is assumed to be proportional to the population density of the considered census tract.

Another measure of sensitivity is the largest difference, over all zones, between the maximum travel times of the two considered periods. This second measure indicates the worst variation of the travel time upper bound, i.e. the maximum difference a traveler can experience going from the airport to their final destination zone.

Let us consider flight UA460 from LAX to SFO, scheduled to arrive on Thursday February 15, 2018 at 18:02 local time, which landed with a minor delay of 16 minutes. Due to the 45-minute processing time required to leave the airport, this 16-minute delay shifts the departure period from the airport from afternoon (PM) to late evening.
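The period shift and the weighted aggregation described above can be sketched as follows. The period boundaries are an assumption (early morning 00:00-07:00, AM 07:00-10:00, midday 10:00-16:00, PM 16:00-19:00, evening 19:00-24:00), as are the function names and the toy zone values; only the 45-minute arrival processing time and the flight times come from the text.

```python
from datetime import datetime, timedelta

# Assumed period boundaries: (name, start hour inclusive, end hour exclusive).
PERIODS = [
    ("early morning", 0, 7),
    ("AM", 7, 10),
    ("midday", 10, 16),
    ("PM", 16, 19),
    ("evening", 19, 24),
]

# Processing time required to leave the arrival airport (from the text).
ARRIVAL_PROCESSING = timedelta(minutes=45)


def period_of(moment):
    """Return the name of the period containing the given datetime."""
    for name, start, end in PERIODS:
        if start <= moment.hour < end:
            return name
    raise ValueError("hour outside 0-23")


def airport_exit_period(landing_time):
    """Period in which passengers leave the airport after processing."""
    return period_of(landing_time + ARRIVAL_PROCESSING)


def weighted_extra_time(zone_diffs, zone_weights):
    """Aggregate per-zone travel time differences (minutes) into one value.

    zone_weights are hypothetical population densities, used as a proxy
    for the share of passengers ending their trip in each zone.
    """
    total_weight = sum(zone_weights[z] for z in zone_diffs)
    weighted_sum = sum(zone_diffs[z] * zone_weights[z] for z in zone_diffs)
    return weighted_sum / total_weight


# Flight UA460: scheduled at 18:02, landed 16 minutes late.
scheduled = datetime(2018, 2, 15, 18, 2)
actual = scheduled + timedelta(minutes=16)
scheduled_period = airport_exit_period(scheduled)
actual_period = airport_exit_period(actual)

# Toy per-zone differences between the two periods (illustrative only).
toy_diffs = {"zone_a": 10.0, "zone_b": 20.0}
toy_weights = {"zone_a": 3.0, "zone_b": 1.0}
toy_extra = weighted_extra_time(toy_diffs, toy_weights)
```

With these assumed boundaries, the scheduled arrival of UA460 leaves the airport at 18:47 (PM), whereas the delayed arrival leaves at 19:03 (evening), so a delay shorter than the width of any period is enough to change the traffic conditions the passenger faces.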
The aggregated average extra travel time is 15 minutes and 40 seconds, i.e. a 16-minute flight delay triggered an average total delay of about 31 minutes for the passengers. Looking at the second measure, the maximum travel time difference for this flight delay is 72 minutes, meaning that a passenger could experience a delay of 88 minutes resulting from this 16-minute flight delay. This first example illustrates that passenger delays and aircraft delays are distinct.

Paradoxically, a flight arriving earlier than scheduled does not necessarily mean that the full door-to-door trip ends earlier. For example, flight VX1929 from LAX to SFO, scheduled to arrive on Thursday February 8, 2018 at 15:22 local time, actually landed 25 minutes early. As a result, the passengers no longer left the airport in the afternoon (PM) period but at midday. The aggregated average extra travel time is here 15 minutes and 2 seconds, so on average travelers did arrive earlier than scheduled, but only by about ten minutes and not the twenty-five minutes announced by the airline. However, looking at the second measure again, the maximum ride time difference is 66 minutes and 44 seconds, which means that a passenger could end up arriving about forty minutes later than if the flight had landed on time.

V. DISCUSSION & CONCLUSION

By leveraging Uber's recently released data and combining it with several other available data sources, this paper proposed a data-driven model for the estimation of full door-to-door travel times for multi-modal trips both in Europe and in the United States. Though the model is used for one city pair in Europe and six different cities in the United States, it can be implemented between any world city pair with sufficient available ride-sharing or taxi data. The proposed model can be adapted depending on how much data about the considered travel modes are available, since a weekly schedule containing no information relative to delays can already lead to some meaningful insights from both a passenger perspective and a city planner perspective.

Once aggregated at a city level, the presented door-to-door travel time model can be used for a paired comparison of the different phases of the full door-to-door journey between two cities. It also enables us to analyze the actual time spent while travelling between two specific cities. The evaluation on a national level of some of the passenger-centric objectives proposed within NextGen in the United States and within ACARE FlightPath 2050 in Europe is possible thanks to the proposed model, especially the objectives regarding how well integrated airports are within their cities. The model can also provide insight into how multi-modal trips are affected by severe weather disruptions, indicating where improvements can be made. It also brings a valuable measurement of the difference between flight delays and passenger delays, emphasizing the need for passenger-centric metrics to evaluate the performance of the air transportation system, which does not consist solely of planes.

Future studies should consider integrating additional data into the model once they become available, such as alternative transportation modes (e.g. the subway), when calculating the time needed to reach the departure station or leave the arrival station.
Additionally, knowledge of the actual daily proportion of passengers travelling via the different approach modes (road or rail) would improve the precision of the proposed full door-to-door travel time model. Aggregated information from GPS or mobile phone sources could possibly be used to determine this proportion without infringing passenger privacy.

ACKNOWLEDGMENTS

The authors would like to thank Nikunj Oza from NASA-Ames, the BDAI team from Verizon Media in Sunnyvale as well as the Ecole Nationale de l'Aviation Civile and King Abdullah University of Science and Technology for their financial support. The authors are also grateful for the help of Marine Lebeater for her feedback. The authors would also like to thank the SESAR Joint Undertaking for its support of the project ER4-10 “TRANSIT: Travel Information Management for Seamless Intermodal Transport”. Data retrieved from Uber Movement, (c) 2020 Uber Technologies, Inc., https://movement.uber.com.

REFERENCES

[1] M. Darecki, C. Edelstenne, E. Fernandez, P. Hartman, J.-P. Herteman, M. Kerkloch, I. King, P. Ky, M. Mathieu, G. Orsi, G. Schotman, C. Smith, and J.-D. Wörner, Flightpath 2050: Europe's Vision for Aviation; Maintaining Global Leadership and Serving Society's Needs; Report of the High-Level Group on Aviation Research, E. Commission, Ed. Luxembourg: European Commission, 2011.
[2] NextGen Priorities - Joint Implementation Plan Update Including the Northeast Corridor. Federal Aviation Administration, Oct. 2017.
[3] J. Meserole and J. Moore, “What is System Wide Information Management (SWIM)?” in [...]. Portland, OR, USA: IEEE, Oct. 2006, pp. 1–8.
[4] NextGen Integration and Implementation Office, “NextGen Implementation Plan,” in Federal Aviation Administration, 2009.
[5] Y. O. Gawdiak and T. Diana, “NextGen Metrics for the Joint Planning and Development Office,” vol. 11th AIAA Aviation Technology, Integration, and Operations (ATIO) Conference, including the AIAA Balloon Systems Conference and 19th AIAA Lighter-Than, 2011.
[6] A. Cook, G. Tanner, S. Cristóbal, and M. Zanin, “Passenger-Oriented Enhanced Metrics,” 2012.
[7] I. Laplace, A. Marzuoli, and E. Feron, “META-CDM: Multimodal, Efficient Transportation in Airports and Collaborative Decision Making,” in Airports in Urban Networks 2014 (AUN2014), Paris, 2014.
[8] S. H. Kim, A. Marzuoli, J.-P. Clarke, D. Delahaye, and E. Feron, “Airport Gate Scheduling for Passengers, Aircraft, and Operation,” in Tenth USA/Europe Air Traffic Management Research and Development Seminar (ATM2013), Chicago, Illinois, Jun. 2013.
[9] L. Dray, A. Marzuoli, and A. Evans, “Air Transportation and Multimodal, Collaborative Decision Making during Adverse Events,” in Eleventh USA/Europe Air Traffic Management Research and Development Seminar (ATM2015), Lisboa, Portugal, Jun. 2015.
[10] S. Peer, J. Knockaert, P. Koster, Y.-Y. Tseng, and E. T. Verhoef, “Door-to-door travel times in Revealed Preference departure time choice models: An approximation method using GPS data,” Transportation Research Part B: Methodological, vol. 58, pp. 134–150, 2013.
[11] M. Salonen and T. Toivonen, “Modelling travel time in urban networks: Comparable measures for private car and public transport,” Journal of Transport Geography, vol. 31, pp. 143–153, Jul. 2013.
[12] E. Durán-Hormazábal and A. Tirachini, “Estimation of travel time variability for cars, buses, metro and door-to-door public transport trips in Santiago, Chile,” Research in Transportation Economics, vol. 59, pp. 26–39, Nov. 2016.
[13] E. Pels, P. Nijkamp, and P. Rietveld, “Access to and competition between airports: A case study for the San Francisco Bay area,” Transportation Research Part A: Policy and Practice, vol. 37, no. 1, pp. 71–83, Jan. 2003.
[14] A. Marzuoli, E. Boidot, E. Feron, and A. Srivastava, “Implementing and Validating Air Passenger–Centric Metrics Using Mobile Phone Data,” Journal of Aerospace Information Systems, vol. 16, no. 4, pp. 132–147, Apr. 2019.
[15] A. Marzuoli, P. Monmousseau, and E. Feron, “Passenger-centric metrics for Air Transportation leveraging mobile phone and Twitter data,” in Data-Driven Intelligent Transportation Workshop - IEEE International Conference on Data Mining 2018, Singapore, Nov. 2018.
[16] P. García-Albertos, O. G. Cantú Ros, R. Herranz, and C. Ciruelos, “Understanding Door-to-Door Travel Times from Opportunistically Collected Mobile Phone Records,” in SESAR Innovation Days 2017, 2017.
[17] W. Grimme and S. Maertens, “Flightpath 2050 revisited – An analysis of the 4-hour-goal using flight schedules and origin-destination passenger demand data,” Transportation Research Procedia, vol. 43, pp. 147–155, 2019.
[18] X. Sun, S. Wandelt, and E. Stumpf, “Competitiveness of on-demand air taxis regarding door-to-door travel time: A race through Europe,” Transportation Research Part E: Logistics and Transportation Review, vol. 119, pp. 1–18, Nov. 2018.
[19] P. Monmousseau, D. Delahaye, A. Marzuoli, and E. Feron, “Door-to-door travel time analysis from Paris to London and Amsterdam using Uber data,” in Ninth SESAR Innovation Days, Athens, Greece, 2019.
[20] P. Monmousseau, D. Delahaye, A. Marzuoli, and E. Feron, “Door-to-door Air Travel Time Analysis in the United States using Uber Data,” in [...]
[...] Science Advances, vol. 5, no. 5, May 2019.
[24] J. D. Hall, C. Palsson, and J. Price, “Is Uber a substitute or complement for public transit?” Journal of Urban Economics, vol. 108, pp. 36–50, Nov. 2018.
[25] M. Wang and L. Mu, “Spatial disparities of Uber accessibility: An exploratory analysis in Atlanta, USA,” Computers, Environment and Urban Systems [...]
[...] arXiv:1508.04839 [cs]