Bayesian hierarchical models for the prediction of the driver flow and passenger waiting times in a stochastic carpooling service
Panayotis Papoutsis a,b,c, Bertrand Michel a,b, Anne Philippe b and Tarn Duong c

a Department of Computing and Mathematics, Nantes Central Engineering School, F-44300, France; b Jean Leray Mathematics Laboratory, University of Nantes, F-44300, France; c Ecov, F-44200, France
ARTICLE HISTORY
Compiled July 20, 2020
ABSTRACT
Carpooling is an integral component in smart carbon-neutral cities, in particular to facilitate home-work commuting. We study an innovative carpooling service developed by the start-up Ecov, which specialises in home-work commutes in peri-urban and rural regions. When a passenger makes a carpooling request, a designated driver is not assigned as in a traditional carpooling service; rather the passenger waits for the first driver, from a population of non-professional drivers who are already en route, to arrive. We propose a two-stage Bayesian hierarchical model to overcome the considerable difficulties, due to the sparsely observed driver and passenger data from an embryonic stochastic carpooling service, in delivering high-quality predictions of driver flow and passenger waiting times. The first stage focuses on the driver flows, whose predictions are aggregated at the daily level to compensate for the data sparsity. The second stage converts this single daily driver flow into sub-daily (e.g. hourly) predictions of the passenger waiting times. We demonstrate that our model mostly outperforms frequentist and non-hierarchical Bayesian methods on observed data from an operational carpooling service in Lyon, France, and we also validate our model on simulated data.
KEYWORDS
Hierarchical modelling; Gamma regression; GPS traces; MCMC; Multi-level moving average
1. Introduction
Providing ecologically sustainable transportation that is accessible for all is one of the key challenges in the transition to post-carbon societies. An innovative solution devised by the French start-up Ecov is a carpooling service for rural and urban peripheral regions. These regions are often neglected by the start-up sector, as it tends to focus on technology-savvy populations in dense urban regions. These carpooling services, whilst having been initiated in the private sector, are developed and operated in close collaboration with local government authorities in order to satisfy the mobility requirements of these marginalised areas with sparser population and physical/digital infrastructure. The key innovation brought to the market by Ecov is the provision of carpooling lines, which closely resemble traditional bus lines. These carpooling lines link physical meeting points between which carpooling is assured at suitable regularity. This concentrates the demand and the supply of carpooling, so that a critical mass is reached more quickly and more sustainably. The meeting points are placed strategically in highly frequented areas, taking into account various factors such as aggregated traffic flow, socioeconomic characteristics, pedestrian accessibility, local government regulations, etc. Pick-ups and drop-offs at locations other than these meeting points are not facilitated by the provider, though they are not disallowed, which ensures more flexibility than bus lines. The meeting points resemble bus shelters, except that where a passenger waiting for a bus usually only requires a simple hand gesture to the driver to indicate that they intend to embark, the carpooling passenger must
make an explicit carpooling request on an electronic console. This request is then displayed on an electronic sign on the roadside, which informs all passing drivers of a passenger request to a specified destination. The driver is not allocated in advance: this real-time, stochastic matching between a passenger and a driver is a major distinguishing feature of Ecov carpooling services in contrast to their competitors (such as Uber, Lyft, Kapten etc.). It is this stochastic matching between a passenger and a flow of potential drivers, along with the aggregating effects of the physical meeting points, that enables carpooling to reach economic feasibility in sparsely populated regions.

The stochastic matching is, from a mathematical and technological point of view, more difficult than deterministic passenger-driver matching when it comes to providing a reliable waiting time for a driver arrival. In the latter, a reliable waiting time for a passenger request requires only the tracking of an allocated driver, whereas stochastic matching requires both (a) the tracking of multiple potential drivers and (b) an understanding of the general driver flow. The digital technological infrastructure is key in delivering reliable waiting times to passengers. Ecov provides users with a mobile phone application for their carpooling services: passengers receive updates about the waiting time for a driver arrival, and drivers receive notifications of passengers waiting at the meeting points and, crucially, are able to share their GPS locations in real-time with Ecov. These driver GPS traces, by providing pertinent information, ensure the quality of the carpooling service.

CONTACT Panayotis Papoutsis. Email: [email protected]
This information includes the daily driver flow and the passenger waiting time, which we focus on in this paper.

Due to the complexity of the relationship between the driver GPS traces and the passenger waiting times, and the scarcity of the observed data due to the novelty of the stochastic carpooling service, we propose a hierarchical approach in which we first build predictive models of the potential driver flow from the observed GPS driver traces. At the time when a passenger request is made, we do not have sufficiently detailed knowledge of the instantaneous potential driver flow, so we model this driver flow first as a moving average of previous driver flows. We then model the passenger waiting time as a regression model with covariates based on the driver traffic flow modelled in the first stage.

In the flowchart in Figure 1, our Bayesian multi-level hierarchical model is composed of two nested stages. The input data (driver GPS traces) are preprocessed, as outlined in [11], so that they are suitable as subsequent input into the hierarchical models themselves. The first model is a multi-level moving average model whose coefficients θ have levels depending on the current day type: working day, weekend, public holiday or school holiday. Bayesian multi-level models are crucially able to model the driver flows with these overlapping levels suitably (e.g. the driver flow for a public holiday which is also a school holiday differs from that of public holidays outside of the school holiday period). These levels are known to be highly influential [1]. The output from the first hierarchical model is the daily driver flow, which is the immediate input to the second hierarchical model. The latter is a Gamma regression, whose regression coefficient β has S components, one for each of the time intervals into which a 24-hour period is divided. The role of β is to assign the daily traffic flow to each of these sub-daily time intervals.
The output of this second hierarchical model is the temporal profile of the passenger waiting times for each of the sub-daily time intervals. The scarcity of the input data (driver GPS traces) only allows us to model the driver flow robustly at a daily level, whereas a higher temporal resolution of the output passenger waiting times is required for a carpooling service. Bayesian hierarchical models offer an intuitive treatment of these differing temporal resolutions within a single workflow.

In Section 2 the first stage of the hierarchical model, for the daily driver traffic flow, is described. In Section 3 the second stage of the hierarchical model, for the passenger waiting times, is described. In Section 4 we validate our model on simulated data and then compare its performance with frequentist and non-hierarchical Bayesian models on empirical data from an operational carpooling service. We end the paper with a discussion and some future perspectives.
Figure 1. Flowchart of the Bayesian hierarchical model for driver flow and passenger waiting time prediction. The input data (driver GPS traces) are in grey, the hierarchical models in green, the model parameters in orange, and the model outputs in purple.
2. Bayesian multi-level moving averages of the daily driver flow
As the driver flow is a fundamental quantity in transportation research, its estimation and prediction is the subject of a vast field of active research, so we cite only those references with a direct connection to the analysis presented in this paper. Historically the simplest models are the moving window averages, see for instance [13]. More advanced methods drawing from time series analysis, within a frequentist [3] or a Bayesian framework [6], have been posited. Our proposed approach of a Bayesian multi-level moving average combines the approaches of [13] and [6]: the robustness and simplicity of moving averages, with the targeted adjustments of multi-level coefficients.

The empirical data in this paper are extracted from the Lane stochastic carpooling service (lanemove.com) operated by Ecov, in conjunction with Instant System (instant-system.com), since May 2018 in the south-eastern peri-urban regions around Lyon, France. See [11] for more details on its set-up. We focus on the driver GPS traces for the 382 days from 2018-05-15 (service launch) to 2019-05-31 (beginning of the following year's summer holiday season in France). The daily driver flows in the Lane network are presented in Figure 2, where we enumerate each trajectory, rather than each unique driver, so a single driver can make several trajectories within this time period. The colour coding is induced by the day type, defined as

$$\mathrm{DT}(i) = \begin{cases} \mathrm{ORD} & \text{if day } i \text{ is an ordinary work day} \\ \mathrm{SCH} & \text{if day } i \text{ is a school holiday} \\ \mathrm{PWE} & \text{if day } i \text{ is a public holiday or a weekend} \end{cases} \qquad (1)$$

where i is the index of the day from the service launch, i.e. i = 1 for 2018-05-15 etc. In Figure 2, the work days are in orange, the school holidays in green, and the public holidays/weekends in blue. The classic temporal cycles of driver flow data in the left panel are present, which indicates that a moving average is a relevant approach for prediction.
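As a concrete illustration, the mapping DT(i) in Equation (1) reduces to a simple calendar lookup. In this sketch the holiday sets are hypothetical placeholders of our own (the paper uses the official French public and school holiday calendars for the Lyon region), and the check order matches Equation (1): weekends and public holidays map to PWE even during school holidays.

```python
from datetime import date, timedelta

# Hypothetical calendars for illustration only; the paper relies on the
# official French public and school holiday calendars for Lyon.
PUBLIC_HOLIDAYS = {date(2018, 7, 14), date(2018, 8, 15)}
SCHOOL_HOLIDAYS = {date(2018, 7, 9) + timedelta(days=d) for d in range(56)}

LAUNCH = date(2018, 5, 15)  # service launch, day i = 1

def day_type(i):
    """Return the day type DT(i) for the i-th day after the service launch."""
    d = LAUNCH + timedelta(days=i - 1)
    if d in PUBLIC_HOLIDAYS or d.weekday() >= 5:  # Saturday or Sunday
        return "PWE"
    if d in SCHOOL_HOLIDAYS:
        return "SCH"
    return "ORD"
```

For instance, day i = 1 (2018-05-15, a Tuesday) is ORD, while 2018-07-14 falls in PWE.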
According to [1, 9] these day types are a key determinant of home-work daily commutes, which is verified by the box plots of the daily driver flow by day type in the right panel. The daily driver flow for a work day approaches 200 trajectories, which is about four times larger than the driver flow on school holidays, and more than 10 times larger than on public holidays/weekends. From visual inspection of the daily driver flow in Figure 2, a standard moving average which ignores these day types would be unable to account for the abrupt differences in driver flow when consecutive days are of different day types.

Figure 2. Daily driver flow on the Lane carpooling network, from 2018-05-15 to 2019-05-31. Left. Daily time series. Right. Aggregate driver flow by day type. The ordinary work days (ORD) are in orange, the school holidays (SCH) in green and the public holidays/weekend days (PWE) in blue.

The recurrence relation of the daily driver flow $y_i$, on day $i \geq K \geq 1$, satisfies

$$y_i = \alpha_{\mathrm{DT}(i)} \sum_{k=1}^{K} \eta_{\mathrm{DT}(i-k)}\, y_{i-k} + \varepsilon_i = \alpha_{\mathrm{DT}(i)} \sum_{k=1}^{K} \big[\mathbf{1}\{\mathrm{DT}(i-k)=\mathrm{ORD}\} + \eta_{\mathrm{SCH}}\,\mathbf{1}\{\mathrm{DT}(i-k)=\mathrm{SCH}\} + \eta_{\mathrm{PWE}}\,\mathbf{1}\{\mathrm{DT}(i-k)=\mathrm{PWE}\}\big]\, y_{i-k} + \varepsilon_i \qquad (2)$$

where α_{DT(·)} is the coefficient for the current day i, η_{DT(·)} are the coefficients for the past K driver flows, and ε_i, i = 1, 2, ... is a sequence of independent normal random variables N(0, σ_ε²). To ensure identifiability of η_{DT(·)}, without loss of generality, we set η_ORD = 1 for all days. To reduce the mathematical complexity, we assume that the ratios of the mean daily driver flows of the three day types to each other (ȳ_{i,ORD} : ȳ_{i,SCH} : ȳ_{i,PWE}) remain constant for all days i in the entire time period. The model in Equation (2) has a moving average structure of order K, but with two additional multi-level coefficients that make the average adaptive to the day types of the current day i and the previous K days. For example, if day i is a school holiday, then the right hand side of Equation (2) is α_SCH Σ_{k=1}^{K} [1{DT(i−k)=ORD} + η_SCH 1{DT(i−k)=SCH} + η_PWE 1{DT(i−k)=PWE}] y_{i−k}. In the summand, the day type indicator functions allow us to sum over the K previous days, even if they are of different types. If a previous day is a work day, then its contribution to the current driver flow is α_SCH y_{i−k}; if a previous day is a school holiday then it is α_SCH η_SCH y_{i−k}; if a previous day is a public holiday/weekend then it is α_SCH η_PWE y_{i−k}. The first multi-level coefficient α_SCH models the current driver flow, conditionally on its day type. The second set of multi-level coefficients η_SCH, η_PWE re-scale the previous driver flows, assuming that the ratio of the flows of different day types is constant for all days.

Our model in Equation (2) possesses a similar structure to an autoregressive model, though it does not strictly satisfy the definition of one. It cannot be defined with a back-shift operator due to the action of the multi-level coefficients α_{DT(·)} and η_{DT(·)}, and the process {y_i, i = 1, 2, ...} is non-stationary due to the drift in the driver participation rate after the launch of the carpooling service.

The multi-level model in Equation (2) is equally valid for frequentist or Bayesian approaches to parameter estimation. We adopt a Bayesian approach, in line with [5]. Let θ = (α_ORD, α_SCH, α_PWE, η_ORD, η_SCH, η_PWE, σ_ε), though recall that we fix η_ORD = 1 identically. Suppose that we have N days of observed daily driver flows y_i, i = 1, ..., N, where N > K, the order of the moving average. Since the error variables are independent Gaussian, the conditional likelihood of y = (y_{K+1}, ..., y_N) is

$$L(\boldsymbol{y} \mid \boldsymbol{\theta}) = \frac{1}{(2\pi\sigma_\varepsilon^2)^{(N-K)/2}} \exp\Big[ -\frac{1}{2\sigma_\varepsilon^2} \sum_{i=K+1}^{N} \big(y_i - g_i(\boldsymbol{\theta})\big)^2 \Big]$$

where $g_i(\boldsymbol{\theta}) = \alpha_{\mathrm{DT}(i)} \sum_{k=1}^{K} \eta_{\mathrm{DT}(i-k)}\, y_{i-k}$. This conditional likelihood is formed by the product of the conditional densities of y_i given y_{i−K}, ..., y_{i−1} for i = K+1, ..., N.

In Bayesian analysis the parameter of interest θ is a random variable, and its prior distribution π represents our belief in its uncertainty. The posterior density π(θ | y) represents an update of the prior distribution taking into account the observed data: π(θ | y) ∝ L(y | θ) π(θ). In our case, we do not have access to existing knowledge that would provide an informative prior, and thus we place a non-informative prior on θ, i.e. π(θ) ∝ σ_ε^{−1} [2, Chapter 1]. This leads to the posterior distribution

$$\pi(\boldsymbol{\theta} \mid \boldsymbol{y}) \propto L(\boldsymbol{y} \mid \boldsymbol{\theta})\, \pi(\boldsymbol{\theta}) \propto \sigma_\varepsilon^{-(N-K+1)} \exp\Big[ -\frac{1}{2\sigma_\varepsilon^2} \sum_{i=K+1}^{N} \big(y_i - g_i(\boldsymbol{\theta})\big)^2 \Big]. \qquad (3)$$

For the inference on θ, Monte Carlo approximations are required since the posterior distribution (and its moments, quantiles, etc.) cannot be calculated explicitly. The most widely used family of methods is Markov chain Monte Carlo (MCMC), which aims to generate a Markov chain {θ^(1), θ^(2), ...} whose equilibrium distribution converges to the posterior distribution π(θ | y).

The next stage is to predict a driver flow ỹ in the future from the observed past data y. Bayesian prediction is based on the posterior predictive distribution, that is the distribution of ỹ conditional on the observed past data y.
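To make the likelihood concrete, here is a minimal sketch (our own helper functions, not the authors' code) of the one-step mean g_i(θ) and the log-posterior of Equation (3), with plain dictionaries for the multi-level coefficients; η_ORD = 1 is imposed by simply storing it in the eta dictionary.

```python
import numpy as np

def predict_flow(y_past, dt_past, dt_today, alpha, eta):
    """One-step mean g_i(theta) = alpha_{DT(i)} * sum_k eta_{DT(i-k)} y_{i-k}.
    y_past and dt_past hold the K previous flows and their day types."""
    return alpha[dt_today] * sum(eta[dt] * yk for dt, yk in zip(dt_past, y_past))

def log_posterior(y, dts, alpha, eta, sigma, K):
    """Log of the posterior in Equation (3), up to an additive constant,
    with the non-informative prior pi(theta) proportional to 1/sigma_eps."""
    N = len(y)
    ss = sum(
        (y[i] - predict_flow(y[i - K:i], dts[i - K:i], dts[i], alpha, eta)) ** 2
        for i in range(K, N)  # condition on the first K observed flows
    )
    return -(N - K + 1) * np.log(sigma) - ss / (2 * sigma**2)
```

In practice this log-posterior would be handed to an MCMC sampler rather than evaluated by hand; the sketch only fixes the bookkeeping of the indices and day types.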
Its density p(ỹ | y) is given by

$$p(\tilde{y} \mid \boldsymbol{y}) = \int_{\Theta} p(\tilde{y} \mid \boldsymbol{\theta}, \boldsymbol{y})\, \pi(\boldsymbol{\theta} \mid \boldsymbol{y})\, d\boldsymbol{\theta}. \qquad (4)$$

Since p(ỹ | y) is a compound probability distribution, we can easily simulate samples from this predictive distribution.

We begin by defining the daily driver flow recurrence with no day types. Equation (2) without day types simplifies to

$$y_i = \alpha \sum_{k=1}^{K} y_{i-k} + \varepsilon_i.$$

Since the multi-level coefficients for the day types are no longer present, this is indeed an autoregressive model. Algorithm 1 simulates a driver flow for a single day with no day types. The inputs are the day i, the coefficient α, the autoregression order K, and the error variance σ_ε². The output is a single driver flow for day i. The repeat loop ensures that the simulated driver flow is strictly positive. To simulate a sequence of N driver flows, we initialise the values generated by Algorithm 1 for i = 1, ..., K days, and then iterate Algorithm 1 sequentially for i = K+1, ..., N.

With Algorithm 1 defined, it is straightforward to define one with day types (i.e. Equation (2)) in Algorithm 2. This has similar inputs, namely the day i, the day type coefficients θ, the autoregression order K, and the error variance σ_ε², except that the scalar α is replaced with the vector coefficients θ. The output is the daily driver flow for day i, accounting for the day types of the days preceding day i.

Algorithm 1: Daily driver flow without day types
  procedure TrafficFlow(i, α, K, σ_ε)
    if i ≤ K then
      initialise y ← N(30, σ_ε²)
    else
      repeat
        y ← N(α Σ_{k=1}^{K} TrafficFlow(i − k, α, K, σ_ε), σ_ε²)
      until y > 0
    end
    return: y, driver flow for day i

Algorithm 2: Daily driver flow with day types
  procedure TrafficFlowDT(i, θ, K, σ_ε)
    if DT(i) == ORD then
      y ← TrafficFlow(i, α_ORD, K, σ_ε)
    else if DT(i) == SCH then
      y ← TrafficFlow(i, α_SCH η_SCH, K, σ_ε)
    else if DT(i) == PWE then
      y ← TrafficFlow(i, α_PWE η_PWE, K, σ_ε)
    end
    return: y, driver flow for day i

For the choice of an MCMC sampler, we use the NUTS (No-U-Turn) sampler [8]. The NUTS sampler is used by default in the pyStan package (https://pystan.readthedocs.io), a Python interface to Stan (https://mc-stan.org), which is a state-of-the-art platform for Bayesian computations, amongst other functionalities. To carry out the complicated integration and then a random draw from the posterior predictive distribution of daily driver flows p(ỹ | y) in Equation (4), we are only required to input into pyStan the prior π(θ), the likelihood L(y | θ) and the recurrence relation which generates the vector of simulated driver flows y (i.e. Algorithm 2). The latter automatically calculates, for the j-th iteration, j = 1, ..., J, the vector of N replicates drawn from the posterior predictive distribution:

$$\tilde{\boldsymbol{y}}^{(j)} = \big(\tilde{y}^{(j,1)}, \ldots, \tilde{y}^{(j,N)}\big) \sim \big(p(\tilde{y}^{(j,1)} \mid \boldsymbol{y}), \ldots, p(\tilde{y}^{(j,N)} \mid \boldsymbol{y})\big). \qquad (5)$$

The final output is the sequence of posterior prediction vectors Ỹ = {ỹ^(1), ..., ỹ^(J)}.
3. Bayesian Gamma regression of the passenger waiting times
From the perspective of a passenger in a carpooling service, a pertinent measure of the service quality is the waiting time for a driver to arrive after the carpooling request is made. In a stochastic matching carpooling service, the daily driver flow is the predominant factor in determining this waiting time, unlike for deterministic services where it plays a minor role. So the analysis of driver flow from the previous section plays an important role in passenger waiting time prediction, as illustrated in the flowchart in Figure 1. Established methods for waiting time prediction tend to be frequentist approaches based on Poisson driver arrivals, see [11, 12], which have encountered varying degrees of success. The aim of this section is to introduce more accurate Bayesian regression models which rely less on the Poissonian assumptions on the driver flows.

For simplicity, we assume that a passenger can only make one request at a time, for themselves only, at a carpooling meeting point, and that the drivers can embark only one passenger in their vehicle, in the order that the passenger requests are made.

For day i, let y_i be the daily traffic flow, and w_{i,1}, ..., w_{i,n_i} be the waiting times for the n_i passengers who make a carpooling request at times t_{i,1} < ··· < t_{i,n_i} respectively. Let t′_{i,j} be the driver arrival time for the passenger request at time t_{i,j}, i = 1, ..., N and j = 1, ..., n_i. The perceived waiting time for the passenger request at time t_{i,j} is

$$w^*_{i,j} = t'_{i,j} - t_{i,j}$$

and the pseudo waiting time is

$$w_{i,j} = t'_{i,j} - \max(t_{i,j}, t'_{i,j-1})$$

with the convention t′_{i,0} = t_{i,1} for the first passenger on day i.

Figure 3 illustrates the difference between the perceived and pseudo waiting times for the case of two passengers A, B who are both not the first passenger of the day. Passenger A arrives first and is the j-th, with j > 1, passenger of day i, and makes a carpooling request at time t_{i,j}. Passenger B arrives immediately afterwards and is the (j+1)-th passenger, with request time t_{i,j+1}. Suppose that there are at least two drivers en route to embark these passengers, who have not received any passenger requests before passenger A's request. The first driver arrives at t′_{i,j} > t_{i,j+1} (i.e. after passenger B's request time) and the second driver at t′_{i,j+1}. The perceived waiting time for passenger A is denoted w*_{i,j} = t′_{i,j} − t_{i,j} (the blue brace in Figure 3) and for passenger B it is w*_{i,j+1} = t′_{i,j+1} − t_{i,j+1} (the green brace). The pseudo waiting time for passenger A is w_{i,j} = w*_{i,j} since they are at the front of the queue, and for passenger B it is w_{i,j+1} = t′_{i,j+1} − t′_{i,j}, the grey brace. The pseudo waiting time for passenger B is the difference between their departure and the departure of the previous passenger A, and this is shorter than the perceived waiting time w*_{i,j+1}.

Figure 3. Perceived and pseudo waiting times for the case of two passengers at a carpooling meeting point. Passenger A is at the head of the queue so their perceived waiting time (blue brace) coincides with their pseudo waiting time. For passenger B, their pseudo waiting time (grey brace) is the difference between their departure and the departure of the previous passenger A, which is shorter than their perceived waiting time (green brace).
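The two definitions can be computed directly from the ordered request and driver-arrival times of a single day; below is a minimal sketch with helper functions of our own (times in minutes).

```python
def perceived_waiting_times(requests, arrivals):
    """w*_{i,j} = t'_{i,j} - t_{i,j}, for one day's ordered requests."""
    return [t_dep - t for t, t_dep in zip(requests, arrivals)]

def pseudo_waiting_times(requests, arrivals):
    """w_{i,j} = t'_{i,j} - max(t_{i,j}, t'_{i,j-1}), with the convention
    t'_{i,0} = t_{i,1} for the first passenger of the day."""
    w, prev_departure = [], requests[0]  # convention for the first passenger
    for t, t_dep in zip(requests, arrivals):
        w.append(t_dep - max(t, prev_departure))
        prev_departure = t_dep
    return w
```

For the two-passenger situation of Figure 3, with requests at minutes 0 and 1 and driver arrivals at minutes 5 and 8, the perceived times are [5, 7] while the pseudo times are [5, 3]: passenger B's pseudo waiting time is counted from passenger A's departure.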
We focus on the pseudo waiting times rather than the perceived waiting times in our model. From Figure 3, we observe that the perceived waiting times w*_{i,j} and w*_{i,j+1} for passengers A and B overlap, whereas the pseudo waiting times w_{i,j} = w*_{i,j} and w_{i,j+1}, by construction, do not overlap. The overlapping nature of the interval processes that determine the perceived waiting times renders the problem of their prediction non-identifiable, and that is why we have introduced the pseudo waiting times. Moreover, it is possible to predict the perceived waiting times from the pseudo waiting times, given known passenger behaviours, e.g. there is already a passenger who has been waiting for a car for t minutes before the arrival of another passenger. Thus if 'waiting time' is employed without any qualifier, it is assumed to be the pseudo waiting time.

Whilst the acquisition protocols for the driver GPS traces have been functioning well since the launch of the Lane carpooling service on 2018-05-15, this was not the case for the passenger waiting times, due to persistent technical operational difficulties for more than a year after the service launch. This leads to a highly challenging situation in which to deliver robust passenger waiting time predictions. We focus on the observed passenger pseudo waiting times covering the period from 2019-07-25 to 2020-02-17. In Figure 4 are the 1500 observed passenger pseudo waiting times in the Lane carpooling service, from 2019-10-22 to 2020-01-15 (we plot a sub-sample of the total period for better visualisation). This range of dates is different from those for the driver GPS traces since, due to operational technical difficulties from the service launch on 2018-05-15 until 2019-10-21, the passenger waiting times were not reliably recorded, so they are excluded from the analysis.
The operation of the Lane service is guaranteed only for work days (including some school holidays). Though this does not prevent passengers and drivers from using the service on other days, there are nonetheless far fewer carpooling requests for school holiday weekdays and there are none for the public holidays/weekends.

Figure 4. Observed pseudo waiting times (in minutes) in the Lane carpooling network from 2019-10-22 to 2020-01-15. The ordinary work days (ORD) are in orange, the school holidays (SCH) in green.
In the previous section, we implemented estimation of the driver flow at the daily level. For the waiting times, we wish to formulate predictions at a sub-daily resolution. Let the 24 hour period of a day be divided into S equal intervals I_1 < ··· < I_S. The fraction of the daily driver flow in each interval I_s, s = 1, ..., S is y_i β_s, where β_s ≥ 0 and Σ_{s=1}^{S} β_s = 1. Conditional on the traffic flow y_i and the passenger request times t_{i,j} ∈ I_s, we suppose that the pseudo waiting times w_{i,j} are independent Gamma random variables with parameters ν and β_s y_i:

$$w_{i,j} \mid (y_i, \boldsymbol{\beta}, t_{i,j} \in I_s) \sim \Gamma(\nu, \beta_s y_i) \quad \text{for } i = 1, \ldots, N \text{ and } j = 1, \ldots, n_i. \qquad (6)$$

This model specification ensures that the conditional mean pseudo waiting time is

$$\mathbb{E}[w_{i,j} \mid (y_i, \boldsymbol{\beta}, t_{i,j} \in I_s)] = \frac{\nu}{\beta_s y_i},$$

which is consistent with our intuition of the inverse relationship between the driver flow and the waiting time. Since β is constant for all i, the model assumes that the relative proportions of the traffic flow in the intervals I_1, ..., I_S remain unchanged for all driver flow values.

In Figure 5 are the mean observed daily traffic flows for each weekday from the Lane carpooling service, where the day is divided into 15 minute intervals (S = 96). Since the service operating hours are 06:00 – 09:00 and 16:00 – 19:00, there are few drivers outside them. Each dot in the figure is the mean number of drivers in each 15 minute interval for each weekday from 2018-05-15 to 2019-05-31. Each weekday has a similar shape, so this gives some empirical justification for supposing a constant β for all days.

Figure 5. Mean driver flows per 15 minute interval for each weekday for the Lane carpooling service, from 2018-05-15 to 2019-05-31. Monday is in blue, Tuesday in orange, Wednesday in green, Thursday in pink, Friday in violet.
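The Γ(ν, β_s y_i) model in Equation (6) uses a shape-rate parameterisation, whereas numpy's gamma sampler takes a scale, so the rate must be inverted. A quick numerical check of the conditional mean ν/(β_s y_i), with illustrative parameter values of our own:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_waiting_times(nu, beta_s, y_i, size):
    """Draw pseudo waiting times from Gamma(shape=nu, rate=beta_s * y_i).
    numpy parameterises by scale, so scale = 1 / rate."""
    rate = beta_s * y_i
    return rng.gamma(shape=nu, scale=1.0 / rate, size=size)

# Illustrative values: nu = 7 and beta_s * y_i = 2.25 give a mean waiting
# time of about 3.1 minutes; doubling the driver flow halves the mean.
w = sample_waiting_times(nu=7.0, beta_s=0.015, y_i=150.0, size=100_000)
```

The empirical mean of `w` agrees with ν/(β_s y_i) ≈ 3.11 minutes, illustrating the inverse flow-waiting time relationship stated above.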
A Dirichlet distribution is a natural choice as a prior distribution on the coefficients β: β ∼ Dir(ααα), where ααα = (α_1, ..., α_S) are the concentration parameters, since it imposes the constraint Σ_{s=1}^{S} β_s = 1 on the coefficients. The corresponding Dirichlet density is

$$p(\boldsymbol{\beta}) = \frac{1}{\mathrm{B}(\boldsymbol{\alpha})} \prod_{s=1}^{S} \beta_s^{\alpha_s - 1}$$

where $\mathrm{B}(\boldsymbol{\alpha}) = \prod_{s=1}^{S} \Gamma(\alpha_s) \big/ \Gamma\big(\sum_{s=1}^{S} \alpha_s\big)$ and $\Gamma(x) = \int_0^{\infty} u^{x-1} e^{-u}\, du$. Nonetheless, for the situations where we cannot assure that the sum of the β coefficients is always 1, a non-informative prior (i.e. the Lebesgue measure on R^S_+) is preferred. The GPS driver traces collected by Ecov from the mobile application represent an incomplete subset of the complete driver population of interest, as they exclude (a) the drivers who are registered in the Lane carpooling service but do not share their geolocation with Ecov, and (b) the drivers who are currently not registered but are potential participants in the carpooling service. So in this case of an incomplete driver population, the β coefficients do not necessarily sum to 1.

The β vector allows us to rebuild the temporal distribution of the traffic flow within a day from an aggregated daily driver flow. In the cases when the driver flow can be observed at a sub-daily level, we still prefer to apply our multi-level moving average model in Equation (2) to predict a daily driver flow as (i) it improves the robustness and (ii) it is straightforward to change the temporal resolution I_1, ..., I_S of the waiting time predictions without having to re-generate the driver flows.

Let t_i = (t_{i,1}, ..., t_{i,n_i}) be the vector of the n_i observed passenger carpooling request times for day i ∈ {1, ..., N}, t = (t_1, ..., t_N) be all observed passenger carpooling request times, and likewise w_i for the passenger pseudo waiting times of day i, with w = (w_1, ..., w_N) for all days. Also, let the vector y = (y_1, ..., y_N) be the observed driver flows for all days.
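A Dirichlet draw satisfies the simplex constraint by construction, which is easy to verify numerically; the symmetric concentration ααα = (1, ..., 1) below is used purely for illustration, not as the prior chosen in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

S = 8                          # number of sub-daily intervals (illustrative)
alpha = np.ones(S)             # illustrative symmetric concentration
beta = rng.dirichlet(alpha)    # one draw of the S regression coefficients

# Every draw lies on the simplex: beta_s >= 0 and sum_s beta_s = 1.
```

This is exactly the constraint that the flat prior on R^S_+ relaxes when the observed traces cover only an incomplete driver population.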
It is reasonable to assume that the waiting times are mutually independent conditionally on (β, y, t). The conditional likelihood of the passenger waiting times is thus given by the joint density of w given (β, y, t):

$$L(\boldsymbol{w} \mid \boldsymbol{\beta}, \boldsymbol{y}, \boldsymbol{t}) = \prod_{i=1}^{N} p(\boldsymbol{w}_i \mid \boldsymbol{y}, \boldsymbol{t}, \boldsymbol{\beta}) = \prod_{i=1}^{N} p(\boldsymbol{w}_i \mid y_i, \boldsymbol{t}_i, \boldsymbol{\beta})$$

since the conditional density of w_i given the driver flow y_i, the passenger carpooling request times t_i and the coefficient β is

$$p(\boldsymbol{w}_i \mid \boldsymbol{\beta}, y_i, \boldsymbol{t}_i) = \prod_{s=1}^{S} \prod_{\{j : t_{i,j} \in I_s\}} \frac{(\beta_s y_i)^{\nu}}{\Gamma(\nu)}\, w_{i,j}^{\nu-1} \exp(-\beta_s y_i w_{i,j}).$$

Then we obtain the posterior density of β, using a non-informative prior on β, as

$$\pi(\boldsymbol{\beta} \mid \boldsymbol{w}) \propto \prod_{i=1}^{N} \prod_{s=1}^{S} \prod_{\{j : t_{i,j} \in I_s\}} \frac{(\beta_s y_i)^{\nu}}{\Gamma(\nu)}\, w_{i,j}^{\nu-1} \exp(-\beta_s y_i w_{i,j})\, \mathbf{1}\{\boldsymbol{\beta} \in \mathbb{R}_+^S\}.$$

Let w̃_s be the pseudo waiting time for a future, unobserved day for a passenger who makes a carpooling request in the time interval I_s. If we observe a new daily driver flow ỹ, then the posterior predictive distribution of the waiting time w̃_s is

$$p(\tilde{w}_s \mid \tilde{y}, \boldsymbol{w}) = \int p(\tilde{w}_s \mid \tilde{y}, \beta_s)\, \pi(\beta_s \mid \boldsymbol{w})\, d\beta_s. \qquad (7)$$

If ỹ is not observed, then the predictive distribution becomes

$$p(\tilde{w}_s \mid \boldsymbol{w}) = \int_0^{\infty} \Big[ \int p(\tilde{w}_s \mid \tilde{y}, \beta_s)\, \pi(\beta_s \mid \boldsymbol{w})\, d\beta_s \Big]\, p(\tilde{y} \mid \boldsymbol{y})\, d\tilde{y} \qquad (8)$$

where p(ỹ | y) is defined in Equation (4). Equation (7) applies when we wish to make a prediction for the current day where we have observed a driver flow, and Equation (8) for a future day where we have not yet observed the driver flow. Finally we wish to predict the waiting times for all the time intervals I_1, ..., I_S, so we collate them into an S-vector (p(w̃_1 | ỹ, w), ..., p(w̃_S | ỹ, w)), or analogously with p(w̃_s | w), s = 1, ..., S.
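Under the flat prior on R^S_+, the log-posterior of β is the Gamma log-likelihood accumulated over days, intervals and requests. The sketch below (our own data layout, standard library only) makes the triple product explicit; a sampler such as NUTS would then explore this function.

```python
import math

def log_posterior_beta(beta, nu, y, waits_by_interval):
    """Log of pi(beta | w) up to an additive constant, flat prior on R^S_+.
    waits_by_interval[i][s] lists the pseudo waiting times w_{i,j} of the
    requests of day i whose request times fall in interval I_s; y[i] is
    the driver flow of day i."""
    if any(b <= 0 for b in beta):
        return float("-inf")  # outside the support of the prior
    lp = 0.0
    for i, day in enumerate(waits_by_interval):
        for s, waits in enumerate(day):
            rate = beta[s] * y[i]  # Gamma rate beta_s * y_i
            for w in waits:
                lp += (nu * math.log(rate) - math.lgamma(nu)
                       + (nu - 1.0) * math.log(w) - rate * w)
    return lp
```

For example, a single day, a single interval with β_1 = 1, ν = 1, y_1 = 1 and one waiting time w = 2 gives a log-density of −2, matching the exponential special case of the Gamma density.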
Algorithm 3 simulates the passenger pseudo waiting times in Equation (6) for a sequence of days. The inputs are the number of days N, the day type coefficients θ, the autoregression order K, the error variance σ_ε², the Gamma shape parameter ν, the S regression parameters β, and the number of replicates of waiting times J. The output is J replicates of the pseudo waiting times for each time interval I_s, s = 1, ..., S, for each day i = 1, ..., N. The TrafficFlowDT procedure (Algorithm 2) is called outside of the replicates loop so that each day has one driver flow, and all waiting times are simulated from this same daily driver flow. An iteration of the nested loop in Algorithm 3 results in a single N × S matrix of pseudo waiting times drawn from the appropriate Gamma distributions:

$$\boldsymbol{W}^{(j)} \sim \begin{pmatrix} \Gamma(\nu, \beta_1 y_1) & \cdots & \Gamma(\nu, \beta_S y_1) \\ \vdots & & \vdots \\ \Gamma(\nu, \beta_1 y_N) & \cdots & \Gamma(\nu, \beta_S y_N) \end{pmatrix}.$$

These are then iterated J times and collated into the sequence which is the output of Algorithm 3:

$$\mathcal{W} = \{\boldsymbol{W}^{(1)}, \ldots, \boldsymbol{W}^{(J)}\} = \left\{ \begin{pmatrix} w^{(1)}_{1,1} & \cdots & w^{(1)}_{1,S} \\ \vdots & & \vdots \\ w^{(1)}_{N,1} & \cdots & w^{(1)}_{N,S} \end{pmatrix}, \ldots, \begin{pmatrix} w^{(J)}_{1,1} & \cdots & w^{(J)}_{1,S} \\ \vdots & & \vdots \\ w^{(J)}_{N,1} & \cdots & w^{(J)}_{N,S} \end{pmatrix} \right\}.$$

Algorithm 3: Passenger pseudo waiting times
  procedure WaitingTime(N, θ, K, σ_ε, ν, β, J)
    S ← Len(β)
    for i in 1, ..., N do
      Y[i] ← TrafficFlowDT(i, θ, K, σ_ε)
    end
    for j in 1, ..., J do
      for i in 1, ..., N do
        for s in 1, ..., S do
          W^(j)[i, s] ← Γ(ν, β_s Y[i])
        end
      end
    end
    return: W^(1), ..., W^(J), sequence of waiting time matrices

As Equation (7) generates only a single posterior prediction w̃_s for a time interval I_s, we collate these w̃_s for s = 1, ..., S into an S-vector, and in turn collate N of these S-vectors of posterior predictive distributions row-wise into an N × S matrix.
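Once the daily flows are available, Algorithm 3 reduces to nested Gamma draws. In this numpy sketch of our own, the flows y are passed in directly, standing in for the output of TrafficFlowDT, so that all J replicates share the same daily flows as the pseudocode prescribes.

```python
import numpy as np

rng = np.random.default_rng(3)

def waiting_time_replicates(y, nu, beta, J):
    """Algorithm 3 (sketch): J replicate N x S matrices of pseudo waiting
    times, all replicates sharing the same daily flows y."""
    N, S = len(y), len(beta)
    reps = []
    for _ in range(J):
        W = np.empty((N, S))
        for i in range(N):
            for s in range(S):
                rate = beta[s] * y[i]  # Gamma rate beta_s * y_i
                W[i, s] = rng.gamma(shape=nu, scale=1.0 / rate)
        reps.append(W)
    return reps
```

Each element of the returned list is one N × S matrix W^(j), matching the collated output W = {W^(1), ..., W^(J)} above.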
To carry out the complicated integration, and then a random draw, from the posterior predictive distribution $p(\tilde{w}_s \mid \tilde{y}, \boldsymbol{w})$ in Equation (7), we are only required to input into pyStan the posterior predicted value of the driver flow $\tilde{y}$ (Equation (4)), the recurrence relation which generates the vector of simulated driver flows $\boldsymbol{y}$ (Algorithm 2), and the recurrence relation which generates the vector of simulated passenger pseudo waiting times (Algorithm 3). pyStan then automatically simulates from this $N \times S$ matrix distribution:
$$\tilde{\boldsymbol{W}}^{(j)} = \begin{pmatrix} \tilde{w}^{(j)}_{1,1} & \cdots & \tilde{w}^{(j)}_{1,S} \\ \vdots & \ddots & \vdots \\ \tilde{w}^{(j)}_{N,1} & \cdots & \tilde{w}^{(j)}_{N,S} \end{pmatrix} \sim \begin{pmatrix} p(\tilde{w}^{(j)}_{1,1} \mid \tilde{y}, \boldsymbol{w}) & \cdots & p(\tilde{w}^{(j)}_{1,S} \mid \tilde{y}, \boldsymbol{w}) \\ \vdots & \ddots & \vdots \\ p(\tilde{w}^{(j)}_{N,1} \mid \tilde{y}, \boldsymbol{w}) & \cdots & p(\tilde{w}^{(j)}_{N,S} \mid \tilde{y}, \boldsymbol{w}) \end{pmatrix}$$
for the $j$-th iteration, $j = 1, \ldots, J$. The sequence of the matrices of replicated posterior predictions is $\tilde{\mathcal{W}} = \{\tilde{\boldsymbol{W}}^{(1)}, \ldots, \tilde{\boldsymbol{W}}^{(J)}\}$.
4. Model validation
Since the Lane carpooling data set dates from 2018, for the simulations we set the initial day $i = 1$ to be 2018-01-01, and the work days (ORD), school holidays (SCH) and public holidays/weekends (PWE) to be those observed in Lyon, France. For the simulation algorithms, the parameters are: the number of days is $N = 365$, the day type coefficients are $\boldsymbol{\theta} = (0.\ldots)$, the autoregression order is $K = 3$, the error variance is $\sigma_\varepsilon = 5$, the 24-hour period is divided into $S = 8$ equal intervals of 3 hours, the first Gamma shape parameter is $\nu = 7$, the Gamma regression parameters are $\boldsymbol{\beta} = (0.012, 0.01, 0.011, 0.013, 0.018, 0.016, 0.017, 0.019)$, and the number of replicates is $J = 10$, which corresponds to the number of observed waiting times per time interval. These parameter values produce simulated data which are comparable to those observed in the Lane carpooling service.

We generate one simulated data set of $N = 365$ days, each with one daily driver flow $y_i$, $i = 1, \ldots, N$ (Algorithm 2), and $J = 10$ passenger pseudo waiting time $N \times S$ matrices $\mathcal{W} = \{\boldsymbol{W}^{(1)}, \ldots, \boldsymbol{W}^{(J)}\}$ (Algorithm 3), and the corresponding $N \times S$ posterior prediction matrices to form $\tilde{\mathcal{W}} = \{\tilde{\boldsymbol{W}}^{(1)}, \ldots, \tilde{\boldsymbol{W}}^{(J)}\}$. The data from these $N = 365$ days from 2018-01-01 to 2018-12-31 form the reference training data set. With the same parameters, we simulate a further $\tilde{N} = 5$ days (2019-01-01 to 2019-01-05) of data, which form the oracle test data set of $\tilde{N} \times S$ matrices $\mathcal{W}_{\text{test}} = \{\boldsymbol{W}^{(1)}_{\text{test}}, \ldots, \boldsymbol{W}^{(J)}_{\text{test}}\}$. Furthermore, from the training data only (i.e. without taking $\mathcal{W}_{\text{test}}$ into account), for these same extra $\tilde{N}$ days, we generate the corresponding $\tilde{N} \times S$ posterior prediction matrices to form $\tilde{\mathcal{W}}_{\text{test}} = \{\tilde{\boldsymbol{W}}^{(1)}_{\text{test}}, \ldots, \tilde{\boldsymbol{W}}^{(J)}_{\text{test}}\}$.
For brevity we omit the equivalent comparison of the driver flows and focus on the passenger waiting times for these simulated data; we make a more thorough comparison of both driver flows and passenger waiting times for the empirical data in the sequel. Whilst the daily traffic flows can be summarised by a single scalar, for the passenger waiting times we focus on the temporal profiles of the waiting times over the $S = 8$ periods of a day. In Figure 6 are the quantiles of the waiting times for all time intervals $I_1, \ldots, I_S$, for all days $i = 1, \ldots, \tilde{N}$ in the test phase. The grey box plots are of $\mathcal{W}_{\text{test},i,s}$ and the light, medium and dark purple circles superimposed over the box plots are the 50%, 75% and 95% quantiles of $\tilde{\mathcal{W}}_{\text{test},i,s}$. Recall that for operational purposes of the Lane carpooling service, short term prediction for the coming week is sufficient. This is verified by the closeness of the quantiles of the posterior predicted pseudo waiting times to their observed values for all $\tilde{N} = 5$ prediction days.

From a passenger's point of view, whilst the magnitude of the waiting time is important as a perception of the service quality, it is equally important that the posterior predicted waiting times be as close as possible to the observed ones, whatever their magnitude. For example, suppose that a driver arrives 12 minutes after a passenger makes a carpooling request. In this case, a prediction of 15 minutes is better than one of 5 minutes, since the former is closer to, but longer than, the observed waiting time. Therefore we propose the following metric to measure these discrepancies for a given threshold $\delta$:
$$\mathrm{PE}(p(\mathcal{W}_{\text{test}}), p(\tilde{\mathcal{W}}_{\text{test}}); \delta) = \frac{1}{J \tilde{N} S} \sum_{i=1}^{\tilde{N}} \sum_{s=1}^{S} \sum_{j=1}^{J} \mathbb{1}\{|\bar{\tilde{w}}_{\text{test},i,s} - w^{(j)}_{\text{test},i,s}| < \delta\} \tag{9}$$
where $\bar{\tilde{w}}_{\text{test},i,s} = \frac{1}{J} \sum_{j=1}^{J} \tilde{w}^{(j)}_{\text{test},i,s}$ is the mean of the posterior predicted waiting time distribution for day $i$, $i = 1, \ldots, \tilde{N}$, and time interval $I_s$, $s = 1, \ldots, S$.
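The PE metric in Equation (9) can be sketched as follows; the input arrays here are random placeholders, not the simulated carpooling data.

```python
import numpy as np

def pe_metric(W_obs, W_pred, delta):
    """Sketch of the PE metric in Equation (9): the proportion of waiting
    times lying within delta of the posterior predictive mean. Both inputs
    are arrays of shape (J, N, S)."""
    w_bar = W_pred.mean(axis=0)                         # mean over the J replicates
    return (np.abs(w_bar[None, :, :] - W_obs) < delta).mean()

# random placeholder data with J = 10 replicates, N = 5 days, S = 8 intervals
rng = np.random.default_rng(2)
W_obs = rng.gamma(7, 1.0, size=(10, 5, 8))
W_pred = rng.gamma(7, 1.0, size=(10, 5, 8))
print(pe_metric(W_obs, W_pred, delta=8.0))
```

By construction the metric lies in $[0, 1]$ and is non-decreasing in $\delta$, which is why the curves in Figures 7 and 13 level off at 1.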
This metric, as a function of $\delta$, is illustrated in Figure 7, during both the training phase $\mathrm{PE}(p(\mathcal{W}), p(\tilde{\mathcal{W}}); \delta)$ (blue curve) and the test phase $\mathrm{PE}(p(\mathcal{W}_{\text{test}}), p(\tilde{\mathcal{W}}_{\text{test}}); \delta)$ (red curve). The test predictions are more accurate than the training predictions for small values of $\delta < 2$ minutes, the training predictions are more accurate for $\delta$ between 2 and 8 minutes, and after 8 minutes both curves level off at 1. Thus the posterior predictions from our proposed Bayesian hierarchical model have robust prediction performance.

Our objective is to employ the two-stage Bayesian hierarchical model (Algorithms 2 and 3) to predict a passenger pseudo waiting time (distribution) for hourly intervals $I_s$, $s = 1, \ldots, S$, with $S = 24$. For the Lane carpooling service, it is sufficient to provide the upcoming week's predictions at the beginning of the week. The predicted daily driver flows from the first hierarchical model are input into the second hierarchical model to produce predicted passenger pseudo waiting times. The latter are then compared to the observed pseudo waiting times of the passenger carpooling requests from the same period.

We have the GPS traces (approximately 5 000 traces) for the 382 days from 2018-05-15 (service launch) to 2019-05-31 (beginning of the following summer holiday period), which we divide into different training and test data sets of varying sizes depending on the objectives of the analysis. We first apply the preprocessing, as outlined in [11], to convert the driver GPS traces into a

Figure 6. Predictions of pseudo waiting times for 3-hourly intervals, for all $\tilde{N}$ prediction days. The observed waiting times (Observed PWT) are the grey box plots, and the 50%, 75% and 95% quantiles of the posterior predicted waiting times (PP PWT) are the light, medium and dark purple circles.

Figure 7. Evolution of the PE metric of the observed and posterior predicted pseudo waiting times, as a function of the threshold $\delta$. The blue curve is for the training phase, and the red curve for the test phase.

data format suitable for computing the daily driver flows $y_{\text{train},i}$, $i = 1, \ldots, N$, for the training phase and $y_{\text{test},i}$, $i = 1, \ldots, \tilde{N}$, for the test phase. We vary $N$ whilst maintaining $\tilde{N} = 7$ to test various scenarios in different periods of the year. To investigate the prediction accuracy of our proposed models, we divide our complete data set into 6 different pairs of training phases with varying $N$ (starting from 2018-05-15) and test phases, always with $\tilde{N} = 7$. In each case we select a test week with certain characteristics, as outlined in Table 1. The first column gives the dates (inclusive) of the test week, the second column the day types in the test week, the third column the dates of the training weeks, starting from 2018-05-15 and ending on the day before the test week, and the fourth column the number of training days ($N$). Table 1.
Training-test scenarios for daily driver flows. The first column gives the dates of the test week ($\tilde{N} = 7$), the second the day types in the test week, the third the dates of the training weeks, and the fourth the number of training days $N$.

Test week | Test week day types | Training weeks | $N$

In addition to our proposed Bayesian hierarchical multi-level (BHML) predictions, we compute predictions from two other models: a baseline frequentist model (BASE) and the Bayesian Prophet model (PROP). The baseline frequentist model has multiple levels like our BHML, but without the Bayesian moving average structure. To account for public/school holidays, as proposed by [7], if day $i$ is not a school/public holiday then the average is calculated over all previous days with the same day of week as day $i$; and if day $i$ is a public holiday, then the average is over all previous public holidays. That is,
$$y_i = \frac{1}{|T_{d}(i)|} \sum_{k \in T_{d}(i)} y_{i-k} \, \mathbb{1}\{\mathrm{DT}'(i) \neq \mathrm{HOL}\} + \frac{1}{|T_{\mathrm{HOL}}(i)|} \sum_{k \in T_{\mathrm{HOL}}(i)} y_{i-k} \, \mathbb{1}\{\mathrm{DT}'(i) = \mathrm{HOL}\} + \varepsilon_i \tag{10}$$
where we collapse the day types function DT to $\mathrm{DT}'(i) = \mathrm{ORW}$ if $i$ is an ordinary work day or a weekend day, and $\mathrm{DT}'(i) = \mathrm{HOL}$ if $i$ is a school or public holiday; DN is the day-of-week number function, with $\mathrm{DN}(i) = 1$ if $i$ is a Monday, $\mathrm{DN}(i) = 2$ if $i$ is a Tuesday, etc.; $T_d(i) = \{k : k < i, \mathrm{DN}(i-k) = d\}$ is the set of prior days with the same day of week as day $i$; and $T_{\mathrm{HOL}}(i) = \{k : k < i, \mathrm{DT}'(i-k) = \mathrm{HOL}\}$ is the set of public holidays before day $i$.

The Bayesian Prophet model, devised by [4, 14], is an additive model with three components:
$$y_i = g(i) + s(i) + h(i) + \varepsilon_i \tag{11}$$
where $g(i)$ is the trend, $s(i)$ is the seasonality, and $h(i)$ is the holidays effect.
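The baseline average in Equation (10) can be sketched as below; the `'ORW'`/`'HOL'` day-type encoding and the day indexing (day of week repeating every 7 days) are illustrative assumptions.

```python
import numpy as np

def base_predict(y, day_type, i):
    """Sketch of the frequentist baseline in Equation (10): y holds the past
    daily flows, day_type[k] is the collapsed day type DT'(k) in
    {'ORW', 'HOL'}, and days are indexed so the day of week repeats every
    7 days, i.e. DN(i - k) = DN(i) whenever (i - k) % 7 == 0."""
    if day_type[i] == "HOL":
        # average over all previous public/school holidays, T_HOL(i)
        prior = [y[k] for k in range(i) if day_type[k] == "HOL"]
    else:
        # average over all previous days with the same day of week, T_d(i)
        prior = [y[k] for k in range(i) if (i - k) % 7 == 0]
    return float(np.mean(prior)) if prior else float("nan")
```

For instance, predicting a Monday averages all previous Mondays in the training window, while predicting a holiday averages all previous holidays regardless of weekday.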
The linear trend is $g(i) = (k + \boldsymbol{a}(i)^{\top} \boldsymbol{\delta}) i + (m + \boldsymbol{a}(i)^{\top} \boldsymbol{\gamma})$ where $k$ is the growth rate, $m$ is the offset, $\boldsymbol{a}$ is the change point indicator, $\boldsymbol{\delta}$ is the growth rate adjustment, and $\boldsymbol{\gamma}$ is the piece-wise continuity adjustment which ensures that $g$ is continuous. The seasonality component is a Fourier decomposition $s(i) = \sum_{\ell=1}^{L} [\alpha_\ell \cos(2\pi \ell i / P) + \beta_\ell \sin(2\pi \ell i / P)]$ where $(\alpha_\ell, \beta_\ell)$ are the Fourier coefficients, $L$ is the number of Fourier coefficients and $P$ is the period (in days). The holiday effect is $h(i) = \boldsymbol{h}(i)^{\top} \boldsymbol{\kappa}$ where, say, $\boldsymbol{h}(i) = (\mathbb{1}\{\mathrm{DT}(i) = \mathrm{SCH}\}, \mathbb{1}\{\mathrm{DT}(i) = \mathrm{PWE}\})$ is the vector of indicator variables of the type of holiday of day $i$, and $\boldsymbol{\kappa}$ is the weight vector, usually equal to the all-ones vector. [14] provide the details of the construction of the change point function $\boldsymbol{a}(t)$ and the continuity adjustment parameter $\boldsymbol{\gamma}$. These authors set the number of Fourier coefficients to $L = 10$ for yearly cycles and $L = 3$ for weekly cycles. What remains is to estimate the trend growth rate $k$, the offset $m$, the growth rate adjustments $\boldsymbol{\delta}$ and the Fourier coefficients $\boldsymbol{\alpha}, \boldsymbol{\beta}$.

For the training phase of dates 2018-05-15 to 2019-05-19 (test scenario …), we compute the estimated daily driver flows $y_i$ from the Bayesian hierarchical multi-level model BHML, as well as the corresponding predictions/estimations from the frequentist baseline model BASE and the Bayesian Prophet model PROP. In Figure 8 is the evolution of the goodness-of-fit of the three different models for daily driver flow estimation (leaving out the first week, 2018-05-15 to 2018-05-22, which serves as the 'burn-in' period). The goodness-of-fit is measured by the MSE of the estimated and the observed daily driver flows, aggregated per week. Visually, the BHML tends to have the best goodness-of-fit (smallest MSE) for most weeks.
The sums of these weekly MSEs are: BASE 421.9, PROP 816.9, BHML 297.2, which confirms our visual impression that the BHML achieves the best overall estimation accuracy. Figure 8.
Evolution of the goodness-of-fit of the daily driver flow estimations over the training period (2018-05-15 to 2019-05-19, test scenario …).
We can be confident that the Bayesian hierarchical multi-level moving average model has good estimation accuracy/goodness-of-fit, but this good performance does not necessarily translate to prediction. This non-transitivity of estimation and prediction performance is discussed in [10]. So for each scenario described in Table 1, we compute the BHML, BASE and PROP models for the training phase, and then the days of the test phase are input into each of these models to yield the daily driver flow predictions. These predictions are presented in Figure 9: the Bayesian hierarchical multi-level BHML in purple, the frequentist baseline BASE in black, the Bayesian Prophet PROP in green; and the observed daily driver flows in blue. The PROP predictions are mostly too low on week days and too high on weekends for all six test weeks in comparison to the observed driver flows, whilst the BASE and BHML appear to have comparable performance.

In Figure 10 are the MSEs between the observed and predicted daily driver flows: the frequentist baseline BASE in black, the Bayesian Prophet PROP in green, and the Bayesian hierarchical multi-level BHML in purple. PROP is uniformly the worst of these three models for all test weeks. BASE is the best for test scenario …

Figure 9. Predictions of daily driver flows for the six test week scenarios. Observed daily driver flows are in blue. Bayesian hierarchical multi-level BHML are in purple, frequentist baseline BASE in black, and Bayesian Prophet PROP in green.
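The weekly MSE aggregation used for the goodness-of-fit comparisons in Figures 8 and 10 can be sketched as below; the grouping into consecutive 7-day blocks is an assumption about how the weekly aggregation is performed.

```python
import numpy as np

def weekly_mse(y_obs, y_hat):
    """Sketch of the weekly-aggregated goodness-of-fit: the MSE between
    observed and estimated daily driver flows, grouped into consecutive
    7-day weeks (any trailing partial week is dropped)."""
    y_obs, y_hat = np.asarray(y_obs, float), np.asarray(y_hat, float)
    n_weeks = y_obs.size // 7
    sq_err = (y_obs[: 7 * n_weeks] - y_hat[: 7 * n_weeks]) ** 2
    return sq_err.reshape(n_weeks, 7).mean(axis=1)  # one MSE per week
```

Summing the returned vector gives the overall figures reported above (e.g. 297.2 for BHML).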
For the case study on simulated data in Section 4.1, we could generate the oracle simulated temporal profiles to which the posterior predicted profiles could be compared. For the Lane carpooling service, since it is still in an embryonic phase of operation, there are insufficient passenger carpooling requests to robustly compute observed temporal profiles over an entire day, especially for school holidays (SCH) and public holidays/weekends (PWE), as shown in Figure 4. So it is only possible to form predictions for weekdays (ORD). In Figure 11 are the box plots of the weekly number of observed pseudo waiting times for each hourly interval for weekdays from 2019-07-25 to 2020-02-17, though the effective end date is 2020-02-14 since the last two days are weekend days. Although there are in total $S = 24$ hourly intervals, only the 6 which correspond to the operating hours of the Lane service (06:00 – 09:00 and 16:00 – 19:00) contain any observed passenger waiting times.

Figure 10. Prediction MSE of the daily driver flow predictions for the six test week scenarios. Bayesian hierarchical multi-level BHML are in purple, frequentist baseline BASE in black, and Bayesian Prophet PROP in green.
Figure 11.
The box plots of the weekly number of observed pseudo waiting times for each hourly interval for weekdays from 2019-07-25 to 2020-02-17.
There are at most around 50-60 observed waiting times per hourly interval per week, which is not sufficient to infer robustly their distribution within each interval. To remedy this data sparsity, we aggregate a moving window of test data: for time interval $I_s$ on day $i$, we combine its observed pseudo waiting times $w_{\text{test},i,s}$ with those for the same time interval from the previous 5 weeks with the same day of week $\mathrm{DN}(i)$ and same day type $\mathrm{DT}(i)$, i.e. $\{w_{\text{test},i-k,s} : \mathrm{DT}(i-k) = \mathrm{DT}(i), \mathrm{DN}(i-k) = \mathrm{DN}(i), k = 1, \ldots, 35\}$. These days added to the test data are correspondingly removed from the training data. If we aggregate the final 5 weeks into a single test phase, then the training-test scenario is as outlined in Table 2. Thus we make predictions only for the last test week (2020-02-10 – 2020-02-17), so the number of prediction weekdays remains $\tilde{N} = 5$. Table 2.
Training-test scenario for passenger waiting times. The first column gives the dates of the test weeks, the second the number of observed passenger waiting times in the test weeks, the third the dates of the training weeks, and the fourth the number of observed passenger waiting times in the training weeks.
Test weeks | Observed waiting times (test) | Training weeks | Observed waiting times (training)

Figure 12.
Predictions of passenger pseudo waiting times for hourly time intervals. The observed waiting times are the grey box plots, and the 50%, 75% and 95% quantiles of the posterior predicted waiting times are the light, medium and dark purple circles.
Lastly we consider our custom PE metric on the BHML posterior predictions:
$$\mathrm{PE}(p(\mathcal{W}), p(\tilde{\mathcal{W}}); \delta) = \frac{1}{\tilde{N} S} \sum_{i=1}^{\tilde{N}} \sum_{s=1}^{S} \mathbb{1}\{|\bar{\tilde{w}}_{i,s} - w_{i,s}| < \delta\} \tag{12}$$
where $\bar{\tilde{w}}_{i,s} = \frac{1}{J} \sum_{j=1}^{J} \tilde{w}^{(j)}_{i,s}$ is the mean of the posterior predicted waiting time distribution for day $i$ and time interval $I_s$. This metric, as a function of the threshold $\delta$, is illustrated in Figure 13, during both the training phase $\mathrm{PE}(p(\mathcal{W}), p(\tilde{\mathcal{W}}); \delta)$ (blue curve) and the test phase $\mathrm{PE}(p(\mathcal{W}_{\text{test}}), p(\tilde{\mathcal{W}}_{\text{test}}); \delta)$ (red curve). In contrast to the conclusion from Section 4.1, the red curve dominates the blue curve, which implies that the posterior predictions are more accurate during the test phase than in the training phase. This gives us confidence that the BHML posterior predictions are robust and are not based on over-fitting of the training data. Figure 13.
Evolution of the PE metric of the observed and BHML posterior predicted passenger pseudo waiting times, as a function of the threshold $\delta$. The blue curve is for the training phase, and the red curve for the test phase.
5. Conclusion
The main contribution of this paper is the transformation of daily driver flows to passenger waiting times for hourly intervals using a nested two-stage Bayesian hierarchical model for an operational carpooling service. The first stage is a multi-level moving average model of the daily driver flows, whose coefficients $\boldsymbol{\theta}$ have levels depending on whether the current day is a work day, a school holiday or a public holiday/weekend. The second stage is a Gamma regression whose response variables are the hourly passenger waiting times, whose covariates are the daily driver flows from the first stage, and whose regression coefficients $\boldsymbol{\beta}$ have as many components as the number of hourly intervals. The predicted driver flows and passenger waiting times are robust going into the future, since we demonstrated that they are not due to over-fitting of the observed data from an operational carpooling service.

We have focused on the mathematically simpler case of pseudo waiting times. The Bayesian hierarchical framework that we have employed is able to generalise to the more difficult, but more realistic, case of perceived waiting times. Suppose that the pseudo waiting times for two consecutive passengers are $w_1, w_2$, with request times $t_1, t_2$. Then the perceived waiting time of the second passenger is $w^* = w_2 + (w_1 - \zeta \mid w_1 > \zeta)$ with $\zeta = t_2 - t_1$. It would be intractable to deduce a closed form for the distribution of $w^*$. Since we are able to simulate from the conditional posterior predictive distribution of the first pseudo waiting time remainder $w_1 - \zeta \mid (w_1 > \zeta)$ and from the unconditional second pseudo waiting time, it is feasible to simulate the second perceived waiting time, assuming that these two components of $w^*$ are independent. Since our primary data source is the driver GPS traces, we focused on modelling the driver arrival processes and assumed the passenger arrivals to be non-random.
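Under the independence assumption just stated, the second passenger's perceived waiting time can be simulated by rejection; the Gamma parameters and the gap $\zeta$ below are hypothetical stand-ins for posterior predictive draws.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical sketch: w_star = w2 + (w1 - zeta | w1 > zeta), where zeta is
# the gap between the two consecutive request times. The conditional
# first-passenger remainder is simulated by rejection, and paired with an
# independent second pseudo waiting time.
nu, rate, zeta, n = 7, 0.25, 10.0, 100_000

w1 = rng.gamma(nu, 1.0 / rate, size=n)   # first pseudo waiting time
w2 = rng.gamma(nu, 1.0 / rate, size=n)   # second pseudo waiting time

# remainder of the first wait, conditional on the first passenger still waiting
remainder = w1[w1 > zeta] - zeta
# pair each conditional remainder with an independent second pseudo waiting time
w_star = w2[: remainder.size] + remainder
```

In practice `w1` and `w2` would be draws from the posterior predictive distributions of Equations (7) or (8) rather than fixed-parameter Gamma samples.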
In a Bayesian hierarchical framework, it is straightforward to allow the passenger arrivals to also be a random process, and to analyse the resulting pseudo and perceived passenger waiting times.

Finally, we made the assumption that a driver embarks only one passenger at a time, whereas it is of intense operational interest for a carpooling provider to encourage different passengers to share a single carpooling ride, as maximising the occupancy rate of private vehicles is a key objective in the progress towards carbon-neutral societies. This passenger sharing probability can also be analysed within the Bayesian hierarchical framework.

For future work, we plan to construct an informative prior for new networks of lines with small amounts of available data. Another idea is to integrate the size and number of the intervals $I_1, \ldots, I_S$ into the model, and to find their optimal values by Bayesian inference.

Acknowledgements
The authors thank Ecov for providing the data sets of driver GPS traces and passenger waiting times. The authors also thank Constant Bridon, Safa Fennia and Madeleine Zuber from Ecov, and Gérard Biau from Sorbonne University for their feedback.
References

[1] Bao, Y., F. Xiao, Z. Gao, and Z. Gao (2017). Investigation of the traffic congestion during public holiday and the impact of the toll-exemption policy. Transportation Research Part B 104, 58–81.
[2] Congdon, P. (2014). Applied Bayesian Modelling. John Wiley & Sons.
[3] Ding, A., X. Zhao, and L. Jiao (2002). Traffic flow time series prediction based on statistics learning theory. In Proceedings of the IEEE 5th International Conference on Intelligent Transportation Systems, pp. 727–730.
[4] Facebook Core Data Science Group (2019). Forecasting at Scale. Facebook. Python package version 0.5. https://facebook.github.io/prophet. Stan model file https://github.com/facebook/prophet/blob/master/R/inst/stan/prophet.stan.
[5] Gelman, A. and J. Hill (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.
[6] Ghosh, B., B. Basu, and M. O'Mahony (2007). Bayesian time-series model for short-term traffic flow forecasting. Journal of Transportation Engineering 133, 180–189.
[7] Gould, P. G., A. B. Koehler, J. K. Ord, R. D. Snyder, R. J. Hyndman, and F. Vahid-Araghi (2008). Forecasting time series with multiple seasonal patterns. European Journal of Operational Research 191, 207–222.
[8] Hoffman, M. D. and A. Gelman (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research 15, 1593–1623.
[9] Kung, K. S., K. Greco, S. Sobolevsky, and C. Ratti (2014). Exploring universal patterns in human home-work commuting from mobile phone data. PLoS One 9, e96180.
[10] Makridakis, S., R. J. Hyndman, and F. Petropoulos (2020). Forecasting in social settings: the state of the art. International Journal of Forecasting 36, 15–28.
[11] Papoutsis, P., S. Fennia, C. Bridon, and T. Duong (2020). Relaxing door-to-door matching reduces passenger waiting times: a workflow for the analysis of driver GPS traces in a stochastic carpooling service. Submitted.
[12] Ray, J.-B. (2014). Planning a real-time ridesharing network: critical mass and role of transfers. In Transport Research Arena (TRA) 5th Conference: Transport Solutions from Research to Deployment. IFSTTAR.
[13] Stephanedes, Y., P. G. Michalopoulos, and R. A. Plum (1980). Improved estimation of traffic flow for real-time control. Transportation Research Record 795, 28–39.
[14] Taylor, S. J. and B. Letham (2018). Forecasting at scale. The American Statistician 72, 37–45.