Bayesian hierarchical models for the prediction of the driver flow and passenger waiting times in a stochastic carpooling service
Panayotis Papoutsis a,b,c, Bertrand Michel a,b, Anne Philippe b and Tarn Duong c

a Department of Computing and Mathematics, Nantes Central Engineering School, F-44300, France; b Jean Leray Mathematics Laboratory, University of Nantes, F-44300, France; c Ecov, F-44200, France
ARTICLE HISTORY
Compiled July 20, 2020
ABSTRACT
Carpooling is an integral component in smart carbon-neutral cities, in particular to facilitate home-work commuting. We study an innovative carpooling service developed by the start-up Ecov, which specialises in home-work commutes in peri-urban and rural regions. When a passenger makes a carpooling request, a designated driver is not assigned as in a traditional carpooling service; rather the passenger waits for the first driver, from a population of non-professional drivers who are already en route, to arrive. We propose a two-stage Bayesian hierarchical model to overcome the considerable difficulties, due to the sparsely observed driver and passenger data from an embryonic stochastic carpooling service, in delivering high-quality predictions of driver flow and passenger waiting times. The first stage focuses on the driver flows, whose predictions are aggregated at the daily level to compensate for the data sparsity. The second stage converts this single daily driver flow into sub-daily (e.g. hourly) predictions of the passenger waiting times. We demonstrate that our model mostly outperforms frequentist and non-hierarchical Bayesian methods on observed data from an operational carpooling service in Lyon, France, and we also validate our model on simulated data.
KEYWORDS
Hierarchical modelling; Gamma regression; GPS traces; MCMC; Multi-level moving average
1. Introduction
Providing ecologically sustainable transportation that is accessible for all is one of the key challenges in the transition to post-carbon societies. An innovative solution devised by the French start-up Ecov is a carpooling service for rural and urban peripheral regions. These regions are often neglected by the start-up sector, as it tends to focus on technology-savvy populations in dense urban regions. These carpooling services, whilst having been initiated in the private sector, are developed and operated in close collaboration with local government authorities in order to satisfy the mobility requirements of these marginalised areas with sparser population and physical/digital infrastructure. The key innovation brought to the market by Ecov is the provision of carpooling lines, which closely resemble traditional bus lines. These carpooling lines link physical meeting points between which carpooling is assured at suitable regularity. This concentrates the demand and the supply of carpooling, so that a critical mass is reached more quickly and more sustainably. The meeting points are placed strategically in highly frequented areas, taking into account various factors such as aggregated traffic flow, socioeconomic characteristics, pedestrian accessibility, local government regulations, etc. Pick-ups and drop-offs at locations other than these meeting points are not facilitated by the provider, though they are not disallowed, which ensures more flexibility than bus lines. The meeting points resemble bus shelters, except that where a passenger waiting for a bus usually only requires a simple hand gesture to the driver to indicate that they intend to embark, the carpooling passenger must
make an explicit carpooling request on an electronic console. This request is then displayed on an electronic sign on the roadside, which informs all passing drivers of a passenger request to a specified destination. The driver is not allocated in advance: this real-time, stochastic matching between a passenger and a driver is a major distinguishing feature of Ecov carpooling services in contrast to their competitors (such as Uber, Lyft, Kapten etc.). It is this stochastic matching between a passenger and a flow of potential drivers, along with the aggregating effects of the physical meeting points, that enables carpooling to reach economic feasibility in sparsely populated regions.

The stochastic matching is, from a mathematical and technological point of view, more difficult than deterministic passenger-driver matching when it comes to providing a reliable waiting time for a driver arrival. In the latter, a reliable waiting time for a passenger request requires only the tracking of an allocated driver, whereas stochastic matching requires both (a) the tracking of multiple potential drivers and (b) an understanding of the general driver flow. The digital technological infrastructure is key in delivering reliable waiting times to passengers. Ecov provides users with a mobile phone application for their carpooling services: passengers receive updates about the waiting time for a driver arrival, and drivers receive notifications of passengers waiting at the meeting points and, crucially, are able to share their GPS locations in real-time with Ecov. These driver GPS traces, by providing pertinent information, ensure the quality of the carpooling service.

CONTACT Panayotis Papoutsis. Email: [email protected]
This information includes the daily driver flow and the passenger waiting time, which we focus on in this paper.

Due to the complexity of the relationship between the driver GPS traces and the passenger waiting times, and the scarcity of the observed data due to the novelty of the stochastic carpooling service, we propose a hierarchical approach in which we first build predictive models of the potential driver flow from the observed GPS driver traces. At the time when a passenger request is made, we do not have sufficiently detailed knowledge of the instantaneous potential driver flow, so we model this driver flow first as a moving average of previous driver flows. We then model the passenger waiting time as a regression model with covariates based on the driver traffic flow modelled in the first stage.

In the flowchart in Figure 1, our Bayesian multi-level hierarchical model is composed of two nested stages. The input data (driver GPS traces) are preprocessed, as outlined in [11], so that they are suitable as subsequent input into the hierarchical models themselves. The first model is a multi-level moving average model whose coefficients θ have levels depending on the current day type: working day, weekend, public holiday or school holiday. Bayesian multi-level models are crucially able to model the driver flows with these overlapping levels suitably (e.g. the driver flow for a public holiday which is also a school holiday differs from that of public holidays outside of the school holiday period). These levels are known to be highly influential [1]. The output from the first hierarchical model is the daily driver flow, which is the immediate input to the second hierarchical model. The latter is a Gamma regression, whose regression coefficient β has S components, one for each of the time intervals into which a 24-hour period is divided. The role of β is to assign the daily traffic flow to each of these sub-daily time intervals.
The output of this second hierarchical model is the temporal profile of the passenger waiting times for each of the sub-daily time intervals. The scarcity of the input data (driver GPS traces) only allows us to model the driver flow robustly at a daily level, whereas a higher temporal resolution of the output passenger waiting times is required for a carpooling service. Bayesian hierarchical models offer an intuitive treatment of these differing temporal resolutions within a single workflow.

In Section 2 the first stage of the hierarchical model, for the daily driver traffic flow, is described. In Section 3 the second stage of the hierarchical model, for the passenger waiting times, is described. In Section 4 we validate our model on simulated data and then compare its performance with frequentist and non-hierarchical Bayesian models on empirical data from an operational carpooling service. We end the paper with a discussion and some future perspectives.
Figure 1. Flowchart of the Bayesian hierarchical model for driver flow and passenger waiting time prediction. The input data (driver GPS traces) are in grey, the hierarchical models in green, the model parameters in orange, and the model outputs in purple.
2. Bayesian multi-level moving averages of the daily driver flow
As the driver flow is a fundamental quantity in transportation research, its estimation and prediction is the subject of a vast field of active research, so we cite only those references with a direct connection to the analysis presented in this paper. Historically the simplest models are the moving window averages, see for instance [13]. More advanced methods drawing from time series analysis, within a frequentist [3] or a Bayesian framework [6], have been posited. Our proposed approach of a Bayesian multi-level moving average combines the approaches of [13] and [6]: the robustness and simplicity of moving averages, with the targeted adjustments of multi-level coefficients.

The empirical data in this paper are extracted from the Lane stochastic carpooling service (lanemove.com) operated by Ecov, in conjunction with Instant System (instant-system.com), since May 2018 in the south-eastern peri-urban regions around Lyon, France. See [11] for more details on its set-up. We focus on the driver GPS traces for the 382 days from 2018-05-15 (service launch) to 2019-05-31 (beginning of the following year's summer holiday season in France). The daily driver flows in the Lane network are presented in Figure 2, where we enumerate each trajectory, rather than each unique driver, so a single driver can make several trajectories within this time period. The colour coding is induced by the day type, defined as

$$\mathrm{DT}(i) = \begin{cases} \mathrm{ORD} & \text{if day } i \text{ is an ordinary work day} \\ \mathrm{SCH} & \text{if day } i \text{ is a school holiday} \\ \mathrm{PWE} & \text{if day } i \text{ is a public holiday or a weekend} \end{cases} \qquad (1)$$

where i is the index of the day from the service launch, i.e. i = 1 for 2018-05-15 etc. In Figure 2, the work days are in orange, the school holidays in green, and the public holidays/weekends in blue. The classic temporal cycles of driver flow data in the left panel are present, which indicates that a moving average is a relevant approach for prediction.
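As a concrete illustration, the mapping DT(i) in Equation (1) reduces to a simple calendar lookup. In this sketch the holiday sets are hypothetical placeholders of our own (the paper uses the official French public and school holiday calendars for the Lyon region), and the check order matches Equation (1): weekends and public holidays map to PWE even during school holidays.

```python
from datetime import date, timedelta

# Hypothetical calendars for illustration only; the paper relies on the
# official French public and school holiday calendars for Lyon.
PUBLIC_HOLIDAYS = {date(2018, 7, 14), date(2018, 8, 15)}
SCHOOL_HOLIDAYS = {date(2018, 7, 9) + timedelta(days=d) for d in range(56)}

LAUNCH = date(2018, 5, 15)  # service launch, day i = 1

def day_type(i):
    """Return the day type DT(i) for the i-th day after the service launch."""
    d = LAUNCH + timedelta(days=i - 1)
    if d in PUBLIC_HOLIDAYS or d.weekday() >= 5:  # Saturday or Sunday
        return "PWE"
    if d in SCHOOL_HOLIDAYS:
        return "SCH"
    return "ORD"
```

For instance, day i = 1 (2018-05-15, a Tuesday) is ORD, while 2018-07-14 falls in PWE.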
According to [1, 9] these day types are a key determinant of home-work daily commutes, which is verified by the box plots of the daily driver flow by day type in the right panel. The daily driver flow for a work day approaches 200 trajectories, which is about four times larger than the driver flow on school holidays, and more than 10 times larger than on public holidays/weekends. From visual inspection of the daily driver flow in Figure 2, a standard moving average which ignores these day types would be unable to account for the abrupt differences in driver flow when consecutive days are of different day types.

Figure 2. Daily driver flow on the Lane carpooling network, from 2018-05-15 to 2019-05-31. Left. Daily time series. Right. Aggregate driver flow by day type. The ordinary work days (ORD) are in orange, the school holidays (SCH) in green and the public holidays/weekend days (PWE) in blue.

The recurrence relation of the daily driver flow $y_i$, on day $i \geq K \geq 1$, satisfies

$$y_i = \alpha_{\mathrm{DT}(i)} \sum_{k=1}^{K} \eta_{\mathrm{DT}(i-k)}\, y_{i-k} + \varepsilon_i = \alpha_{\mathrm{DT}(i)} \sum_{k=1}^{K} \big[\mathbf{1}\{\mathrm{DT}(i-k)=\mathrm{ORD}\} + \eta_{\mathrm{SCH}}\,\mathbf{1}\{\mathrm{DT}(i-k)=\mathrm{SCH}\} + \eta_{\mathrm{PWE}}\,\mathbf{1}\{\mathrm{DT}(i-k)=\mathrm{PWE}\}\big]\, y_{i-k} + \varepsilon_i \qquad (2)$$

where α_{DT(·)} is the coefficient for the current day i, η_{DT(·)} are the coefficients for the past K driver flows, and ε_i, i = 1, 2, ... is a sequence of independent normal random variables N(0, σ_ε²). To ensure identifiability of η_{DT(·)}, without loss of generality, we set η_ORD = 1 for all days. To reduce the mathematical complexity, we assume that the ratios of the mean daily driver flows of the three day types to each other (ȳ_{i,ORD} : ȳ_{i,SCH} : ȳ_{i,PWE}) remain constant for all days i in the entire time period. The model in Equation (2) has a moving average structure of order K, but with two additional multi-level coefficients that make the average adaptive to the day types of the current day i and the previous K days. For example, if day i is a school holiday, then the right hand side of Equation (2) is α_SCH Σ_{k=1}^{K} [1{DT(i−k)=ORD} + η_SCH 1{DT(i−k)=SCH} + η_PWE 1{DT(i−k)=PWE}] y_{i−k}. In the summand, the day type indicator functions allow us to sum over the K previous days, even if they are of different types. If a previous day is a work day, then its contribution to the current driver flow is α_SCH y_{i−k}; if a previous day is a school holiday then it is α_SCH η_SCH y_{i−k}; if a previous day is a public holiday/weekend then it is α_SCH η_PWE y_{i−k}. The first multi-level coefficient α_SCH models the current driver flow, conditionally on its day type. The second set of multi-level coefficients η_SCH, η_PWE re-scale the previous driver flows, assuming that the ratio of the flows of different day types is constant for all days.

Our model in Equation (2) possesses a similar structure to an autoregressive model, though it does not strictly satisfy the definition of one. It cannot be defined with a back-shift operator due to the action of the multi-level coefficients α_{DT(·)} and η_{DT(·)}, and the process {y_i, i = 1, 2, ...} is non-stationary due to the drift in the driver participation rate after the launch of the carpooling service.

The multi-level model in Equation (2) is equally valid for frequentist or Bayesian approaches to parameter estimation. We adopt a Bayesian approach, in line with [5]. Let θ = (α_ORD, α_SCH, α_PWE, η_ORD, η_SCH, η_PWE, σ_ε), though recall that we fix η_ORD = 1 identically. Suppose that we have N days of observed daily driver flows y_i, i = 1, ..., N, where N > K, the order of the moving average. Since the error variables are independent Gaussian, the conditional likelihood of y = (y_{K+1}, ..., y_N) is

$$L(\boldsymbol{y} \mid \boldsymbol{\theta}) = \frac{1}{(2\pi\sigma_\varepsilon^2)^{(N-K)/2}} \exp\Big[ -\frac{1}{2\sigma_\varepsilon^2} \sum_{i=K+1}^{N} \big(y_i - g_i(\boldsymbol{\theta})\big)^2 \Big]$$

where $g_i(\boldsymbol{\theta}) = \alpha_{\mathrm{DT}(i)} \sum_{k=1}^{K} \eta_{\mathrm{DT}(i-k)}\, y_{i-k}$. This conditional likelihood is formed by the product of the conditional densities of y_i given y_{i−K}, ..., y_{i−1} for i = K+1, ..., N.

In Bayesian analysis the parameter of interest θ is a random variable, and its prior distribution π represents our belief in its uncertainty. The posterior density π(θ | y) represents an update of the prior distribution taking into account the observed data: π(θ | y) ∝ L(y | θ) π(θ). In our case, we do not have access to existing knowledge that would provide an informative prior, and thus we place a non-informative prior on θ, i.e. π(θ) ∝ σ_ε^{−1} [2, Chapter 1]. This leads to the posterior distribution

$$\pi(\boldsymbol{\theta} \mid \boldsymbol{y}) \propto L(\boldsymbol{y} \mid \boldsymbol{\theta})\, \pi(\boldsymbol{\theta}) \propto \sigma_\varepsilon^{-(N-K+1)} \exp\Big[ -\frac{1}{2\sigma_\varepsilon^2} \sum_{i=K+1}^{N} \big(y_i - g_i(\boldsymbol{\theta})\big)^2 \Big]. \qquad (3)$$

For the inference on θ, Monte Carlo approximations are required since the posterior distribution (and its moments, quantiles, etc.) cannot be calculated explicitly. The most widely used family of methods is Markov chain Monte Carlo (MCMC), which aims to generate a Markov chain {θ^(1), θ^(2), ...} whose equilibrium distribution converges to the posterior distribution π(θ | y).

The next stage is to predict a driver flow ỹ in the future from the observed past data y. Bayesian prediction is based on the posterior predictive distribution, that is the distribution of ỹ conditional on the observed past data y.
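To make the likelihood concrete, here is a minimal sketch (our own helper functions, not the authors' code) of the one-step mean g_i(θ) and the log-posterior of Equation (3), with plain dictionaries for the multi-level coefficients; η_ORD = 1 is imposed by simply storing it in the eta dictionary.

```python
import numpy as np

def predict_flow(y_past, dt_past, dt_today, alpha, eta):
    """One-step mean g_i(theta) = alpha_{DT(i)} * sum_k eta_{DT(i-k)} y_{i-k}.
    y_past and dt_past hold the K previous flows and their day types."""
    return alpha[dt_today] * sum(eta[dt] * yk for dt, yk in zip(dt_past, y_past))

def log_posterior(y, dts, alpha, eta, sigma, K):
    """Log of the posterior in Equation (3), up to an additive constant,
    with the non-informative prior pi(theta) proportional to 1/sigma_eps."""
    N = len(y)
    ss = sum(
        (y[i] - predict_flow(y[i - K:i], dts[i - K:i], dts[i], alpha, eta)) ** 2
        for i in range(K, N)  # condition on the first K observed flows
    )
    return -(N - K + 1) * np.log(sigma) - ss / (2 * sigma**2)
```

In practice this log-posterior would be handed to an MCMC sampler rather than evaluated by hand; the sketch only fixes the bookkeeping of the indices and day types.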
Its density p(ỹ | y) is given by

$$p(\tilde{y} \mid \boldsymbol{y}) = \int_{\Theta} p(\tilde{y} \mid \boldsymbol{\theta}, \boldsymbol{y})\, \pi(\boldsymbol{\theta} \mid \boldsymbol{y})\, d\boldsymbol{\theta}. \qquad (4)$$

Since p(ỹ | y) is a compound probability distribution, we can easily simulate samples from this predictive distribution.

We begin by defining the daily driver flow recurrence with no day types. Equation (2) without day types simplifies to

$$y_i = \alpha \sum_{k=1}^{K} y_{i-k} + \varepsilon_i.$$

Since the multi-level coefficients for the day types are no longer present, this is indeed an autoregressive model. Algorithm 1 simulates a driver flow for a single day with no day types. The inputs are the day i, the coefficient α, the autoregression order K, and the error variance σ_ε². The output is a single driver flow for day i. The repeat loop ensures that the simulated driver flow is strictly positive. To simulate a sequence of N driver flows, we initialise the values generated by Algorithm 1 for i = 1, ..., K days, and then iterate Algorithm 1 sequentially for i = K+1, ..., N.

With Algorithm 1 defined, it is straightforward to define one with day types (i.e. Equation (2)) in Algorithm 2. This has similar inputs, namely the day i, the day type coefficients θ, the autoregression order K, and the error variance σ_ε², except that the scalar α is replaced with the vector coefficients θ. The output is the daily driver flow for day i, accounting for the day types of the days preceding day i.

Algorithm 1: Daily driver flow without day types
  procedure TrafficFlow(i, α, K, σ_ε)
    if i ≤ K then
      initialise y ← N(30, σ_ε²)
    else
      repeat
        y ← N(α Σ_{k=1}^{K} TrafficFlow(i − k, α, K, σ_ε), σ_ε²)
      until y > 0
    end
    return: y, driver flow for day i

Algorithm 2: Daily driver flow with day types
  procedure TrafficFlowDT(i, θ, K, σ_ε)
    if DT(i) == ORD then
      y ← TrafficFlow(i, α_ORD, K, σ_ε)
    else if DT(i) == SCH then
      y ← TrafficFlow(i, α_SCH η_SCH, K, σ_ε)
    else if DT(i) == PWE then
      y ← TrafficFlow(i, α_PWE η_PWE, K, σ_ε)
    end
    return: y, driver flow for day i

For the choice of an MCMC sampler, we use the NUTS (No-U-Turn) sampler [8]. The NUTS sampler is used by default in the pyStan package (https://pystan.readthedocs.io), a Python interface to Stan (https://mc-stan.org), which is a state-of-the-art platform for Bayesian computations, amongst other functionalities. To carry out the complicated integration and then a random draw from the posterior predictive distribution of daily driver flows p(ỹ | y) in Equation (4), we are only required to input into pyStan the prior π(θ), the likelihood L(y | θ) and the recurrence relation which generates the vector of simulated driver flows y (i.e. Algorithm 2). The latter automatically calculates, for the j-th iteration, j = 1, ..., J, the vector of N replicates drawn from the posterior predictive distribution:

$$\tilde{\boldsymbol{y}}^{(j)} = \big(\tilde{y}^{(j,1)}, \ldots, \tilde{y}^{(j,N)}\big) \sim \big(p(\tilde{y}^{(j,1)} \mid \boldsymbol{y}), \ldots, p(\tilde{y}^{(j,N)} \mid \boldsymbol{y})\big). \qquad (5)$$

The final output is the sequence of posterior prediction vectors Ỹ = {ỹ^(1), ..., ỹ^(J)}.
3. Bayesian Gamma regression of the passenger waiting times
From the perspective of a passenger in a carpooling service, a pertinent measure of the service quality is the waiting time for a driver to arrive after the carpooling request is made. In a stochastic matching carpooling service, the daily driver flow is the predominant factor in determining this waiting time, unlike for deterministic services where it plays a minor role. So the analysis of driver flow from the previous section plays an important role in passenger waiting time prediction, as illustrated in the flowchart in Figure 1. Established methods for waiting time prediction tend to be frequentist approaches based on Poisson driver arrivals, see [11, 12], which have encountered varying degrees of success. The aim of this section is to introduce more accurate Bayesian regression models which rely less on the Poissonian assumptions on the driver flows.

For simplicity, we assume that a passenger can only make one request at a time, for themselves only, at a carpooling meeting point, and that the drivers can embark only one passenger in their vehicle, in the order that the passenger requests are made.

For day i, let y_i be the daily traffic flow, and w_{i,1}, ..., w_{i,n_i} be the waiting times for the n_i passengers who make a carpooling request at times t_{i,1} < ··· < t_{i,n_i} respectively. Let t′_{i,j} be the driver arrival time for the passenger request at time t_{i,j}, i = 1, ..., N and j = 1, ..., n_i. The perceived waiting time for the passenger request at time t_{i,j} is

$$w^*_{i,j} = t'_{i,j} - t_{i,j}$$

and the pseudo waiting time is

$$w_{i,j} = t'_{i,j} - \max(t_{i,j}, t'_{i,j-1})$$

with the convention t′_{i,0} = t_{i,1} for the first passenger on day i.

Figure 3 illustrates the difference between the perceived and pseudo waiting times for the case of two passengers A, B who are both not the first passenger of the day. Passenger A arrives first and is the j-th, with j > 1, passenger of day i, and makes a carpooling request at time t_{i,j}. Passenger B arrives immediately afterwards and is the (j+1)-th passenger, with request time t_{i,j+1}. Suppose that there are at least two drivers en route to embark these passengers, who have not received any passenger requests before passenger A's request. The first driver arrives at t′_{i,j} > t_{i,j+1} (i.e. after passenger B's request time) and the second driver at t′_{i,j+1}. The perceived waiting time for passenger A is denoted w*_{i,j} = t′_{i,j} − t_{i,j} (the blue brace in Figure 3) and for passenger B it is w*_{i,j+1} = t′_{i,j+1} − t_{i,j+1} (the green brace). The pseudo waiting time for passenger A is w_{i,j} = w*_{i,j} since they are at the front of the queue, and for passenger B it is w_{i,j+1} = t′_{i,j+1} − t′_{i,j}, the grey brace. The pseudo waiting time for passenger B is the difference between their departure and the departure of the previous passenger A, and this is shorter than the perceived waiting time w*_{i,j+1}.

Figure 3. Perceived and pseudo waiting times for the case of two passengers at a carpooling meeting point. Passenger A is at the head of the queue so their perceived waiting time (blue brace) coincides with their pseudo waiting time. For passenger B, their pseudo waiting time (grey brace) is the difference between their departure and the departure of the previous passenger A, which is shorter than their perceived waiting time (green brace).
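The two definitions can be computed directly from the ordered request and driver-arrival times of a single day; below is a minimal sketch with helper functions of our own (times in minutes).

```python
def perceived_waiting_times(requests, arrivals):
    """w*_{i,j} = t'_{i,j} - t_{i,j}, for one day's ordered requests."""
    return [t_dep - t for t, t_dep in zip(requests, arrivals)]

def pseudo_waiting_times(requests, arrivals):
    """w_{i,j} = t'_{i,j} - max(t_{i,j}, t'_{i,j-1}), with the convention
    t'_{i,0} = t_{i,1} for the first passenger of the day."""
    w, prev_departure = [], requests[0]  # convention for the first passenger
    for t, t_dep in zip(requests, arrivals):
        w.append(t_dep - max(t, prev_departure))
        prev_departure = t_dep
    return w
```

For the two-passenger situation of Figure 3, with requests at minutes 0 and 1 and driver arrivals at minutes 5 and 8, the perceived times are [5, 7] while the pseudo times are [5, 3]: passenger B's pseudo waiting time is counted from passenger A's departure.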
We focus on the pseudo waiting times rather than the perceived waiting times in our model. From Figure 3, we observe that the perceived waiting times w*_{i,j} and w*_{i,j+1} for passengers A and B overlap, whereas the pseudo waiting times w_{i,j} = w*_{i,j} and w_{i,j+1}, by construction, do not overlap. The overlapping nature of the interval processes that determine the perceived waiting times renders the problem of their prediction non-identifiable, and that is why we have introduced the pseudo waiting times. Moreover, it is possible to predict the perceived waiting times from the pseudo waiting times, given known passenger behaviours, e.g. there is already a passenger who has been waiting for a car for t minutes before the arrival of another passenger. Thus if 'waiting time' is employed without any qualifier, it is assumed to be the pseudo waiting time.

Whilst the acquisition protocols for the driver GPS traces have been functioning well since the launch of the Lane carpooling service on 2018-05-15, this was not the case for the passenger waiting times, due to persistent technical operational difficulties for more than a year after the service launch. This leads to a highly challenging situation in which to deliver robust passenger waiting time predictions. We focus on the observed passenger pseudo waiting times covering the period from 2019-07-25 to 2020-02-17. In Figure 4 are the 1500 observed passenger pseudo waiting times in the Lane carpooling service, from 2019-10-22 to 2020-01-15 (we plot a sub-sample of the total period for better visualisation). This range of dates is different from those for the driver GPS traces since, due to operational technical difficulties from the service launch on 2018-05-15 until 2019-10-21, the passenger waiting times were not reliably recorded, so they are excluded from the analysis.
The operation of the Lane service is guaranteed only for work days (including some school holidays). Though this does not prevent passengers and drivers from using the service on other days, there are nonetheless far fewer carpooling requests for school holiday weekdays and there are none for the public holidays/weekends.

Figure 4. Observed pseudo waiting times (in minutes) in the Lane carpooling network from 2019-10-22 to 2020-01-15. The ordinary work days (ORD) are in orange, the school holidays (SCH) in green.
In the previous section, we implemented estimation of the driver flow at the daily level. For the waiting times, we wish to formulate predictions at a sub-daily resolution. Let the 24 hour period of a day be divided into S equal intervals I_1 < ··· < I_S. The fraction of the daily driver flow in each interval I_s, s = 1, ..., S is y_i β_s, where β_s ≥ 0 and Σ_{s=1}^{S} β_s = 1. Conditional on the traffic flow y_i and the passenger request times t_{i,j} ∈ I_s, we suppose that the pseudo waiting times w_{i,j} are independent Gamma random variables with parameters ν and β_s y_i:

$$w_{i,j} \mid (y_i, \boldsymbol{\beta}, t_{i,j} \in I_s) \sim \Gamma(\nu, \beta_s y_i) \quad \text{for } i = 1, \ldots, N \text{ and } j = 1, \ldots, n_i. \qquad (6)$$

This model specification ensures that the conditional mean pseudo waiting time is

$$\mathbb{E}[w_{i,j} \mid (y_i, \boldsymbol{\beta}, t_{i,j} \in I_s)] = \frac{\nu}{\beta_s y_i},$$

which is consistent with our intuition of the inverse relationship between the driver flow and the waiting time. Since β is constant for all i, the model assumes that the relative proportions of the traffic flow in the intervals I_1, ..., I_S remain unchanged for all driver flow values.

In Figure 5 are the mean observed daily traffic flows for each weekday from the Lane carpooling service, where the day is divided into 15 minute intervals (S = 96). Since the service operating hours are 06:00 – 09:00 and 16:00 – 19:00, there are few drivers outside them. Each dot in the figure is the mean number of drivers in each 15 minute interval for each weekday from 2018-05-15 to 2019-05-31. Each weekday has a similar shape, so this gives some empirical justification for supposing a constant β for all days.

Figure 5. Mean driver flows per 15 minute interval for each weekday for the Lane carpooling service, from 2018-05-15 to 2019-05-31. Monday is in blue, Tuesday in orange, Wednesday in green, Thursday in pink, Friday in violet.
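The Γ(ν, β_s y_i) model in Equation (6) uses a shape-rate parameterisation, whereas numpy's gamma sampler takes a scale, so the rate must be inverted. A quick numerical check of the conditional mean ν/(β_s y_i), with illustrative parameter values of our own:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_waiting_times(nu, beta_s, y_i, size):
    """Draw pseudo waiting times from Gamma(shape=nu, rate=beta_s * y_i).
    numpy parameterises by scale, so scale = 1 / rate."""
    rate = beta_s * y_i
    return rng.gamma(shape=nu, scale=1.0 / rate, size=size)

# Illustrative values: nu = 7 and beta_s * y_i = 2.25 give a mean waiting
# time of about 3.1 minutes; doubling the driver flow halves the mean.
w = sample_waiting_times(nu=7.0, beta_s=0.015, y_i=150.0, size=100_000)
```

The empirical mean of `w` agrees with ν/(β_s y_i) ≈ 3.11 minutes, illustrating the inverse flow-waiting time relationship stated above.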
A Dirichlet distribution is a natural choice as a prior distribution on the coefficients β: β ∼ Dir(ααα), where ααα = (α_1, ..., α_S) are the concentration parameters, since it imposes the constraint Σ_{s=1}^{S} β_s = 1 on the coefficients. The corresponding Dirichlet density is

$$p(\boldsymbol{\beta}) = \frac{1}{\mathrm{B}(\boldsymbol{\alpha})} \prod_{s=1}^{S} \beta_s^{\alpha_s - 1}$$

where $\mathrm{B}(\boldsymbol{\alpha}) = \prod_{s=1}^{S} \Gamma(\alpha_s) \big/ \Gamma\big(\sum_{s=1}^{S} \alpha_s\big)$ and $\Gamma(x) = \int_0^{\infty} u^{x-1} e^{-u}\, du$. Nonetheless, for the situations where we cannot assure that the sum of the β coefficients is always 1, a non-informative prior (i.e. the Lebesgue measure on R^S_+) is preferred. The GPS driver traces collected by Ecov from the mobile application represent an incomplete subset of the complete driver population of interest, as they exclude (a) the drivers who are registered in the Lane carpooling service but do not share their geolocation with Ecov, and (b) the drivers who are currently not registered but are potential participants in the carpooling service. So in this case of an incomplete driver population, the β coefficients do not necessarily sum to 1.

The β vector allows us to rebuild the temporal distribution of the traffic flow within a day from an aggregated daily driver flow. In the cases when the driver flow can be observed at a sub-daily level, we still prefer to apply our multi-level moving average model in Equation (2) to predict a daily driver flow as (i) it improves the robustness and (ii) it is straightforward to change the temporal resolution I_1, ..., I_S of the waiting time predictions without having to re-generate the driver flows.

Let t_i = (t_{i,1}, ..., t_{i,n_i}) be the vector of the n_i observed passenger carpooling request times for day i ∈ {1, ..., N}, t = (t_1, ..., t_N) be all observed passenger carpooling request times, and likewise w_i for the passenger pseudo waiting times of day i, with w = (w_1, ..., w_N) for all days. Also, let the vector y = (y_1, ..., y_N) be the observed driver flows for all days.
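A Dirichlet draw satisfies the simplex constraint by construction, which is easy to verify numerically; the symmetric concentration ααα = (1, ..., 1) below is used purely for illustration, not as the prior chosen in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

S = 8                          # number of sub-daily intervals (illustrative)
alpha = np.ones(S)             # illustrative symmetric concentration
beta = rng.dirichlet(alpha)    # one draw of the S regression coefficients

# Every draw lies on the simplex: beta_s >= 0 and sum_s beta_s = 1.
```

This is exactly the constraint that the flat prior on R^S_+ relaxes when the observed traces cover only an incomplete driver population.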
It is reasonable to assume that the waiting times are mutually independent conditionally on (β, y, t). The conditional likelihood of the passenger waiting times is thus given by the joint density of w given (β, y, t):

$$L(\boldsymbol{w} \mid \boldsymbol{\beta}, \boldsymbol{y}, \boldsymbol{t}) = \prod_{i=1}^{N} p(\boldsymbol{w}_i \mid \boldsymbol{y}, \boldsymbol{t}, \boldsymbol{\beta}) = \prod_{i=1}^{N} p(\boldsymbol{w}_i \mid y_i, \boldsymbol{t}_i, \boldsymbol{\beta})$$

since the conditional density of w_i given the driver flow y_i, the passenger carpooling request times t_i and the coefficient β is

$$p(\boldsymbol{w}_i \mid \boldsymbol{\beta}, y_i, \boldsymbol{t}_i) = \prod_{s=1}^{S} \prod_{\{j : t_{i,j} \in I_s\}} \frac{(\beta_s y_i)^{\nu}}{\Gamma(\nu)}\, w_{i,j}^{\nu-1} \exp(-\beta_s y_i w_{i,j}).$$

Then we obtain the posterior density of β, using a non-informative prior on β, as

$$\pi(\boldsymbol{\beta} \mid \boldsymbol{w}) \propto \prod_{i=1}^{N} \prod_{s=1}^{S} \prod_{\{j : t_{i,j} \in I_s\}} \frac{(\beta_s y_i)^{\nu}}{\Gamma(\nu)}\, w_{i,j}^{\nu-1} \exp(-\beta_s y_i w_{i,j})\, \mathbf{1}\{\boldsymbol{\beta} \in \mathbb{R}_+^S\}.$$

Let w̃_s be the pseudo waiting time for a future, unobserved day for a passenger who makes a carpooling request in the time interval I_s. If we observe a new daily driver flow ỹ, then the posterior predictive distribution of the waiting time w̃_s is

$$p(\tilde{w}_s \mid \tilde{y}, \boldsymbol{w}) = \int p(\tilde{w}_s \mid \tilde{y}, \beta_s)\, \pi(\beta_s \mid \boldsymbol{w})\, d\beta_s. \qquad (7)$$

If ỹ is not observed, then the predictive distribution becomes

$$p(\tilde{w}_s \mid \boldsymbol{w}) = \int_0^{\infty} \Big[ \int p(\tilde{w}_s \mid \tilde{y}, \beta_s)\, \pi(\beta_s \mid \boldsymbol{w})\, d\beta_s \Big]\, p(\tilde{y} \mid \boldsymbol{y})\, d\tilde{y} \qquad (8)$$

where p(ỹ | y) is defined in Equation (4). Equation (7) applies when we wish to make a prediction for the current day where we have observed a driver flow, and Equation (8) for a future day where we have not yet observed the driver flow. Finally we wish to predict the waiting times for all the time intervals I_1, ..., I_S, so we collate them into an S-vector (p(w̃_1 | ỹ, w), ..., p(w̃_S | ỹ, w)), or analogously with p(w̃_s | w), s = 1, ..., S.
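Under the flat prior on R^S_+, the log-posterior of β is the Gamma log-likelihood accumulated over days, intervals and requests. The sketch below (our own data layout, standard library only) makes the triple product explicit; a sampler such as NUTS would then explore this function.

```python
import math

def log_posterior_beta(beta, nu, y, waits_by_interval):
    """Log of pi(beta | w) up to an additive constant, flat prior on R^S_+.
    waits_by_interval[i][s] lists the pseudo waiting times w_{i,j} of the
    requests of day i whose request times fall in interval I_s; y[i] is
    the driver flow of day i."""
    if any(b <= 0 for b in beta):
        return float("-inf")  # outside the support of the prior
    lp = 0.0
    for i, day in enumerate(waits_by_interval):
        for s, waits in enumerate(day):
            rate = beta[s] * y[i]  # Gamma rate beta_s * y_i
            for w in waits:
                lp += (nu * math.log(rate) - math.lgamma(nu)
                       + (nu - 1.0) * math.log(w) - rate * w)
    return lp
```

For example, a single day, a single interval with β_1 = 1, ν = 1, y_1 = 1 and one waiting time w = 2 gives a log-density of −2, matching the exponential special case of the Gamma density.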
Algorithm 3 simulates the passenger pseudo waiting times in Equation (6) for a sequence of days. The inputs are the number of days N, the day type coefficients θ, the autoregression order K, the error variance σ_ε², the Gamma shape parameter ν, the S regression parameters β, and the number of replicates of waiting times J. The output is J replicates of the pseudo waiting times for each time interval I_s, s = 1, ..., S, for each day i = 1, ..., N. The TrafficFlowDT procedure (Algorithm 2) is called outside of the replicates loop so that each day has one driver flow, and all waiting times are simulated from this same daily driver flow. An iteration of the nested loop in Algorithm 3 results in a single N × S matrix of pseudo waiting times drawn from the appropriate Gamma distributions:

$$\boldsymbol{W}^{(j)} \sim \begin{pmatrix} \Gamma(\nu, \beta_1 y_1) & \cdots & \Gamma(\nu, \beta_S y_1) \\ \vdots & & \vdots \\ \Gamma(\nu, \beta_1 y_N) & \cdots & \Gamma(\nu, \beta_S y_N) \end{pmatrix}.$$

These are then iterated J times and collated into the sequence which is the output of Algorithm 3:

$$\mathcal{W} = \{\boldsymbol{W}^{(1)}, \ldots, \boldsymbol{W}^{(J)}\} = \left\{ \begin{pmatrix} w^{(1)}_{1,1} & \cdots & w^{(1)}_{1,S} \\ \vdots & & \vdots \\ w^{(1)}_{N,1} & \cdots & w^{(1)}_{N,S} \end{pmatrix}, \ldots, \begin{pmatrix} w^{(J)}_{1,1} & \cdots & w^{(J)}_{1,S} \\ \vdots & & \vdots \\ w^{(J)}_{N,1} & \cdots & w^{(J)}_{N,S} \end{pmatrix} \right\}.$$

Algorithm 3: Passenger pseudo waiting times
  procedure WaitingTime(N, θ, K, σ_ε, ν, β, J)
    S ← Len(β)
    for i in 1, ..., N do
      Y[i] ← TrafficFlowDT(i, θ, K, σ_ε)
    end
    for j in 1, ..., J do
      for i in 1, ..., N do
        for s in 1, ..., S do
          W^(j)[i, s] ← Γ(ν, β_s Y[i])
        end
      end
    end
    return: W^(1), ..., W^(J), sequence of waiting time matrices

As Equation (7) generates only a single posterior prediction w̃_s for a time interval I_s, we collate these w̃_s for s = 1, ..., S into an S-vector, and in turn collate N of these S-vectors of posterior predictive distributions row-wise into an N × S matrix.
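Once the daily flows are available, Algorithm 3 reduces to nested Gamma draws. In this numpy sketch of our own, the flows y are passed in directly, standing in for the output of TrafficFlowDT, so that all J replicates share the same daily flows as the pseudocode prescribes.

```python
import numpy as np

rng = np.random.default_rng(3)

def waiting_time_replicates(y, nu, beta, J):
    """Algorithm 3 (sketch): J replicate N x S matrices of pseudo waiting
    times, all replicates sharing the same daily flows y."""
    N, S = len(y), len(beta)
    reps = []
    for _ in range(J):
        W = np.empty((N, S))
        for i in range(N):
            for s in range(S):
                rate = beta[s] * y[i]  # Gamma rate beta_s * y_i
                W[i, s] = rng.gamma(shape=nu, scale=1.0 / rate)
        reps.append(W)
    return reps
```

Each element of the returned list is one N × S matrix W^(j), matching the collated output W = {W^(1), ..., W^(J)} above.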
To carry out the complicated integration, and then a random draw, from the posterior predictive distribution $p(\tilde{w}_s \mid \tilde{y}, \boldsymbol{w})$ in Equation (7), we are only required to input into pyStan the posterior predicted value of the driver flow $\tilde{y}$ (Equation (4)), the recurrence relation which generates the vector of simulated driver flows $\boldsymbol{y}$ (Algorithm 2), and the recurrence relation which generates the vector of simulated passenger pseudo waiting times (Algorithm 3). pyStan then automatically simulates from this $N \times S$ matrix distribution:
$$\tilde{\boldsymbol{W}}^{(j)} = \begin{pmatrix} \tilde{w}^{(j)}_{1,1} & \cdots & \tilde{w}^{(j)}_{1,S} \\ \vdots & \ddots & \vdots \\ \tilde{w}^{(j)}_{N,1} & \cdots & \tilde{w}^{(j)}_{N,S} \end{pmatrix} \sim \begin{pmatrix} p(\tilde{w}^{(j)}_{1,1} \mid \tilde{y}, \boldsymbol{w}) & \cdots & p(\tilde{w}^{(j)}_{1,S} \mid \tilde{y}, \boldsymbol{w}) \\ \vdots & \ddots & \vdots \\ p(\tilde{w}^{(j)}_{N,1} \mid \tilde{y}, \boldsymbol{w}) & \cdots & p(\tilde{w}^{(j)}_{N,S} \mid \tilde{y}, \boldsymbol{w}) \end{pmatrix}$$
for the $j$-th iteration, $j = 1, \ldots, J$. The sequence of the matrices of replicated posterior predictions is $\tilde{\mathcal{W}} = \{\tilde{\boldsymbol{W}}^{(1)}, \ldots, \tilde{\boldsymbol{W}}^{(J)}\}$.
4. Model validation
Since the Lane carpooling data set dates from 2018, for the simulations we set the initial day $i = 1$ to be 2018-01-01, and the work days (ORD), school holidays (SCH) and public holidays/weekends (PWE) to be those observed in Lyon, France. For the simulation algorithms, the parameters are: the number of days is $N = 365$, the day type coefficients are $\boldsymbol{\theta} = (0.\ldots)$, the autoregression order is $K = 3$, the error variance is $\sigma_\varepsilon = 5$, the 24-hour period is divided into $S = 8$ equal intervals of 3 hours, the first Gamma shape parameter is $\nu = 7$, the Gamma regression parameters are $\boldsymbol{\beta} = (0.012, 0.01, 0.011, 0.013, 0.018, 0.016, 0.017, 0.019)$, and the number of replicates is $J = 10$, which corresponds to the number of observed waiting times per time interval. These parameter values produce simulated data which are comparable to those observed in the Lane carpooling service.

We generate one simulated data set of $N = 365$ days, each with one daily driver flow $y_i$, $i = 1, \ldots, N$ (Algorithm 2), and $J = 10$ passenger pseudo waiting time $N \times S$ matrices $\mathcal{W} = \{\boldsymbol{W}^{(1)}, \ldots, \boldsymbol{W}^{(J)}\}$ (Algorithm 3), and the corresponding $N \times S$ posterior prediction matrices to form $\tilde{\mathcal{W}} = \{\tilde{\boldsymbol{W}}^{(1)}, \ldots, \tilde{\boldsymbol{W}}^{(J)}\}$. The data from these $N = 365$ days from 2018-01-01 to 2018-12-31 form the reference training data set. With the same parameters, we simulate a further $\tilde{N} = 5$ days (2019-01-01 to 2019-01-05) of data, which form the oracle test data set of $\tilde{N} \times S$ matrices $\mathcal{W}_{\text{test}} = \{\boldsymbol{W}^{(1)}_{\text{test}}, \ldots, \boldsymbol{W}^{(J)}_{\text{test}}\}$. Furthermore, from the training data only (i.e. without taking $\mathcal{W}_{\text{test}}$ into account), for these same extra $\tilde{N}$ days, we generate the corresponding $\tilde{N} \times S$ posterior prediction matrices to form $\tilde{\mathcal{W}}_{\text{test}} = \{\tilde{\boldsymbol{W}}^{(1)}_{\text{test}}, \ldots, \tilde{\boldsymbol{W}}^{(J)}_{\text{test}}\}$.
For brevity we omit the equivalent comparison of the driver flows and focus on the passenger waiting times for these simulated data; we make a more thorough comparison of both driver flows and passenger waiting times for the empirical data in the sequel. Whilst the daily traffic flows can be summarised by a single scalar, for the passenger waiting times we focus on the temporal profiles of the waiting times over the $S = 8$ periods of a day. In Figure 6 are the quantiles of the waiting times for all time intervals $I_1, \ldots, I_S$, for all days $i = 1, \ldots, \tilde{N}$ in the test phase. The grey box plots are of $\mathcal{W}_{\text{test},i,s}$ and the light, medium and dark purple circles superimposed over the box plots are the 50%, 75% and 95% quantiles of $\tilde{\mathcal{W}}_{\text{test},i,s}$. Recall that for operational purposes of the Lane carpooling service, short term prediction for the coming week is sufficient. This is verified by the closeness of the quantiles of the posterior predicted pseudo waiting times to their observed values for all $\tilde{N} = 5$ prediction days.

From a passenger's point of view, whilst the magnitude of the waiting time is important as a perception of the service quality, it is equally important that the posterior predicted waiting times be as close as possible to the observed ones, whatever their magnitude. For example, suppose that a driver arrives 12 minutes after a passenger makes a carpooling request. In this case, a prediction of 15 minutes is better than one of 5 minutes, since the former is closer to, but longer than, the observed waiting time. Therefore we propose the following metric to measure these discrepancies for a given threshold $\delta$:
$$\mathrm{PE}(p(\mathcal{W}_{\text{test}}), p(\tilde{\mathcal{W}}_{\text{test}}); \delta) = \frac{1}{J \tilde{N} S} \sum_{i=1}^{\tilde{N}} \sum_{s=1}^{S} \sum_{j=1}^{J} \mathbb{1}\{|\bar{\tilde{w}}_{\text{test},i,s} - w^{(j)}_{\text{test},i,s}| < \delta\} \tag{9}$$
where $\bar{\tilde{w}}_{\text{test},i,s} = \frac{1}{J} \sum_{j=1}^{J} \tilde{w}^{(j)}_{\text{test},i,s}$ is the mean of the posterior predicted waiting time distribution for day $i$, $i = 1, \ldots, \tilde{N}$, and time interval $I_s$, $s = 1, \ldots, S$.
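The PE metric in Equation (9) can be sketched as follows; the input arrays here are random placeholders, not the simulated carpooling data.

```python
import numpy as np

def pe_metric(W_obs, W_pred, delta):
    """Sketch of the PE metric in Equation (9): the proportion of waiting
    times lying within delta of the posterior predictive mean. Both inputs
    are arrays of shape (J, N, S)."""
    w_bar = W_pred.mean(axis=0)                         # mean over the J replicates
    return (np.abs(w_bar[None, :, :] - W_obs) < delta).mean()

# random placeholder data with J = 10 replicates, N = 5 days, S = 8 intervals
rng = np.random.default_rng(2)
W_obs = rng.gamma(7, 1.0, size=(10, 5, 8))
W_pred = rng.gamma(7, 1.0, size=(10, 5, 8))
print(pe_metric(W_obs, W_pred, delta=8.0))
```

By construction the metric lies in $[0, 1]$ and is non-decreasing in $\delta$, which is why the curves in Figures 7 and 13 level off at 1.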
This metric, as a function of $\delta$, is illustrated in Figure 7, during both the training phase $\mathrm{PE}(p(\mathcal{W}), p(\tilde{\mathcal{W}}); \delta)$ (blue curve) and the test phase $\mathrm{PE}(p(\mathcal{W}_{\text{test}}), p(\tilde{\mathcal{W}}_{\text{test}}); \delta)$ (red curve). The test predictions are more accurate than the training predictions for small values of $\delta < 2$ minutes, the training predictions are more accurate for $\delta$ between 2 and 8 minutes, and after 8 minutes both curves level off at 1. Thus the posterior predictions from our proposed Bayesian hierarchical model have robust prediction performance.

Our objective is to employ the two-stage Bayesian hierarchical model (Algorithms 2 and 3) to predict a passenger pseudo waiting time (distribution) for hourly intervals $I_s$, $s = 1, \ldots, S$, with $S = 24$. For the Lane carpooling service, it is sufficient to provide the upcoming week's predictions at the beginning of the week. The predicted daily driver flows from the first hierarchical model are input into the second hierarchical model to produce predicted passenger pseudo waiting times. The latter are then compared to the observed pseudo waiting times of the passenger carpooling requests from the same period.

We have the GPS traces (approximately 5 000 traces) for the 382 days from 2018-05-15 (service launch) to 2019-05-31 (beginning of the following summer holiday period), which we divide into different training and test data sets of varying sizes depending on the objectives of the analysis. We first apply the preprocessing, as outlined in [11], to convert the driver GPS traces into a

Figure 6. Predictions of pseudo waiting times for 3-hourly intervals, for all $\tilde{N}$ prediction days. The observed waiting times (Observed PWT) are the grey box plots, and the 50%, 75% and 95% quantiles of the posterior predicted waiting times (PP PWT) are the light, medium and dark purple circles.

Figure 7. Evolution of the PE metric of the observed and posterior predicted pseudo waiting times, as a function of the threshold $\delta$. The blue curve is for the training phase, and the red curve for the test phase.

data format suitable for computing the daily driver flows $y_{\text{train},i}$, $i = 1, \ldots, N$, for the training phase and $y_{\text{test},i}$, $i = 1, \ldots, \tilde{N}$, for the test phase. We vary $N$ whilst maintaining $\tilde{N} = 7$ to test various scenarios in different periods of the year. To investigate the prediction accuracy of our proposed models, we divide our complete data set into 6 different pairs of training phases with varying $N$ (starting from 2018-05-15) and test phases, always with $\tilde{N} = 7$. In each case we select a test week with certain characteristics, as outlined in Table 1. The first column gives the dates (inclusive) of the test week, the second column the day types in the test week, the third column the dates of the training weeks, starting from 2018-05-15 and ending on the day before the test week, and the fourth column the number of training days ($N$). Table 1.
Training-test scenarios for daily driver flows. The first column gives the dates of the test week ($\tilde{N} = 7$), the second the day types in the test week, the third the dates of the training weeks, and the fourth the number of training days $N$.

Test week | Test week day types | Training weeks | $N$

In addition to our proposed Bayesian hierarchical multi-level (BHML) predictions, we compute predictions from two other models: a baseline frequentist model (BASE) and the Bayesian Prophet model (PROP). The baseline frequentist model has multiple levels like our BHML, but without the Bayesian moving average structure. To account for public/school holidays, as proposed by [7], if day $i$ is not a school/public holiday then the average is calculated over all previous days with the same day of week as day $i$; and if day $i$ is a public holiday, then the average is over all previous public holidays. That is,
$$y_i = \frac{1}{|T_{d}(i)|} \sum_{k \in T_{d}(i)} y_{i-k} \, \mathbb{1}\{\mathrm{DT}'(i) \neq \mathrm{HOL}\} + \frac{1}{|T_{\mathrm{HOL}}(i)|} \sum_{k \in T_{\mathrm{HOL}}(i)} y_{i-k} \, \mathbb{1}\{\mathrm{DT}'(i) = \mathrm{HOL}\} + \varepsilon_i \tag{10}$$
where we collapse the day types function DT to $\mathrm{DT}'(i) = \mathrm{ORW}$ if $i$ is an ordinary work day or a weekend day, and $\mathrm{DT}'(i) = \mathrm{HOL}$ if $i$ is a school or public holiday; DN is the day-of-week number function, with $\mathrm{DN}(i) = 1$ if $i$ is a Monday, $\mathrm{DN}(i) = 2$ if $i$ is a Tuesday, etc.; $T_d(i) = \{k : k < i, \mathrm{DN}(i-k) = d\}$ is the set of prior days with the same day of week as day $i$; and $T_{\mathrm{HOL}}(i) = \{k : k < i, \mathrm{DT}'(i-k) = \mathrm{HOL}\}$ is the set of public holidays before day $i$.

The Bayesian Prophet model, devised by [4, 14], is an additive model with three components:
$$y_i = g(i) + s(i) + h(i) + \varepsilon_i \tag{11}$$
where $g(i)$ is the trend, $s(i)$ is the seasonality, and $h(i)$ is the holidays effect.
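The baseline average in Equation (10) can be sketched as below; the `'ORW'`/`'HOL'` day-type encoding and the day indexing (day of week repeating every 7 days) are illustrative assumptions.

```python
import numpy as np

def base_predict(y, day_type, i):
    """Sketch of the frequentist baseline in Equation (10): y holds the past
    daily flows, day_type[k] is the collapsed day type DT'(k) in
    {'ORW', 'HOL'}, and days are indexed so the day of week repeats every
    7 days, i.e. DN(i - k) = DN(i) whenever (i - k) % 7 == 0."""
    if day_type[i] == "HOL":
        # average over all previous public/school holidays, T_HOL(i)
        prior = [y[k] for k in range(i) if day_type[k] == "HOL"]
    else:
        # average over all previous days with the same day of week, T_d(i)
        prior = [y[k] for k in range(i) if (i - k) % 7 == 0]
    return float(np.mean(prior)) if prior else float("nan")
```

For instance, predicting a Monday averages all previous Mondays in the training window, while predicting a holiday averages all previous holidays regardless of weekday.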
The linear trend is $g(i) = (k + \boldsymbol{a}(i)^{\top} \boldsymbol{\delta}) i + (m + \boldsymbol{a}(i)^{\top} \boldsymbol{\gamma})$ where $k$ is the growth rate, $m$ is the offset, $\boldsymbol{a}$ is the change point indicator, $\boldsymbol{\delta}$ is the growth rate adjustment, and $\boldsymbol{\gamma}$ is the piece-wise continuity adjustment which ensures that $g$ is continuous. The seasonality component is a Fourier decomposition $s(i) = \sum_{\ell=1}^{L} [\alpha_\ell \cos(2\pi \ell i / P) + \beta_\ell \sin(2\pi \ell i / P)]$ where $(\alpha_\ell, \beta_\ell)$ are the Fourier coefficients, $L$ is the number of Fourier coefficients and $P$ is the period (in days). The holiday effect is $h(i) = \boldsymbol{h}(i)^{\top} \boldsymbol{\kappa}$ where, say, $\boldsymbol{h}(i) = (\mathbb{1}\{\mathrm{DT}(i) = \mathrm{SCH}\}, \mathbb{1}\{\mathrm{DT}(i) = \mathrm{PWE}\})$ is the vector of indicator variables of the type of holiday of day $i$, and $\boldsymbol{\kappa}$ is the weight vector, usually equal to the all-ones vector. [14] provide the details of the construction of the change point function $\boldsymbol{a}(t)$ and the continuity adjustment parameter $\boldsymbol{\gamma}$. These authors set the number of Fourier coefficients to $L = 10$ for yearly cycles and $L = 3$ for weekly cycles. What remains is to estimate the trend growth rate $k$, the offset $m$, the growth rate adjustments $\boldsymbol{\delta}$ and the Fourier coefficients $\boldsymbol{\alpha}, \boldsymbol{\beta}$.

For the training phase of dates 2018-05-15 to 2019-05-19 (test scenario …), we compute the estimated daily driver flows $y_i$ from the Bayesian hierarchical multi-level model BHML, as well as the corresponding predictions/estimations from the frequentist baseline model BASE and the Bayesian Prophet model PROP. In Figure 8 is the evolution of the goodness-of-fit of the three different models for daily driver flow estimation (leaving out the first week, 2018-05-15 to 2018-05-22, which serves as the 'burn-in' period). The goodness-of-fit is measured by the MSE of the estimated and the observed daily driver flows, aggregated per week. Visually, the BHML tends to have the best goodness-of-fit (smallest MSE) for most weeks.
The sums of these weekly MSEs are: BASE 421.9, PROP 816.9, BHML 297.2, which confirms our visual impression that the BHML achieves the best overall estimation accuracy. Figure 8.
Evolution of the goodness-of-fit of the daily driver flow estimations over the training period (2018-05-15 to 2019-05-19, test scenario …).
We can be confident that the Bayesian hierarchical multi-level moving average model has good estimation accuracy/goodness-of-fit, but this good performance does not necessarily translate to prediction. This non-transitivity of estimation and prediction performance is discussed in [10]. So for each scenario described in Table 1, we compute the BHML, BASE and PROP models for the training phase, and then the days of the test phase are input into each of these models to yield the daily driver flow predictions. These predictions are presented in Figure 9: the Bayesian hierarchical multi-level BHML in purple, the frequentist baseline BASE in black, the Bayesian Prophet PROP in green; and the observed daily driver flows in blue. The PROP predictions are mostly too low on week days and too high on weekends for all six test weeks in comparison to the observed driver flows, whilst the BASE and BHML appear to have comparable performance.

In Figure 10 are the MSEs between the observed and predicted daily driver flows: the frequentist baseline BASE in black, the Bayesian Prophet PROP in green, and the Bayesian hierarchical multi-level BHML in purple. PROP is uniformly the worst of these three models for all test weeks. BASE is the best for test scenario …

Figure 9. Predictions of daily driver flows for the six test week scenarios. Observed daily driver flows are in blue. Bayesian hierarchical multi-level BHML are in purple, frequentist baseline BASE in black, and Bayesian Prophet PROP in green.
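The weekly MSE aggregation used for the goodness-of-fit comparisons in Figures 8 and 10 can be sketched as below; the grouping into consecutive 7-day blocks is an assumption about how the weekly aggregation is performed.

```python
import numpy as np

def weekly_mse(y_obs, y_hat):
    """Sketch of the weekly-aggregated goodness-of-fit: the MSE between
    observed and estimated daily driver flows, grouped into consecutive
    7-day weeks (any trailing partial week is dropped)."""
    y_obs, y_hat = np.asarray(y_obs, float), np.asarray(y_hat, float)
    n_weeks = y_obs.size // 7
    sq_err = (y_obs[: 7 * n_weeks] - y_hat[: 7 * n_weeks]) ** 2
    return sq_err.reshape(n_weeks, 7).mean(axis=1)  # one MSE per week
```

Summing the returned vector gives the overall figures reported above (e.g. 297.2 for BHML).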
For the case study on simulated data in Section 4.1, we could generate the oracle simulated temporal profiles to which the posterior predicted profiles could be compared. For the Lane carpooling service, since it is still in an embryonic phase of operation, there are insufficient passenger carpooling requests to robustly compute observed temporal profiles over an entire day, especially for school holidays (SCH) and public holidays/weekends (PWE), as shown in Figure 4. So it is only possible to form predictions for weekdays (ORD). In Figure 11 are the box plots of the weekly number of observed pseudo waiting times for each hourly interval for weekdays from 2019-07-25 to 2020-02-17, though the effective end date is 2020-02-14 since the last two days are weekend days. Although there are in total $S = 24$ hourly intervals, only the 6 which correspond to the operating hours of the Lane service (06:00 – 09:00 and 16:00 – 19:00) contain any observed passenger waiting times.

Figure 10. Prediction MSE of the daily driver flow predictions for the six test week scenarios. Bayesian hierarchical multi-level BHML are in purple, frequentist baseline BASE in black, and Bayesian Prophet PROP in green.
Figure 11.
The box plots of the weekly number of observed pseudo waiting times for each hourly interval for weekdays from 2019-07-25 to 2020-02-17.
There are at most around 50-60 observed waiting times per hourly interval per week, which is not sufficient to infer robustly their distribution within each interval. To remedy this data sparsity, we aggregate a moving window of test data: for time interval $I_s$ on day $i$, we combine its observed pseudo waiting times $w_{\text{test},i,s}$ with those for the same time interval from the previous 5 weeks with the same day of week $\mathrm{DN}(i)$ and same day type $\mathrm{DT}(i)$, i.e. $\{w_{\text{test},i-k,s} : \mathrm{DT}(i-k) = \mathrm{DT}(i), \mathrm{DN}(i-k) = \mathrm{DN}(i), k = 1, \ldots, 35\}$. These days added to the test data are correspondingly removed from the training data. If we aggregate the final 5 weeks into a single test phase, then the training-test scenario is as outlined in Table 2. Thus we make predictions only for the last test week (2020-02-10 – 2020-02-17), so the number of prediction weekdays remains $\tilde{N} = 5$. Table 2.
Training-test scenario for passenger waiting times. The first column gives the dates of the test weeks, the second the number of observed passenger waiting times in the test weeks, the third the dates of the training weeks, and the fourth the number of observed passenger waiting times in the training weeks.
Test weeks | Observed waiting times (test) | Training weeks | Observed waiting times (training)

Figure 12.
Predictions of passenger pseudo waiting times for hourly time intervals. The observed waiting times are the grey box plots, and the 50%, 75% and 95% quantiles of the posterior predicted waiting times are the light, medium and dark purple circles.
Lastly we consider our custom PE metric on the BHML posterior predictions:
$$\mathrm{PE}(p(\mathcal{W}), p(\tilde{\mathcal{W}}); \delta) = \frac{1}{\tilde{N} S} \sum_{i=1}^{\tilde{N}} \sum_{s=1}^{S} \mathbb{1}\{|\bar{\tilde{w}}_{i,s} - w_{i,s}| < \delta\} \tag{12}$$
where $\bar{\tilde{w}}_{i,s} = \frac{1}{J} \sum_{j=1}^{J} \tilde{w}^{(j)}_{i,s}$ is the mean of the posterior predicted waiting time distribution for day $i$ and time interval $I_s$. This metric, as a function of the threshold $\delta$, is illustrated in Figure 13, during both the training phase $\mathrm{PE}(p(\mathcal{W}), p(\tilde{\mathcal{W}}); \delta)$ (blue curve) and the test phase $\mathrm{PE}(p(\mathcal{W}_{\text{test}}), p(\tilde{\mathcal{W}}_{\text{test}}); \delta)$ (red curve). In contrast to the conclusion from Section 4.1, the red curve dominates the blue curve, which implies that the posterior predictions are more accurate during the test phase than in the training phase. This gives us confidence that the BHML posterior predictions are robust and are not based on over-fitting of the training data. Figure 13.
Evolution of the PE metric of the observed and BHML posterior predicted passenger pseudo waiting times, as a function of the threshold $\delta$. The blue curve is for the training phase, and the red curve for the test phase.
5. Conclusion
The main contribution of this paper is the transformation of daily driver flows to passenger waiting times for hourly intervals using a nested two-stage Bayesian hierarchical model for an operational carpooling service. The first stage is a multi-level moving average model of the daily driver flows, whose coefficients $\boldsymbol{\theta}$ have levels depending on whether the current day is a work day, a school holiday or a public holiday/weekend. The second stage is a Gamma regression whose response variables are the hourly passenger waiting times, whose covariates are the daily driver flows from the first stage, and whose regression coefficients $\boldsymbol{\beta}$ have as many components as the number of hourly intervals. The predicted driver flows and passenger waiting times are robust going into the future, since we demonstrated that they are not due to over-fitting of the observed data from an operational carpooling service.

We have focused on the mathematically simpler case of pseudo waiting times. The Bayesian hierarchical framework that we have employed is able to generalise to the more difficult, but more realistic, case of perceived waiting times. Suppose that the pseudo waiting times for two consecutive passengers are $w_1, w_2$, with request times $t_1, t_2$. Then the perceived waiting time of the second passenger is $w^* = w_2 + (w_1 - \zeta \mid w_1 > \zeta)$ with $\zeta = t_2 - t_1$. It would be intractable to deduce a closed form for the distribution of $w^*$. Since we are able to simulate from the conditional posterior predictive distribution of the first pseudo waiting time remainder $w_1 - \zeta \mid (w_1 > \zeta)$ and from the unconditional second pseudo waiting time, it is feasible to simulate the second perceived waiting time, assuming that these two components of $w^*$ are independent. Since our primary data source is the driver GPS traces, we focused on modelling the driver arrival processes and assumed the passenger arrivals to be non-random.
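Under the independence assumption just stated, the second passenger's perceived waiting time can be simulated by rejection; the Gamma parameters and the gap $\zeta$ below are hypothetical stand-ins for posterior predictive draws.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical sketch: w_star = w2 + (w1 - zeta | w1 > zeta), where zeta is
# the gap between the two consecutive request times. The conditional
# first-passenger remainder is simulated by rejection, and paired with an
# independent second pseudo waiting time.
nu, rate, zeta, n = 7, 0.25, 10.0, 100_000

w1 = rng.gamma(nu, 1.0 / rate, size=n)   # first pseudo waiting time
w2 = rng.gamma(nu, 1.0 / rate, size=n)   # second pseudo waiting time

# remainder of the first wait, conditional on the first passenger still waiting
remainder = w1[w1 > zeta] - zeta
# pair each conditional remainder with an independent second pseudo waiting time
w_star = w2[: remainder.size] + remainder
```

In practice `w1` and `w2` would be draws from the posterior predictive distributions of Equations (7) or (8) rather than fixed-parameter Gamma samples.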
In a Bayesian hierarchical framework, it is straightforward to allow the passenger arrivals to also be a random process, and to analyse the resulting pseudo and perceived passenger waiting times.

Finally, we made the assumption that a driver embarks only one passenger at a time, whereas it is of intense operational interest for a carpooling provider to encourage different passengers to share a single carpooling ride, as maximising the occupancy rate of private vehicles is a key objective in the progress towards carbon-neutral societies. This passenger sharing probability can also be analysed within the Bayesian hierarchical framework.

For future work, we plan to construct an informative prior for new networks of lines with small amounts of available data. Another idea is to integrate the size and number of the intervals $I_1, \ldots, I_S$ into the model, and to find their optimal values by Bayesian inference.

Acknowledgements
The authors thank Ecov for providing the data sets of driver GPS traces and passenger waiting times. The authors also thank Constant Bridon, Safa Fennia and Madeleine Zuber from Ecov, and Gérard Biau from Sorbonne University for their feedback.
References

[1] Bao, Y., F. Xiao, Z. Gao, and Z. Gao (2017). Investigation of the traffic congestion during public holiday and the impact of the toll-exemption policy. Transportation Research Part B 104, 58–81.
[2] Congdon, P. (2014). Applied Bayesian Modelling. John Wiley & Sons.
[3] Ding, A., X. Zhao, and L. Jiao (2002). Traffic flow time series prediction based on statistics learning theory. In Proceedings of the IEEE 5th International Conference on Intelligent Transportation Systems, pp. 727–730.
[4] Facebook Core Data Science Group (2019). Forecasting at Scale. Facebook. Python package version 0.5. https://facebook.github.io/prophet. Stan model file https://github.com/facebook/prophet/blob/master/R/inst/stan/prophet.stan.
[5] Gelman, A. and J. Hill (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.
[6] Ghosh, B., B. Basu, and M. O'Mahony (2007). Bayesian time-series model for short-term traffic flow forecasting. Journal of Transportation Engineering 133, 180–189.
[7] Gould, P. G., A. B. Koehler, J. K. Ord, R. D. Snyder, R. J. Hyndman, and F. Vahid-Araghi (2008). Forecasting time series with multiple seasonal patterns. European Journal of Operational Research 191, 207–222.
[8] Hoffman, M. D. and A. Gelman (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research 15, 1593–1623.
[9] Kung, K. S., K. Greco, S. Sobolevsky, and C. Ratti (2014). Exploring universal patterns in human home-work commuting from mobile phone data. PLoS One 9, e96180.
[10] Makridakis, S., R. J. Hyndman, and F. Petropoulos (2020). Forecasting in social settings: the state of the art. International Journal of Forecasting 36, 15–28.
[11] Papoutsis, P., S. Fennia, C. Bridon, and T. Duong (2020). Relaxing door-to-door matching reduces passenger waiting times: a workflow for the analysis of driver GPS traces in a stochastic carpooling service. Submitted.
[12] Ray, J.-B. (2014). Planning a real-time ridesharing network: critical mass and role of transfers. In Transport Research Arena (TRA) 5th Conference: Transport Solutions from Research to Deployment. IFSTTAR.
[13] Stephanedes, Y., P. G. Michalopoulos, and R. A. Plum (1980). Improved estimation of traffic flow for real-time control. Transportation Research Record 795, 28–39.
[14] Taylor, S. J. and B. Letham (2018). Forecasting at scale. The American Statistician 72, 37–45.