Behavior near walls in the mean-field approach to crowd dynamics
Alexander Aurell †, Boualem Djehiche ‡
March 9, 2020
Abstract
This paper introduces a system of stochastic differential equations (SDE) of mean-field type that models pedestrian motion. The system lets the pedestrians spend time at, and move along, walls by means of sticky boundaries and boundary diffusion. As an alternative to Neumann-type boundary conditions, sticky boundaries and boundary diffusion have a 'smoothing' effect on pedestrian motion. When these effects are active, the pedestrian paths are semimartingales with first-variation part absolutely continuous with respect to the Lebesgue measure $dt$, rather than an increasing process (which in general induces a measure singular with respect to $dt$) as is the case under Neumann boundary conditions. We show that the proposed mean-field model for pedestrian motion admits a unique weak solution and that it is possible to control the system in the weak sense, using a Pontryagin-type maximum principle. We also relate the mean-field type control problem to the social cost minimization in an interacting particle system. We study the novel model features numerically and we confirm empirical findings on pedestrian crowd motion in congested corridors.

MSC 2010: 49N90, 60H10, 60K35, 93E20
Keywords: pedestrian crowd modeling; mean-field type control; sticky boundary conditions; boundary diffusion
∗ Financial support from the Swedish Research Council (2016-04086) is gratefully acknowledged. We thank the anonymous reviewers for comments and suggestions that greatly helped to improve the presentation of the results.
† A. Aurell is with the Department of Mathematics, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden.
‡ B. Djehiche is with the Department of Mathematics, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden.

Models for pedestrian motion in confined domains must consider interaction with solid obstacles such as pillars and walls. The pedestrian response to a restriction of movement has been included in crowd models either as boundary conditions or as repulsive forces. To date, the Neumann condition and its variants (e.g. no-flux) have been especially popular among the boundary conditions. The Neumann condition suffers from a drawback related to its microscopic (pathwise) interpretation. A Neumann condition on the crowd density corresponds to pedestrian paths reflecting in the boundary. In reality, pedestrians do not bounce off walls in the manner of classical Newtonian particles; their movement is slowed down by the impact and a positive amount of time is needed to choose a new direction of motion. It is natural to think that whenever a pedestrian is forced (or decides) to make contact with a wall, she stays there for some time. During this time, she can move and interact with other pedestrians, before re-entering the interior of the domain.

Today there is more than one conventional approach to the mathematical modeling of pedestrian motion. This section aims to summarize how they incorporate the interaction between pedestrians and walls.

Microscopic force-based models, among which the social force model has gained the most attention, describe pedestrians as Newton-like particles.
From the initial work [30] and onward, the influence a wall has on the pedestrian is modelled as a repulsive force. The shape of the corresponding potential has been studied experimentally, for example in [40]. Cellular automata are another widely used microscopic approach to pedestrian crowd modeling. Walls are modeled as cells to which pedestrians cannot transition; the original work [37] already takes this viewpoint. In the continuum limits of cellular automata, as for example in [14, 13], boundary conditions are often set to no-flux conditions of the same type as (1) below.

The focus of macroscopic models is the global pedestrian density, either in a stationary or a dynamic regime. Inspired by fluid dynamics, [33] treats the crowd as a 'thinking fluid' that moves at maximum speed towards a target location while taking environmental factors into account, such as the congestion of the crowd. In this category of models, boundary conditions at impenetrable walls are most often implemented as Neumann conditions for the pedestrian density. The pathwise interpretation of a Neumann boundary condition is instantaneous reflection. A nonlocal projection of pedestrian velocity in the normal and tangential directions of the boundary, respectively, is suggested in [6] and implemented in [7], allowing for nonlocal interaction with boundaries.

Mean-field games and mean-field type control/games are macroscopic models of rational pedestrians with the ability to anticipate crowd movement, and adapt accordingly. These models can capture competition between individuals as well as crowd/sub-crowd cooperation. In the mean-field approach to pedestrian crowd modeling, pedestrian-to-pedestrian interaction is assumed to be symmetric and weak, and is thus plausibly replaced by an interaction with a mean field (typically a functional of the pedestrian density).
One of the most attractive features of the mean-field approach is that it connects the macroscopic (pedestrian density) and the microscopic (pedestrian path) points of view, typically through results on the near-optimality/equilibrium of mean-field optimal controls/equilibria. The connection permits us to infer individual pedestrian behavior from crowd density simulations, and vice versa. In what follows, the crowd density is denoted by $m$. In [38], the density is subjected to $n(x) \cdot \nabla m(t,x) = 0$ at walls, where $n(x)$ is the outward normal at $x$. Under this constraint, the normal velocity of the pedestrian is zero at any wall. Taking conservation of probability mass into account, [10] derives the following boundary condition
$$ -n(x) \cdot \big( \nabla m(t,x) - G(m) v(t,x) \big) = 0, \tag{1} $$
where $G(m)v$ is a general form of the pedestrian velocity. The constraint (1) represents reflection at the boundary since, in the corresponding microscopic interpretation, pedestrians make a classical Newtonian bounce whenever they hit the boundary. The same type of constraint is used in [2]. The case of several interacting populations in a bounded domain with reflecting boundaries has been studied in the stationary and dynamic case [17, 1, 5]. In these papers, the crowd density at walls is constrained by
$$ n(x) \cdot \big( \nabla m(t,x) + m(t,x)\, \partial_p H(x, \nabla u) \big) = 0. \tag{2} $$
The constraint is a reflection and the term $-\partial_p H(x, \nabla u)$ is the velocity of pedestrians that use the mean-field equilibrium strategy.

The sticky reflected Brownian motion was discovered by Feller [23, 24, 25]. He studied the infinitesimal generator of strong Markov processes on $[0, \infty)$ that behave like Brownian motion in $(0, \infty)$, and showed that it is possible for the process to be 'sticky' on the boundary, i.e. to sojourn at 0.
So 'sticky reflection' was appended to the list of boundary conditions for diffusions, which already included instantaneous reflection, absorption, and the elastic Robin condition. Wentzell [44] extended the result to more general domains.

Itô and McKean [34] constructed sample paths of the one-dimensional sticky reflected Brownian motion
$$ dX_t = 2\mu 1_{\{X_t = 0\}}\, dt + 1_{\{X_t > 0\}}\, dW_t, \quad \mu > 0, \tag{3} $$
whose infinitesimal generator is the one studied by Feller. Skorokhod conjectured that the sticky reflected Brownian motion has no strong solution. A proof that (3) has a unique weak solution can be found in for example [46, IV.7]. Chitashvili published the technical report [15] in 1989 claiming a proof of Skorokhod's conjecture. Around that time, the process was studied by several authors, e.g. [29, 26, 3, 47], to name a few. Warren [45] provided a proof of Skorokhod's conjecture in 1997 and in 2014 Engelbert and Peskir [22] published a proof useful for further generalizations. The fact that the system has no strong solution has consequences for how optimal control of the system can be approached, as we will see in this paper.

Building on [22], interacting particle systems of sticky reflected Brownian motions are considered in [27]. Interaction is introduced via a Girsanov transformation. See [27, Sect. 3.2] for the construction. Under assumptions on the 'shape' of the interaction and integrability of the Girsanov kernel, the interacting system is well-defined. Since the process no longer behaves like a Brownian motion in the interior of the domain, it is now referred to as a sticky reflected SDE. The boundary behavior is shown to be sticky in the sense that the process spends a ($dt$-)positive time on the boundary.

Sticky reflected SDEs with boundary diffusion are considered in [28]. The paths defined by such a system are allowed to move on the (sufficiently smooth) boundary $\partial\mathcal{D}$ of some bounded domain $\mathcal{D} \subset \mathbb{R}^d$.
Under smoothness conditions on $\partial\mathcal{D}$, the authors show that this type of SDE has a unique weak solution. Furthermore, an interacting system is studied, where interaction is introduced via a Girsanov transformation.

In this paper, the sticky reflected SDE with boundary diffusion of [28] is proposed as a model for pedestrian crowd motion in confined domains. We begin by considering a (non-transformed) sticky reflected SDE with boundary diffusion on $\mathcal{D}$, a non-empty bounded subset of $\mathbb{R}^d$ with $C^2$-smooth boundary $\Gamma := \partial\mathcal{D}$ (see Section 2.2, below) and outward normal $n$,
$$ dX_t = \big( 1_{\mathcal{D}}(X_t) + 1_\Gamma(X_t) \pi(X_t) \big)\, dB_t - 1_\Gamma(X_t) \frac{1}{2} \Big( \frac{1}{\gamma} + \kappa(X_t) \Big) n(X_t)\, dt, \tag{4} $$
where $\pi(X_t)$ is the projection onto the tangent space of $\Gamma$ at $X_t$, $\kappa(X_t)$ is the mean curvature of $\Gamma$ at $X_t$, and $\gamma$ is a positive constant representing the stickiness of $\Gamma$, cf. Remark 1 in Section 3 below. All relevant technical details can be found in Section 2. Equation (4) admits a unique weak solution $\mathbb{P}$, but no strong solution. To control an equation that admits only a weak solution is to control a probability measure on $(\Omega, \mathcal{F})$, under which the state process $X_\cdot := \{X_t\}_{t \in [0,T]}$ is interpreted as the coordinate process $X_t(\omega) = \omega(t)$. If all the admissible distributions of $X_\cdot$ are absolutely continuous with respect to the reference measure $\mathbb{P}$, then Girsanov's theorem can be used to implement the control. This corresponds to the case when the drift of (4) is controlled. In the controlled diffusion case, admissible measures are all singular with respect to $\mathbb{P}$ and to one another (for different controls), and the control problem is in fact a robustness problem over all admissible measures, which leads to the so-called second order backward SDE framework [42]. In this paper we treat the case with controlled drift; the controlled diffusion case will be treated elsewhere.
A mean-field dependent drift $\beta$ is introduced into the coordinate process through the Girsanov transformation
$$ \frac{d\mathbb{P}^u}{d\mathbb{P}} \Big|_{\mathcal{F}_t} = L^u_t := \mathcal{E}_t \Big( \int_0^\cdot \beta(s, X_\cdot, \mathbb{P}^u(s), u_s)^* \, dB_s \Big), \tag{5} $$
where $\mathbb{P}^u(t) := \mathbb{P}^u \circ X_t^{-1}$ is the marginal distribution of $X_t$ under $\mathbb{P}^u$, $\beta^*$ denotes the transpose of $\beta$, and $\mathcal{E}$ is the Doléans-Dade exponential defined for a continuous local martingale $M$ as
$$ \mathcal{E}_t(M) := \exp \Big( M_t - \frac{1}{2} \langle M \rangle_t \Big). \tag{6} $$
The path of a typical pedestrian in the interacting crowd is then (under $\mathbb{P}^u$) described by
$$ \begin{aligned} dX_t &= 1_{\mathcal{D}}(X_t) \big( \beta(t, X_\cdot, \mathbb{P}^u(t), u_t)\, dt + dB^u_t \big) \\ &\quad + 1_\Gamma(X_t) \Big( \pi(X_t) \beta(t, X_\cdot, \mathbb{P}^u(t), u_t) - \frac{n(X_t)}{2\gamma} \Big) dt + 1_\Gamma(X_t)\, dB^{\Gamma,u}_t, \\ dB^{\Gamma,u}_t &= \pi(X_t)\, dB^u_t - \frac{1}{2} \kappa(X_t) n(X_t)\, dt, \end{aligned} \tag{7} $$
where $B^u$ is a $\mathbb{P}^u$-Brownian motion. We provide a proof of the existence of the controlled probability measure $\mathbb{P}^u$ based on a fixed-point argument involving the total variation distance (cf. [20]).

Pedestrians are assumed to be cooperating and controlled by a rational central planner. The central planner represents an authority that gives directions to the crowd through signs, mobile devices, or security personnel, and the crowd follows the instructions. This setup has been used to study evacuation in for example [11, 12, 21]. For a discussion on the goals, the degrees of cooperation, and the information structure in a pedestrian crowd, see [18]. The central planner's goal is to minimize the finite-horizon cost functional
$$ J(u) := \mathbb{E}^u \Big[ \int_0^T f(t, X_\cdot, \mathbb{P}^u(t), u_t)\, dt + g(X_T, \mathbb{P}^u(T)) \Big], \tag{8} $$
where $f$ is the instantaneous cost and $g$ is the terminal cost (see Section 4 for conditions on the functions $f$ and $g$).
The minimization of (8) subject to (7) is equivalent to the following mean-field type control problem, stated in the strong sense in the original probability space with measure $\mathbb{P}$,
$$ \begin{aligned} &\inf_{u \in \mathcal{U}} \mathbb{E} \Big[ \int_0^T L^u_t f(t, X_\cdot, \mathbb{P}^u(t), u_t)\, dt + L^u_T g(X_T, \mathbb{P}^u(T)) \Big], \\ &\text{s.t. } dL^u_t = L^u_t \beta(t, X_\cdot, \mathbb{P}^u(t), u_t)^* \, dB_t, \quad L^u_0 = 1. \end{aligned} \tag{9} $$
The validity of (9) is justified in Section 4 below. Problem (9) is nowadays a standard mean-field type control problem and a stochastic maximum principle yielding necessary conditions for an optimal control can be found in [9]. Solving the general problem (9) with a Pontryagin-type maximum principle poses some practical difficulties, the main one being the necessity of a second order adjoint process. However, most difficulties can be tackled by imposing assumptions plausible for the application in pedestrian crowd motion. With the aim to replicate the pedestrian behavior observed in the empirical studies [49] and [50], we consider here a special case of (9) where $u_t$ takes values in a convex set and $\mathbb{P}^u(t)$ is replaced by $\mathbb{E}^u[r(X_t)]$, where the function $r : \mathbb{R}^d \to \mathbb{R}^d$ can be different for each of the coefficients involved.

The main contribution of this paper is a new approach to boundary conditions in pedestrian crowd modeling. Sticky reflected SDEs of mean-field type with boundary diffusion are proposed as an alternative to reflected SDEs of mean-field type to model pedestrian paths in optimal-control based models. Sticky boundaries and boundary diffusion allow the pedestrian to spend time at, and move along, the boundary (walls, pillars, etc.), in contrast to reflected SDE-based models where pedestrians are immediately reflected. Existence and uniqueness of the mean-field type version of the sticky reflected SDE with boundary diffusion is treated. The model can be optimally controlled (in the weak sense) and a Pontryagin-type stochastic maximum principle is applied to derive necessary optimality conditions.
Furthermore, the mean-field type control problem has a microscopic interpretation in the form of a system of interacting sticky reflected SDEs with boundary diffusion. The new features of sticky boundaries and boundary diffusion yield more flexibility when modeling pedestrian behavior at boundaries. A scenario of unidirectional pedestrian flow in a long narrow corridor is studied numerically to highlight these novel characteristics and to replicate experimental findings as a first step in model validation.

The rest of the paper is organized as follows. Section 2 defines notation and summarizes relevant background theory. Section 3 introduces sticky reflected SDEs of mean-field type with boundary diffusion. Conditions under which the equation has a unique weak solution are presented. In Section 4 the finite horizon optimal control of the state equation introduced in Section 3 is considered. In the uncontrolled case, the convergence of an interacting (non-mean-field) particle system to the sticky reflected SDE of mean-field type is proved. Finally, Section 5 presents analytic examples and numerical results based on the particle system approximation concerning unidirectional flow in a long narrow corridor.
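As an illustrative aside, the change of measure in (5)-(6) can be sketched numerically in the simplest possible setting. The sketch below is not the model of this paper: it assumes a scalar Brownian motion with no boundary and a constant kernel $\beta$, so that $L_T = \exp(\beta B_T - \beta^2 T/2)$. All function names and parameter values are hypothetical. It estimates $\mathbb{E}[L_T]$, which should be close to 1, and $\mathbb{E}[L_T B_T]$, which should be close to the drift $\beta T$ that $B$ acquires under the transformed measure.

```python
import math
import random


def doleans_dade(beta, increments, dt):
    """E_T(M) = exp(M_T - <M>_T / 2) for M = int beta dB, with constant scalar beta."""
    m_T = beta * sum(increments)                    # M_T = beta * B_T
    bracket_T = beta ** 2 * dt * len(increments)    # <M>_T = beta^2 * T
    return math.exp(m_T - 0.5 * bracket_T)


def estimate_girsanov_moments(beta=0.5, horizon=1.0, steps=10,
                              n_paths=40000, seed=0):
    """Monte Carlo estimates of E[L_T] and E[L_T * B_T] under the reference measure."""
    rng = random.Random(seed)
    dt = horizon / steps
    sd = math.sqrt(dt)
    sum_L = 0.0
    sum_LB = 0.0
    for _ in range(n_paths):
        # Brownian increments under the reference measure P.
        increments = [rng.gauss(0.0, sd) for _ in range(steps)]
        L = doleans_dade(beta, increments, dt)
        sum_L += L
        sum_LB += L * sum(increments)
    return sum_L / n_paths, sum_LB / n_paths


if __name__ == "__main__":
    mean_L, mean_LB = estimate_girsanov_moments()
    print("E[L_T]     ~", round(mean_L, 3), "(theory: 1)")
    print("E[L_T B_T] ~", round(mean_LB, 3), "(theory: beta * T = 0.5)")
```

Reweighting by $L_T$ is what allows the controlled drift to be introduced without solving the state equation pathwise, which matters here because (4) has no strong solution.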
The domain $\mathcal{D}$ is a non-empty bounded subset of $\mathbb{R}^d$ with $C^2$-smooth boundary $\Gamma := \partial\mathcal{D}$. The closure of $\mathcal{D}$ is denoted $\bar{\mathcal{D}}$. The Euclidean norm is denoted $|\cdot|$. A finite time horizon $T > 0$ is fixed, processes are written $X_\cdot := \{X_t\}_{t \in [0,T]}$, and $C$ is a generic positive constant.

2.1 The coordinate process and probability metrics

Let $(\mathcal{X}, d)$ be a metric space. The set of Borel probability measures on $\mathcal{X}$ is denoted by $\mathcal{P}(\mathcal{X})$. By $\mathcal{P}_p(\mathcal{X}) \subset \mathcal{P}(\mathcal{X})$ we denote the set of all $\mu \in \mathcal{P}(\mathcal{X})$ such that $(\|\mu\|_p)^p := \int_{\mathcal{X}} d(y_0, y)^p \mu(dy) < \infty$ for an arbitrary $y_0 \in \mathcal{X}$.

Let $\Omega := C([0,T]; \mathbb{R}^d)$ be endowed with the metric $|\omega|_T := \sup_{t \in [0,T]} |\omega(t)|$ for $\omega \in \Omega$. Denote by $\mathcal{F}$ the Borel $\sigma$-field over $\Omega$. Given $t \in [0,T]$ and $\omega \in \Omega$, put $X_t(\omega) = \omega(t)$ and denote by $\mathcal{F}_t := \sigma(X_s; s \le t)$ the filtration generated by $X_\cdot$. $X_\cdot$ is the so-called coordinate process. For any $P \in \mathcal{P}(\Omega)$ (the set of Borel probability measures on $\Omega$) we denote by $\mathbb{F}^P := (\mathcal{F}^P_t; t \in [0,T])$ the completion of $\mathbb{F} := (\mathcal{F}_t; t \in [0,T])$ with the $P$-null sets of $\Omega$.

Let $\mu, \nu \in \mathcal{P}(\mathbb{R}^d)$ and let $\mathcal{B}(\mathbb{R}^d)$ be the Borel $\sigma$-algebra on $\mathbb{R}^d$. The total variation metric on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ is
$$ d_{TV}(\mu, \nu) := 2 \sup_{A \in \mathcal{B}(\mathbb{R}^d)} |\mu(A) - \nu(A)|. \tag{10} $$
On the filtration $\mathbb{F}^P$, where $P \in \mathcal{P}(\Omega)$, the total variation metric between $m, m' \in \mathcal{P}(\Omega)$ is
$$ D_t(m, m') := 2 \sup_{A \in \mathcal{F}^P_t} |m(A) - m'(A)|, \quad 0 \le t \le T, \tag{11} $$
and satisfies $D_s(m, m') \le D_t(m, m')$ for $0 \le s \le t$. Consider the coordinate process $X_\cdot$; then for $m, m' \in \mathcal{P}(\Omega)$,
$$ d_{TV}\big( m \circ X_t^{-1}, m' \circ X_t^{-1} \big) \le D_t(m, m'), \quad 0 \le t \le T. \tag{12} $$
Endowed with the metric $D_T$, $\mathcal{P}(\Omega)$ is a complete metric space. The total variation metric is connected to the Kullback-Leibler divergence through the Csiszár-Kullback-Pinsker inequality,
$$ D_t^2(m, m') \le 2\, \mathbb{E}_m[\log(dm/dm')], \tag{13} $$
where $\mathbb{E}_m$ denotes expectation with respect to $m$.
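For intuition, the total variation metric (10) and the Csiszár-Kullback-Pinsker inequality can be checked on a finite state space, where $2 \sup_A |\mu(A) - \nu(A)|$ reduces to the $\ell^1$ distance between the probability vectors. The sketch below assumes the standard form of the inequality under this normalization, $d_{TV}^2 \le 2\, \mathrm{KL}$. The function names and the two example distributions are hypothetical.

```python
import math


def total_variation(mu, nu):
    """d_TV = 2 sup_A |mu(A) - nu(A)|; on a finite set this is the l1 distance."""
    return sum(abs(p - q) for p, q in zip(mu, nu))


def kl_divergence(mu, nu):
    """E_mu[log(dmu/dnu)], assuming nu > 0 wherever mu > 0."""
    return sum(p * math.log(p / q) for p, q in zip(mu, nu) if p > 0.0)


if __name__ == "__main__":
    mu = [0.5, 0.3, 0.2]
    nu = [0.4, 0.4, 0.2]
    tv = total_variation(mu, nu)
    kl = kl_divergence(mu, nu)
    print("d_TV^2 =", tv ** 2, " 2*KL =", 2.0 * kl)
    # Pinsker-type bound: squared total variation is controlled by twice the KL divergence.
    print("inequality holds:", tv ** 2 <= 2.0 * kl)
```

The supremum in (10) is attained at the set where $\mu$ exceeds $\nu$, which is why the $\ell^1$ sum recovers it exactly in the discrete case.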
In this subsection we introduce the boundary diffusion $B^\Gamma$ and review the necessary parts of the background theory presented in [28, Sect. 2].

Definition 1. $\Gamma$ is Lipschitz continuous (resp. $C^k$-smooth) if for every $x \in \Gamma$ there exists a neighborhood $V \subset \mathbb{R}^d$ of $x$ such that $\Gamma \cap V$ is the graph of a Lipschitz continuous (resp. $C^k$-smooth) function and $\mathcal{D} \cap V$ is located on one side of the graph, i.e., there exist new orthogonal coordinates $(y_1, \ldots, y_d)$ given by an orthogonal map $T$, a reference point $z \in \mathbb{R}^{d-1}$, real numbers $r, h > 0$, and a Lipschitz continuous (resp. $C^k$-smooth) function $\varphi : \mathbb{R}^{d-1} \to \mathbb{R}$ such that
(i) $V = \{ y \in \mathbb{R}^d : |y_{-d} - z| < r,\ |y_d - \varphi(y_{-d})| < h \}$
(ii) $\mathcal{D} \cap V = \{ y \in V : -h < y_d - \varphi(y_{-d}) < 0 \}$
(iii) $\Gamma \cap V = \{ y \in V : y_d = \varphi(y_{-d}) \}$

Definition 2.
For $y \in V$, let
$$ \tilde{n}(y) := \frac{(-\nabla\varphi(y_{-d}), 1)}{\sqrt{|\nabla\varphi(y_{-d})|^2 + 1}}. \tag{14} $$
Let $x \in \Gamma$ and $T \in \mathbb{R}^{d \times d}$ be the orthogonal transformation from Definition 1. Then the outward normal vector at $x$ is defined by $n(x) := T^{-1} \tilde{n}(Tx)$.

Definition 3.
Let $x \in \Gamma$ and $\pi(x) := E - n(x) n(x)^* \in \mathbb{R}^{d \times d}$, where $E$ is the identity matrix. $\pi(x)$ is the orthogonal projection on the tangent space at $x$.

Note that for $z \in \mathbb{R}^d$, $\pi(x) z = z - (n(x), z)\, n(x)$.

Definition 4.
Let $f \in C^1(\bar{\mathcal{D}})$ and $x \in \Gamma$. Whenever $\Gamma$ is sufficiently smooth at $x$, $\nabla_\Gamma f(x) := \pi(x) \nabla f(x)$ and, if $f \in C^2(\bar{\mathcal{D}})$, $\Delta_\Gamma f(x) := \mathrm{Tr}(\nabla_\Gamma^2 f(x))$. If $n$ is differentiable at $x$, the mean curvature of $\Gamma$ at $x$ is
$$ \kappa(x) := \mathrm{div}_\Gamma n(x) = (\pi(x) \nabla) \cdot n(x). \tag{15} $$

In [28] it is noted that whenever $\Gamma$ is $C^2$-smooth,
$$ (\pi\nabla)^* \pi = -\kappa n. \tag{16} $$
A Brownian motion $B^\Gamma_\cdot$ on a smooth boundary $\Gamma$ is a $\Gamma$-valued stochastic process generated by $\frac{1}{2}\Delta_\Gamma$. This is in analogy with the standard Brownian motion on $\mathbb{R}^d$, in the sense that $B^\Gamma_\cdot$ solves the martingale problem for $(\frac{1}{2}\Delta_\Gamma, C^\infty(\Gamma))$. A solution to the Stratonovich SDE
$$ dB^\Gamma_t = \pi(B^\Gamma_t) \circ dB_t, \tag{17} $$
where $B_\cdot$ is a standard Brownian motion on $\mathbb{R}^d$, is a Brownian motion on $\Gamma$ [32, Chap. 3, Sect. 2]. By the Itô-Stratonovich transformation rule and (16), the Brownian motion on $\Gamma$ solves
$$ dB^\Gamma_t = -\frac{1}{2} \kappa(B^\Gamma_t) n(B^\Gamma_t)\, dt + \pi(B^\Gamma_t)\, dB_t. \tag{18} $$

In this section we provide conditions for the existence and uniqueness of a weak solution to the sticky reflected SDE of mean-field type with boundary diffusion. Consider the sticky reflected SDE with boundary diffusion,
$$ \begin{aligned} dX_t &= -1_\Gamma(X_t) \frac{1}{2} \Big( \frac{1}{\gamma} + \kappa(X_t) \Big) n(X_t)\, dt + \big( 1_{\mathcal{D}}(X_t) + 1_\Gamma(X_t) \pi(X_t) \big)\, dB_t, \\ X_0 &= x \in \bar{\mathcal{D}}, \end{aligned} \tag{19} $$
which from now on will be written in short-hand notation as
$$ dX_t = a(X_t)\, dt + \sigma(X_t)\, dB_t, \tag{20} $$
where $a : [0,T] \times \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma : [0,T] \times \mathbb{R}^d \to \mathbb{R}^{d \times d}$ are bounded functions over $[0,T] \times \bar{\mathcal{D}}$, defined as
$$ a(x) := -1_\Gamma(x) \frac{1}{2} \Big( \frac{1}{\gamma} + \kappa(x) \Big) n(x), \qquad \sigma(x) := 1_{\mathcal{D}}(x) + 1_\Gamma(x) \pi(x). \tag{21} $$
By [28, Thm 3.9 & 3.17], (19) has a unique weak solution, i.e. there is a unique probability measure $\mathbb{P}$ on $(\Omega, \mathcal{F})$ that solves the corresponding martingale problem (cf. [35, Thm 18.7]), and the solution $X_\cdot$ is $C([0,T]; \bar{\mathcal{D}})$-valued $\mathbb{P}$-a.s. The result [28, Thm 3.9] relies on some conditions; let us verify them for the sake of completeness. The weight functions $\alpha$ and $\beta$, introduced on [28, p. 6], are in (19) set to be everywhere constant and positive such that $\alpha/\beta = 1/\gamma$ (cf. Remark 1, below). Condition 3.12 of [28] therefore holds: $\partial\mathcal{D}$ is $C^2$ and the constant positive weight functions have the required regularity. This justifies the use of [28, Thm 3.9], and no further conditions are required for [28, Thm 3.17]. To simplify notation, from now on throughout the rest of this paper let $\mathbb{F}$ denote the completion of $\mathbb{F}$ with the $\mathbb{P}$-null sets of $\Omega$, i.e. $\mathbb{F} = (\mathcal{F}_t; t \ge 0) := \mathbb{F}^{\mathbb{P}}$.

Remark 1.
The coordinate process is composed of three essential parts:
• Interior diffusion $1_{\mathcal{D}}(X_t)\, dB_t$;
• Boundary diffusion $1_\Gamma(X_t) \big( \pi(X_t)\, dB_t - \frac{1}{2} (\kappa n)(X_t)\, dt \big) = 1_\Gamma(X_t)\, dB^\Gamma_t$;
• Normal sticky reflection $-1_\Gamma(X_t) \frac{1}{2\gamma} n(X_t)\, dt$.
The constant $\gamma$ is connected to the level of stickiness of the boundary $\Gamma$. It is related to the invariant distribution of the coordinate process' $\mathbb{R}^d$-valued time marginal. Let $\lambda$ and $s$ denote the Lebesgue measure on $\mathbb{R}^d$ and the surface measure on $\Gamma$, respectively. Consider the measure $\rho := 1_{\mathcal{D}} \alpha \lambda + 1_\Gamma \alpha' s$, $\alpha, \alpha' \in \mathbb{R}$. By choosing $\alpha = \bar\alpha / \lambda(\mathcal{D})$ and $\alpha' = (1 - \bar\alpha)/s(\Gamma)$, $\bar\alpha \in [0,1]$, $\rho$ becomes a probability measure on $\mathbb{R}^d$ with support in $\bar{\mathcal{D}}$ and $\rho$ is in fact the invariant distribution of (19) whenever
$$ \frac{1}{\gamma} = \frac{\bar\alpha}{(1 - \bar\alpha)} \frac{s(\Gamma)}{\lambda(\mathcal{D})}. \tag{22} $$
Hence $\bar\alpha \to 1$ as $\gamma \to 0$ and the invariant distribution of (19) concentrates on the interior $\mathcal{D}$. But as $\gamma \to \infty$, it concentrates on the boundary $\Gamma$. We say that the more probability mass $\rho$ locates on $\Gamma$, the stickier $\Gamma$ is.

Next, we introduce mean-field interactions and a control process in (19) through a Girsanov transformation.
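To make the coefficients in (21) and the role of $\gamma$ in Remark 1 concrete, the following sketch evaluates them for one specific domain, the unit disc in $\mathbb{R}^2$. This choice is an assumption made only for illustration: there $n(x) = x/|x|$ and the mean curvature of the boundary circle is $\kappa = 1$. The interior mass $\bar\alpha$ is obtained by solving $1/\gamma = \alpha/\alpha'$, with $\alpha$ and $\alpha'$ as in Remark 1, for $\bar\alpha$. All function names are hypothetical.

```python
import math


def normal(x):
    """Outward unit normal n(x) = x/|x| of the unit disc (assumes x != 0)."""
    r = math.hypot(x[0], x[1])
    return (x[0] / r, x[1] / r)


def on_boundary(x, tol=1e-12):
    """True if x lies on the unit circle, up to a numerical tolerance."""
    return abs(math.hypot(x[0], x[1]) - 1.0) <= tol


def projection(x):
    """Tangential projection pi(x) = E - n(x) n(x)^* as a 2x2 nested list."""
    n = normal(x)
    return [[(1.0 if i == j else 0.0) - n[i] * n[j] for j in range(2)]
            for i in range(2)]


def drift(x, gamma):
    """a(x): zero in the interior, -(1/2)(1/gamma + kappa) n(x) on the boundary."""
    if not on_boundary(x):
        return (0.0, 0.0)
    kappa = 1.0  # mean curvature of the unit circle (assumption of this example)
    n = normal(x)
    c = -0.5 * (1.0 / gamma + kappa)
    return (c * n[0], c * n[1])


def diffusion(x):
    """sigma(x): identity in the interior, tangential projection on the boundary."""
    if on_boundary(x):
        return projection(x)
    return [[1.0, 0.0], [0.0, 1.0]]


def interior_mass(gamma, volume, surface):
    """alpha_bar solving 1/gamma = alpha_bar * s(Gamma) / ((1 - alpha_bar) * lambda(D))."""
    return volume / (volume + gamma * surface)


if __name__ == "__main__":
    x = (1.0, 0.0)
    print("sigma at boundary:", diffusion(x))
    print("a at boundary, gamma = 0.5:", drift(x, 0.5))
    for gamma in (0.01, 0.5, 100.0):
        print("gamma =", gamma, "-> interior mass",
              interior_mass(gamma, math.pi, 2.0 * math.pi))
```

Small $\gamma$ gives a strong inward push at the wall and an invariant law concentrated in the interior, while large $\gamma$ shifts mass to the boundary, in line with the last sentences of Remark 1.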
Definition 5.
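Let the set of control values $U$ be a subset of $\mathbb{R}^d$. The set of admissible controls is
$$ \mathcal{U} := \{ u : [0,T] \times \Omega \to U : u \text{ is } \mathbb{F}\text{-progressively measurable} \}. \tag{23} $$

Let $\mathbb{Q}(t) := \mathbb{Q} \circ X_t^{-1}$ denote the $t$-marginal distribution of the coordinate process under $\mathbb{Q} \in \mathcal{P}(\Omega)$. Let $\beta$ be a measurable function from $[0,T] \times \Omega \times \mathcal{P}(\mathbb{R}^d) \times U$ into $\mathbb{R}^d$ such that the following assumptions hold.

Assumption 1.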
For every $\mathbb{Q} \in \mathcal{P}(\Omega)$ and $u \in \mathcal{U}$, $(\beta(t, X_\cdot, \mathbb{Q}(t), u_t))_{t \in [0,T]}$ is progressively measurable with respect to $\mathbb{F}$, the completion of the filtration generated by the coordinate process with the $\mathbb{P}$-null sets of $\Omega$.

Assumption 2.
For every $t \in [0,T]$, $\omega \in \Omega$, $u \in U$, and $\mu \in \mathcal{P}(\mathbb{R}^d)$,
$$ |\beta(t, \omega, \mu, u)| \le C \Big( 1 + |\omega|_T + \int_{\mathbb{R}^d} |y| \mu(dy) \Big). \tag{24} $$

Assumption 3.
For every $t \in [0,T]$, $\omega \in \Omega$, $u \in U$, and $\mu, \mu' \in \mathcal{P}(\mathbb{R}^d)$,
$$ |\beta(t, \omega, \mu, u) - \beta(t, \omega, \mu', u)| \le C d_{TV}(\mu, \mu'). \tag{25} $$

Given $\mathbb{Q} \in \mathcal{P}(\Omega)$ and $u \in \mathcal{U}$, let
$$ L^{u,\mathbb{Q}}_t := \mathcal{E}_t \Big( \int_0^\cdot \beta(s, X_\cdot, \mathbb{Q}(s), u_s)\, dB_s \Big), \tag{26} $$
where $\mathcal{E}$ is the Doléans-Dade exponential (cf. (6)).

Lemma 1.
The positive measure $\mathbb{P}^{u,\mathbb{Q}}$ defined by $d\mathbb{P}^{u,\mathbb{Q}} = L^{u,\mathbb{Q}}_t d\mathbb{P}$ on $\mathcal{F}_t$ for all $t \in [0,T]$ is well-defined and is a probability measure on $\Omega$. Moreover, $\mathbb{P}^{u,\mathbb{Q}} \in \mathcal{P}_p(\Omega)$ for all $p \in [1, \infty)$ and under $\mathbb{P}^{u,\mathbb{Q}}$ the coordinate process satisfies
$$ X_t = x + \int_0^t \big( \sigma(X_s) \beta(s, X_\cdot, \mathbb{Q}(s), u_s) + a(X_s) \big)\, ds + \int_0^t \sigma(X_s)\, dB^{\mathbb{Q}}_s, \tag{27} $$
where $B^{\mathbb{Q}}$ is a standard $\mathbb{P}^{u,\mathbb{Q}}$-Brownian motion.

Proof. Assume that $\phi_\cdot$ is a process such that $\mathbb{P}^\phi$, defined by $d\mathbb{P}^\phi = L^\phi_t d\mathbb{P}$ on $\mathcal{F}_t$ where $L^\phi_t := \mathcal{E}_t(\int_0^\cdot \phi_s\, dB_s)$, is a probability measure on $\Omega$. By Girsanov's theorem, the coordinate process under $\mathbb{P}^\phi$ satisfies
$$ dX_t = \big( \sigma(X_t) \phi_t + a(X_t) \big)\, dt + \sigma(X_t)\, dB^\phi_t, \tag{28} $$
where $B^\phi_\cdot$ is a $\mathbb{P}^\phi$-Brownian motion. $C^2$-smoothness of the boundary $\Gamma$ grants a bounded orthogonal projection on $\Gamma$'s tangent space and a bounded mean curvature of $\Gamma$. By the Burkholder-Davis-Gundy inequality we have for $1 \le p < \infty$
$$ \begin{aligned} \mathbb{E}^\phi\big[ |X|_T^p \big] &\le \mathbb{E}^\phi \Big[ C \Big( |X_0|^p + \int_0^T |\sigma(X_s) \phi_s|^p\, ds + \int_0^T |a(X_s)|^p\, ds + \Big| \int_0^\cdot \sigma(X_s)\, dB^\phi_s \Big|_T^p \Big) \Big] \\ &\le C \Big( 1 + \int_0^T \mathbb{E}^\phi\big[ |\phi_s|^p \big]\, ds \Big), \end{aligned} \tag{29} $$
where $\mathbb{E}^\phi$ denotes expectation taken under $\mathbb{P}^\phi$. By Assumption 3 it holds for every $t \in [0,T]$, $\omega \in \Omega$, $\mu \in \mathcal{P}(\mathbb{R}^d)$, and $u \in U$ that
$$ |\beta(t, \omega, \mu, u)| \le C \big( d_{TV}(\mu, \mathbb{P}(t)) + |\beta(t, \omega, \mathbb{P}(t), u)| \big). \tag{30} $$
In view of (30), Assumptions 2 and 3, and the fact that the total variation distance between two probability measures is uniformly bounded, we have for all $t \in [0,T]$,
$$ \begin{aligned} |\beta(t, X_\cdot, \mathbb{Q}(t), u_t)| &\le C \big( d_{TV}(\mathbb{Q}(t), \mathbb{P}(t)) + |\beta(t, X_\cdot, \mathbb{P}(t), u_t)| \big) \\ &\le C \Big( 1 + |X|_T + \int_{\mathbb{R}^d} |y|\, \mathbb{P}(t)(dy) \Big) \\ &\le C \big( 1 + \sup\{ |y| : y \in \bar{\mathcal{D}} \} \big) =: \bar{C} < \infty, \quad \mathbb{P}\text{-a.s.} \end{aligned} \tag{31} $$
The third inequality of (31) holds $\mathbb{P}$-a.s. since under $\mathbb{P}$, $X_\cdot \in C([0,T]; \bar{\mathcal{D}})$ almost surely.
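We note that (31) implies that Novikov's condition is satisfied,
$$ \mathbb{E} \Big[ \exp \Big( \frac{1}{2} \int_0^T \sup_{s \in [0,T]} |\beta(s, X_\cdot, \mathbb{Q}(s), u_s)|^2\, dt \Big) \Big] \le \mathbb{E} \Big[ \exp \Big( \frac{1}{2} T \bar{C}^2 \Big) \Big] < \infty, \tag{32} $$
where $\mathbb{E}$ denotes expectation with respect to $\mathbb{P}$. Hence the Doléans-Dade exponential defined in (26) is an $(\mathcal{F}_t, \mathbb{P})$-martingale and $\mathbb{P}^{u,\mathbb{Q}}$ is indeed a probability measure, i.e. $\mathbb{P}^{u,\mathbb{Q}} \in \mathcal{P}(\Omega)$. To show that $\mathbb{P}^{u,\mathbb{Q}} \in \mathcal{P}_p(\Omega)$ for any $p \in [1, \infty)$, we simply note that
$$ \begin{aligned} \mathbb{E}^{u,\mathbb{Q}}\big[ |X|_T^p \big] &= \mathbb{E}^{u,\mathbb{Q}} \Big[ |X|_T^p \big( 1_{\{X_\cdot \in C([0,T]; \bar{\mathcal{D}})\}} + 1_{\{X_\cdot \notin C([0,T]; \bar{\mathcal{D}})\}} \big) \Big] \\ &= \mathbb{E} \Big[ L^{u,\mathbb{Q}}_T |X|_T^p \big( 1_{\{X_\cdot \in C([0,T]; \bar{\mathcal{D}})\}} + 1_{\{X_\cdot \notin C([0,T]; \bar{\mathcal{D}})\}} \big) \Big] \\ &\le \sup\{ |y|^p : y \in \bar{\mathcal{D}} \}\, \mathbb{E} \Big[ L^{u,\mathbb{Q}}_T 1_{\{X_\cdot \in C([0,T]; \bar{\mathcal{D}})\}} \Big] = \sup\{ |y|^p : y \in \bar{\mathcal{D}} \}. \end{aligned} \tag{33} $$
Finally, by Girsanov's theorem the coordinate process under $\mathbb{P}^{u,\mathbb{Q}}$ satisfies (27).

For a given $u \in \mathcal{U}$, consider the map
$$ \Phi_u : \mathcal{P}(\Omega) \ni \mathbb{Q} \mapsto \mathbb{P}^{u,\mathbb{Q}} \in \mathcal{P}(\Omega), \tag{34} $$
such that $d\mathbb{P}^{u,\mathbb{Q}} = L^{u,\mathbb{Q}}_t d\mathbb{P}$ on $\mathcal{F}_t$, where $L^{u,\mathbb{Q}}$ is given by (26).

Proposition 1.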
The map $\Phi_u$ is well-defined and admits a unique fixed point for all $u \in \mathcal{U}$. Moreover, for every $p \in [1, \infty)$ the fixed point, denoted $\mathbb{P}^u$, belongs to $\mathcal{P}_p(\Omega)$. In particular,
$$ \mathbb{E}^u\big[ |X|_T^p \big] \le \sup_{y \in \bar{\mathcal{D}}} |y|^p, \tag{35} $$
where $\mathbb{E}^u$ denotes expectation with respect to $\mathbb{P}^u$.

Proof. By Lemma 1, the mapping is well defined. We first show the contraction property of the map $\Phi_u$ in the complete metric space $\mathcal{P}(\Omega)$, endowed with the total variation distance $D_T$. The proof is an adaptation of the proof of [16, Thm. 8]. For each $t \in [0,T]$, let $\beta^{\mathbb{Q}}_t := \beta(t, X_\cdot, \mathbb{Q}(t), u_t)$. Given $\mathbb{Q}, \tilde{\mathbb{Q}} \in \mathcal{P}(\Omega)$, the Csiszár-Kullback-Pinsker inequality (13) and the fact that $\int_0^\cdot (dB_s - \beta^{\mathbb{Q}}_s\, ds)$ is a martingale under $\Phi_u(\mathbb{Q}) = \mathbb{P}^{u,\mathbb{Q}}$ yield
$$ \begin{aligned} D_T^2\big( \Phi_u(\mathbb{Q}), \Phi_u(\tilde{\mathbb{Q}}) \big) &\le 2\, \mathbb{E}^{u,\mathbb{Q}} \Big[ \log \big( L^{u,\mathbb{Q}}_T / L^{u,\tilde{\mathbb{Q}}}_T \big) \Big] \\ &= 2\, \mathbb{E}^{u,\mathbb{Q}} \Big[ \int_0^T \big( \beta^{\mathbb{Q}}_s - \beta^{\tilde{\mathbb{Q}}}_s \big)\, dB_s - \frac{1}{2} \int_0^T \big( \beta^{\mathbb{Q}}_s \big)^2 - \big( \beta^{\tilde{\mathbb{Q}}}_s \big)^2\, ds \Big] \\ &= 2\, \mathbb{E}^{u,\mathbb{Q}} \Big[ \int_0^T \big( \beta^{\mathbb{Q}}_s - \beta^{\tilde{\mathbb{Q}}}_s \big) \beta^{\mathbb{Q}}_s - \frac{1}{2} \big( \beta^{\mathbb{Q}}_s \big)^2 + \frac{1}{2} \big( \beta^{\tilde{\mathbb{Q}}}_s \big)^2\, ds \Big] \\ &= \int_0^T \mathbb{E}^{u,\mathbb{Q}} \Big[ \big( \beta^{\mathbb{Q}}_s - \beta^{\tilde{\mathbb{Q}}}_s \big)^2 \Big]\, ds \\ &\le C \int_0^T d_{TV}^2\big( \mathbb{Q}(s), \tilde{\mathbb{Q}}(s) \big)\, ds \le C \int_0^T D_s^2\big( \mathbb{Q}, \tilde{\mathbb{Q}} \big)\, ds. \end{aligned} \tag{36} $$
Iterating the inequality, we obtain for every $N \in \mathbb{N}$,
$$ D_T^2\big( \Phi_u^N(\mathbb{Q}), \Phi_u^N(\tilde{\mathbb{Q}}) \big) \le \frac{C^N T^N}{N!} D_T^2\big( \mathbb{Q}, \tilde{\mathbb{Q}} \big), \tag{37} $$
where $\Phi_u^N$ denotes the $N$-fold composition of $\Phi_u$. Hence $\Phi_u^N$ is a contraction for $N$ large enough, thus admitting a unique fixed point, which is also the unique fixed point for $\Phi_u$. Under $\mathbb{P}^u$, the fixed point of $\Phi_u$, the coordinate process satisfies
$$ dX_t = \big( \sigma(X_t) \beta(t, X_\cdot, \mathbb{P}^u(t), u_t) + a(X_t) \big)\, dt + \sigma(X_t)\, dB^u_t, \tag{38} $$
where $B^u$ is a $\mathbb{P}^u$-Brownian motion.
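The step from (37) to the contraction property rests on the factorial eventually dominating the power, so that $C^N T^N / N! < 1$ for $N$ large enough, whatever the size of $C$ and $T$. A small numerical sketch, with hypothetical function names and arbitrary example values $C = 2$, $T = 3$ that are not taken from the paper:

```python
import math


def contraction_factor(C, T, N):
    """The constant (C*T)^N / N! multiplying the squared distance in the iterated bound."""
    return (C * T) ** N / math.factorial(N)


def first_contractive_iterate(C, T):
    """Smallest N >= 1 for which the N-fold composition is a strict contraction."""
    N = 1
    while contraction_factor(C, T, N) >= 1.0:
        N += 1
    return N


if __name__ == "__main__":
    C, T = 2.0, 3.0  # example values only
    for N in (1, 5, 10, 14):
        print("N =", N, "factor =", contraction_factor(C, T, N))
    print("first contractive iterate:", first_contractive_iterate(C, T))
```

The factor first grows with $N$ before it decays, which is why the argument works with an iterate $\Phi_u^N$ rather than with $\Phi_u$ itself.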
Following the calculations from Lemma 1 that lead to (33), we get the estimate
$$ (\|\mathbb{P}^u\|_p)^p = \mathbb{E}^u\big[ |X|_T^p \big] \le \sup_{y \in \bar{\mathcal{D}}} |y|^p, \tag{39} $$
where $p \in [1, \infty)$.

From now on, we will denote the Brownian motion corresponding to $\mathbb{P}^u$ by $B^u$. To summarize this section, we have proved the following result under Assumptions 1-3.

Theorem 1.
Given $u \in \mathcal{U}$, there exists a unique weak solution to the sticky reflected SDE of mean-field type with boundary diffusion
$$ dX_t = \big( \sigma(X_t) \beta(t, X_\cdot, \mathbb{P}^u(t), u_t) + a(X_t) \big)\, dt + \sigma(X_t)\, dB^u_t. \tag{40} $$
Under $\mathbb{P}^u$ the $t$-marginal distribution of $X_\cdot$ is $\mathbb{P}^u(t)$ for $t \in [0,T]$ and $X_\cdot$ is almost surely $C([0,T]; \bar{\mathcal{D}})$-valued. Furthermore, $\mathbb{P}^u \in \mathcal{P}_p(\Omega)$.

Proof. We are left to show that $\mathbb{P}^u\big( X_\cdot \in C([0,T]; \bar{\mathcal{D}}) \big) = 1$; all other statements of the theorem have been proved. Let $L^u_T := \mathcal{E}_T(\int_0^\cdot \beta(s, X_\cdot, \mathbb{P}^u(s), u_s)\, dB_s)$. Since $\mathbb{P}\big( X_\cdot \notin C([0,T]; \bar{\mathcal{D}}) \big) = 0$,
$$ \mathbb{P}^u\big( X_\cdot \notin C([0,T]; \bar{\mathcal{D}}) \big) = \mathbb{E} \Big[ L^u_T 1_{\{X_\cdot \notin C([0,T]; \bar{\mathcal{D}})\}} \Big] = 0, \tag{41} $$
which proves that $X_\cdot$ is $\mathbb{P}^u$-almost surely $C([0,T]; \bar{\mathcal{D}})$-valued.

Remark 2.
The drift component $\beta$ is projected in the tangential direction of the boundary by $\sigma$ whenever the process is at the boundary (cf. (19)). The drift component $a$ is not affected by the transformation. From a modeling perspective, the interpretation is that the pedestrian's tangential movement is partially controllable but also influenced by other pedestrians through the mean field. The normal direction is an uncontrolled delayed reflection.

Let $\mathbb{E}^u$ denote expectation taken under $\mathbb{P}^u$. To apply the stochastic maximum principle of [8], we make the assumption that the mean-field type Girsanov kernel $\beta$ depends linearly on $\mathbb{P}^u$.

Assumption 4.
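Let $\tilde\beta : [0,T] \times \Omega \times \mathbb{R}^d \times U \to \mathbb{R}^d$ and let $r_\beta : \mathbb{R}^d \to \mathbb{R}^d$, and assume that
$$ \beta(t, X_\cdot, \mathbb{P}^u(t), u_t) = \tilde\beta(t, X_\cdot, \mathbb{E}^u[r_\beta(X_t)], u_t). \tag{42} $$

With some abuse of notation, we will continue to denote the Girsanov kernel by $\beta$, although from now on this refers to $\tilde\beta$. Let $f : [0,T] \times \Omega \times \mathbb{R}^d \times U \to \mathbb{R}$, $g : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$, $r_f : \mathbb{R}^d \to \mathbb{R}^d$, and $r_g : \mathbb{R}^d \to \mathbb{R}^d$.

Assumption 5.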
For every $u \in \mathcal{U}$, the process $(f(t, X_\cdot, \mathbb{E}^u[r_f(X_t)], u_t))_t$ is progressively measurable with respect to $\mathbb{F}$ and $(x, y) \mapsto g(x, y)$ is Borel measurable.

Consider the finite horizon mean-field type cost functional $J : \mathcal{U} \to \mathbb{R}$,
$$ J(u) := \mathbb{E}^u \Big[ \int_0^T f(t, X_\cdot, \mathbb{E}^u[r_f(X_t)], u_t)\, dt + g(X_T, \mathbb{E}^u[r_g(X_T)]) \Big]. \tag{43} $$
The control problem considered in this section is the minimization of $J$ with respect to $u \in \mathcal{U}$ under the constraint that the coordinate process for any given $u$ satisfies (40). The integration in (43) is with respect to a measure absolutely continuous with respect to $\mathbb{P}$. Changing measure, we get
$$ J(u) = \mathbb{E} \Big[ \int_0^T L^u_t f(t, X_\cdot, \mathbb{E}[L^u_t r_f(X_t)], u_t)\, dt + L^u_T g(X_T, \mathbb{E}[L^u_T r_g(X_T)]) \Big], \tag{44} $$
where $\mathbb{E}$ is the expectation taken under the original probability measure $\mathbb{P}$ and $L^u$ is the controlled likelihood process, given by the SDE of mean-field type
$$ dL^u_t = L^u_t \beta(t, X_\cdot, \mathbb{E}[L^u_t r_\beta(X_t)], u_t)^* \, dB_t, \quad L^u_0 = 1. \tag{45} $$

4.1 Necessary optimality conditions

After making one final assumption about the regularity of $\beta$, $f$, and $g$ (Assumption 6 below), the stochastic maximum principle yields necessary conditions on an optimal control for the minimization of (44) subject to (45). Assumptions 4 and 6 are stated in their current form for the sake of technical, not conceptual, simplicity and may be relaxed.

Assumption 6.
The functions $(t,x,y,u) \mapsto (f,\beta)(t,x,y,u)$ and $(x,y) \mapsto g(x,y)$ are twice continuously differentiable with respect to $y$. Moreover, $\beta$, $f$ and $g$ and all their derivatives up to second order with respect to $y$ are continuous in $(y,u)$, and bounded.

The next result is a slight generalization of [8, Thm 2.1]. The paper [8] treats an optimal control problem of mean-field type with deterministic coefficients. The approach of [8], which goes back to [41], extends without any further conditions to include random coefficients, as shown in [31]. Moreover, in our case the coefficients are not bounded functions; they are linear in the likelihood. This seems to violate the conditions of [8, Thm 2.1], but an application of Grönwall's lemma yields $\mathbb{E}[(L^u_t)^p] \le \exp(C(p)t)$ for all $t \in [0,T]$ and $p \ge 2$, where $C(p)$ is a bounded constant, and the estimates of [8] can be recovered after an application of Hölder's inequality.

Theorem 2.
Assume that $(\hat u, L^{\hat u})$ solves the optimal control problem (44)-(45). Then there are two pairs of $\mathbb{F}$-adapted processes, $(p,q)$ and $(P,Q)$, that satisfy the first and second order adjoint equations

$$\begin{aligned} dp_t &= -\left(q_t \beta^{\hat u}_t + \mathbb{E}\left[q_t L^{\hat u}_t \nabla_y \beta^{\hat u}_t\right] r_\beta(X_t) - f^{\hat u}_t - \mathbb{E}\left[L^{\hat u}_t \nabla_y f^{\hat u}_t\right] r_f(X_t)\right) dt + q_t\,dB_t, \\ p_T &= -g^{\hat u}_T - \mathbb{E}\left[L^{\hat u}_T \nabla_y g^{\hat u}_T\right] r_g(X_T), \end{aligned} \tag{46}$$

$$\begin{aligned} dP_t &= -\left(\left|\beta^{\hat u}_t + \mathbb{E}\left[L^{\hat u}_t \nabla_y \beta^{\hat u}_t\right] r_\beta(X_t)\right|^2 P_t + 2 Q_t\left(\beta^{\hat u}_t + \mathbb{E}\left[L^{\hat u}_t \nabla_y \beta^{\hat u}_t\right] r_\beta(X_t)\right)\right) dt + Q_t\,dB_t, \\ P_T &= 0, \end{aligned} \tag{47}$$

where $\nabla_y$ denotes differentiation with respect to the $\mathbb{R}^d$-valued argument. Furthermore, $(p,q)$ and $(P,Q)$ satisfy

$$\mathbb{E}\left[\sup_{t\in[0,T]} |p_t|^2 + \int_0^T |q_t|^2\,dt\right] < \infty, \qquad \mathbb{E}\left[\sup_{t\in[0,T]} |P_t|^2 + \int_0^T |Q_t|^2\,dt\right] < \infty, \tag{48}$$

and for every $u \in U$ and a.e. $t \in [0,T]$, it holds $\mathbb{P}$-a.s. that

$$H\left(L^{\hat u}_t, u, p_t, q_t\right) - H\left(L^{\hat u}_t, \hat u_t, p_t, q_t\right) + \frac{1}{2}\left[\delta(L\beta)(t)\right]^T P_t \left[\delta(L\beta)(t)\right] \le 0, \tag{49}$$

where $H(L^u_t, u_t, p_t, q_t) := L^u_t \beta^u_t q_t - L^u_t f^u_t$ and

$$\delta(L\beta)(t) := L^{\hat u}_t\left(\beta\left(t, X_\cdot, \mathbb{E}[L^{\hat u}_t r_\beta(X_t)], u\right) - \beta^{\hat u}_t\right). \tag{50}$$

The following local form of the optimality condition (49) can be found in e.g. [48, p. 120], and will be useful for computation in Section 5. If $U$ is a convex set and $H$ is differentiable with respect to $u$, then (49) implies

$$(u - \hat u_t)^* \nabla_u H\left(L^{\hat u}_t, \hat u_t, p_t, q_t\right) \le 0, \quad \forall u \in U,\ \text{a.e. } t \in [0,T],\ \mathbb{P}\text{-a.s.} \tag{51}$$

Remark 3.
Sufficient conditions for weak optimal controls will seldom be satisfied since they typically require the Hamiltonian to be convex (or concave) in at least state ($L^u_t$) and control ($u_t$). This is false even for the simplest version of our problem. Assume that $\beta(t,\omega,y,u) = u$ and $f = 0$; then $(\ell,u) \mapsto H(\ell,u,p,q) = \ell u q$, which is neither convex nor concave. However, necessary optimality conditions can be useful, as we will see in Section 5.

In this section, we give a microscopic interpretation of the mean-field type control problem (9) in the form of an interacting particle system (collaboratively) minimizing the social cost. Our means will be the propagation of chaos result [39, Thm. 2.6]. We will work under all the assumptions stated so far, but we will use the notation from Section 3 for $\beta$, $f$, and $g$.

We will fix a closed-loop control and we will assume that all the interacting particles are using this control. This assumption is made in order to extract the approximating property of any solution to the mean-field optimal control problem that is in closed-loop form. In Section 5, we will see examples of such controls.

We introduce an interacting system of sticky reflected SDEs with boundary diffusion. Each equation has an initial value with distribution $\lambda$, where $\lambda$ is a nonatomic measure and $\lambda(\bar D) = 1$. See Remark 10 in [39] for the necessity of the random initial condition.

Consider the measure $\mathbb{P}^{\otimes N}$ on $(\Omega^N, \mathcal{B}(\Omega^N))$, the weak solution to a system of $N \in \mathbb{N}$ i.i.d. sticky reflected Brownian motions with boundary diffusion

$$dX^{N,i}_t = a(X^{N,i}_t)\,dt + \sigma(X^{N,i}_t)\,dB^i_t, \quad X^{N,i}_0 = \xi^{N,i}, \quad i = 1,\ldots,N, \tag{52}$$

where $\xi^1,\ldots,\xi^N$ are i.i.d. random variables with law $\lambda$, which has support only on $\bar D$, and such that $B^1,\ldots,B^N$ are independent $\mathbb{F}$-Wiener processes. The functions $a$ and $\sigma$ are defined as in (20). Given controls $u^i \in \mathcal{U}$ (now $\mathbb{F}$-progressively measurable), $i = 1, 2, \ldots$, define the likelihood process $L^{N,i}_{\mathbf{u},t}$ as the solution to

$$dL^{N,i}_{\mathbf{u},t} = L^{N,i}_{\mathbf{u},t}\,\beta\left(t, X^{N,i}_\cdot, \mu^N_t, u^i_t\right)^* dB^i_t, \quad L^{N,i}_{\mathbf{u},0} = 1, \quad i = 1,\ldots,N, \tag{53}$$

where $\mu^N$ is the empirical measure of the coordinate processes,

$$\mu^N := \frac{1}{N}\sum_{i=1}^N \delta_{X^i_\cdot} \in \mathcal{P}(\Omega).$$

Then $L^N_{\mathbf{u},t} := \prod_{i=1}^N L^{N,i}_{\mathbf{u},t}$ is the Radon-Nikodym derivative for the Girsanov-type change of measure from $\mathbb{P}^{\otimes N}$ to $\mathbb{P}^{N,\mathbf{u}}$, under which the coordinate processes satisfy

$$\begin{aligned} dX^{N,i}_t &= \left(a(X^{N,i}_t) + \sigma(X^{N,i}_t)\,\beta\left(t, X^{N,i}_\cdot, \mu^N_t, u^i_t\right)\right) dt + \sigma(X^{N,i}_t)\,d\widetilde{B}^i_t, \\ X^{N,i}_0 &= \xi^{N,i}, \quad i = 1,\ldots,N, \end{aligned} \tag{54}$$

where $\widetilde{B}^1,\ldots,\widetilde{B}^N$ are $\mathbb{P}^{N,\mathbf{u}}$-Brownian motions and $\mathbf{u} := (u^1,\ldots,u^N)$. We note that $\mathbb{P}^{N,\mathbf{u}}$ is the law of a system of interacting diffusion processes. The social cost of the system (54) is defined as

$$\frac{1}{N}\sum_{i=1}^N J^i(\mathbf{u}) := \frac{1}{N}\sum_{i=1}^N \mathbb{E}^{N,\mathbf{u}}\left[\int_0^T f(t, X^{N,i}_\cdot, \mu^N_t, u^i_t)\,dt + g(X^i_T, \mu^N_T)\right]. \tag{55}$$

The following theorem is an adaptation of [39, Thm. 2.6] where the drift $b := a + \sigma\beta$ and the Girsanov kernel $\sigma^{-1}b := \beta$.

Theorem 3.
Let $u \in \mathcal{U}$ be a closed-loop control, i.e. $u_t(\omega) = \varphi(\omega_{\cdot\wedge t})$ for some measurable function $\varphi : (\Omega, \mathcal{F}) \to (U, \mathcal{B}(U))$. Given the control $u$ and a random variable $\xi$ with law $\lambda$ (nonatomic with support only on $\bar D$), the sticky reflected SDE of mean-field type with boundary diffusion

$$\begin{cases} dX_t = \left(a(X_t) + \sigma(X_t)\,\beta\left(t, X_\cdot, \mathbb{P}^u(t), \varphi(X_{\cdot\wedge t})\right)\right) dt + \sigma(X_t)\,dB_t, \\ X_0 = \xi, \end{cases} \tag{56}$$

can be approximated by the interacting particle system (54) with all components using the fixed closed-loop control $u$. Furthermore, the value of the mean-field cost functional $J$ at $u$ is the asymptotic social cost of the interacting particle system as $N \to \infty$ when all the $X^{N,i}$ are using the fixed control $u$. More specifically,

$$\lim_{N\to\infty} D_T\left(\mathbb{P}^{N,\mathbf{u}} \circ (X^{N,1}_\cdot,\ldots,X^{N,k}_\cdot)^{-1},\ (\mathbb{P}^u \circ X^{-1}_\cdot)^{\otimes k}\right) = 0, \tag{57}$$

with $\mathbf{u} = (u,\ldots,u)$, and

$$\lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^N J^i(u,\ldots,u) = J(u). \tag{58}$$

Proof.
We denote by $\mathcal{E}(\mathcal{P}(\Omega))$ the smallest $\sigma$-field on $\mathcal{P}(\Omega)$ such that the map $\mu \mapsto \int_\Omega \phi\,d\mu$ is measurable for all bounded and measurable $\phi : \Omega \to \mathbb{R}$. As pointed out in [39], $\mathcal{E}(\mathcal{P}(\Omega))$ coincides with the Borel $\sigma$-field on $\mathcal{P}(\Omega)$ generated by the topology of weak convergence.

To verify the assumptions of [39, Thm. 2.6], we note that $\beta$ is progressively measurable with respect to $\mathbb{F}$ and that $\beta$ is Lipschitz continuous in the measure-valued argument with respect to $d_{TV}$. This implies condition (E) in [39], the $\mathcal{E}(\mathcal{P}(\Omega))$-measurability of the function $F_{s,t} : \mathcal{P}(\Omega) \to \mathbb{R}$,

$$F_{s,t}(\nu) = \int_\Omega \int_s^t \left|\beta(u,\omega,\nu_t) - \beta(u,\omega,\mathbb{P}^u(t))\right| du\,\nu(d\omega), \tag{59}$$

the $\tau(\Omega)$-continuity of $F_{s,t}$, and the inequality (2.3) from [39, Thm. 2.6]. Furthermore, $\beta$ is bounded, implying condition (A) in [39]. So the propagation of chaos (57) holds.

By [43, Prop. 2.2], the propagation of chaos implies that $\mathcal{P}(\mathcal{P}(\Omega)) \ni M^N := \mathbb{P}^{N,\mathbf{u}} \circ (\mu^N)^{-1} \to \delta_{\mathbb{P}^u \circ X^{-1}_\cdot}$ in the weak topology. By assumption, $f$ and $g$ are bounded and continuous in the $y$-argument. Hence,

$$\begin{aligned} \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^N J^i(u,\ldots,u) &= \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^N \mathbb{E}^{N,\mathbf{u}}\left[\int_0^T f\left(t, X^{N,i}_\cdot, \mu^N_t, \varphi(X^{N,i}_{\cdot\wedge t})\right) dt + g(X^{N,i}_T, \mu^N_T)\right] \\ &= \lim_{N\to\infty} \mathbb{E}^{N,\mathbf{u}}\left[\int_0^T \int_\Omega f\left(t, \omega', \mu^N_t, \varphi(\omega'_{\cdot\wedge t})\right) \mu^N(d\omega')\,dt + \int_\Omega g(\omega'(T), \mu^N_T)\,\mu^N(d\omega')\right] \\ &= \lim_{N\to\infty} \int_0^T \int_{\mathcal{P}(\Omega)} \left\{\int_\Omega f\left(t, \omega', \int_\Omega r_f(\omega''(t))\,m(d\omega''), \varphi(\omega'_{\cdot\wedge t})\right) m(d\omega')\right\} M^N(dm)\,dt \\ &\quad + \lim_{N\to\infty} \int_{\mathcal{P}(\Omega)} \int_\Omega g\left(\omega'(T), \int_\Omega r_g(\omega''(T))\,m(d\omega'')\right) m(d\omega')\,M^N(dm) \\ &= \mathbb{E}^u\left[\int_0^T f(t, X_\cdot, \mathbb{P}^u(t))\,dt + g(X_T, \mathbb{P}^u(T))\right] = J(u). \end{aligned} \tag{60}$$

As a first step in model validation, experimental results on pedestrian speed profiles in a long narrow corridor are replicated in this section. The application of the proposed approach also displays the new features it offers regarding behavior near walls. From the necessary optimality conditions we derive an expression for the optimal control valid in the following two toy examples and the corridor scenario. The numerical simulations are based on the particle system approximation derived in Section 4.2.

Throughout the rest of this section it is assumed that the compact set $U$ is convex and sufficiently large so that all optimal controls in the following analytical expressions are admissible. Furthermore, it is assumed that $r_g$ is differentiable and that $(\hat u, L^{\hat u})$ is optimal for the mean-field type control problem (44)-(45). We recall the first order adjoint equation,

$$\begin{aligned} dp_t &= -\left(q_t \beta^{\hat u}_t + \mathbb{E}\left[q_t L^{\hat u}_t \nabla_y \beta^{\hat u}_t\right] r_\beta(X_t) - f^{\hat u}_t - \mathbb{E}\left[L^{\hat u}_t \nabla_y f^{\hat u}_t\right] r_f(X_t)\right) dt + q_t\,dB_t, \\ p_T &= -g^{\hat u}_T - \mathbb{E}\left[L^{\hat u}_T \nabla_y g^{\hat u}_T\right] r_g(X_T). \end{aligned} \tag{61}$$

Rewriting $\mathbb{E}[L^{\hat u}_t Y_t] = \mathbb{E}^{\hat u}[Y_t]$ and changing measure to $\mathbb{P}^{\hat u}$, (61) becomes

$$\begin{cases} dp_t = -A_t\,dt + q_t\,dB^{\hat u}_t, \\ p_T = -g^{\hat u}_T - \mathbb{E}^{\hat u}\left[\nabla_y g^{\hat u}_T\right] r_g(X_T), \end{cases} \tag{62}$$

where $A_t := \mathbb{E}^{\hat u}\left[q_t \nabla_y \beta^{\hat u}_t\right] r_\beta(X_t) - f^{\hat u}_t - \mathbb{E}^{\hat u}\left[\nabla_y f^{\hat u}_t\right] r_f(X_t)$. By the martingale representation theorem (see e.g. [36, p. 182]), $p$ can be written as the conditional expectation

$$p_t = -\mathbb{E}^{\hat u}\left[g^{\hat u}_T + \mathbb{E}^{\hat u}[\nabla_y g^{\hat u}_T]\, r_g(X_T) \,\middle|\, \mathcal{F}_t\right] + \mathbb{E}^{\hat u}\left[\int_t^T A_s\,ds \,\middle|\, \mathcal{F}_t\right]. \tag{63}$$

The theorem applies to our problem since $g$ and its $y$-derivative are assumed to be bounded. Let

$$\phi(t, X_t) := g\left(X_t, \mathbb{E}^{\hat u}[r_g(X_t)]\right) + \mathbb{E}^{\hat u}[\nabla_y g^{\hat u}_t]\, r_g(X_t). \tag{64}$$

By Dynkin's formula,

$$\mathbb{E}^{\hat u}\left[\phi(T, X_T) \mid \mathcal{F}_t\right] = \phi(t, X_t) + \int_t^T \mathbb{E}^{\hat u}\left[(\mathcal{G} + \partial_s)\phi(s, X_s) \mid \mathcal{F}_t\right] ds, \tag{65}$$

where $\mathcal{G}$ is the generator of the coordinate process and $\partial_s$ denotes differentiation with respect to time, working on the two remaining arguments of $\phi$. Hence, by applying Itô's formula to $p$ in (63), where only $X_\cdot$ contributes to the diffusion part, and matching the result with the diffusion part of $p$ in (62), we get

$$q_s = -\nabla_x \phi(s, X_s)\,\sigma(X_s). \tag{66}$$

The local optimality condition in the case of a convex $U$ and coefficients differentiable in $u$, given in (51) right below Theorem 2, can be used to write $\hat u$ in terms of the other processes. To use it, we make the following assumption.

Assumption 7. The functions $(t,x,y,u) \mapsto (f,\beta)(t,x,y,u)$ are differentiable with respect to $u$.

With Assumption 7 in force, an optimal control $\hat u$ satisfies the local optimality condition. The local optimality condition is satisfied by any $\hat u$ such that $\nabla_u H(L^{\hat u}_t, \hat u_t, p_t, q_t) = 0$ for almost every $t \in [0,T]$, $\mathbb{P}$-a.s., i.e.

$$q_t \nabla_u \beta^{\hat u}_t = \nabla_u f^{\hat u}_t, \quad \text{a.e. } t \in [0,T],\ \mathbb{P}\text{-a.s.} \tag{67}$$

Since $\mathbb{P}^{\hat u}$ is absolutely continuous with respect to $\mathbb{P}$, the equality above also holds for almost every $t \in [0,T]$, $\mathbb{P}^{\hat u}$-a.s. We now have at hand an expression for the optimal control whenever we can solve (66)-(67) for $\hat u$.

Let $D \subset \mathbb{R}^d$ be an admissible domain and $\mathbb{P}$ the probability measure on the space of continuous paths under which the coordinate process solves (19). Consider the following linear-quadratic optimal control problem on $D$,

$$\min_{u \in \mathcal{U}} \mathbb{E}\left[\int_0^T L^u_t\,\tfrac{1}{2}|u_t|^2\,dt + L^u_T\,\tfrac{1}{2}|X_T - x_T|^2\right], \quad \text{s.t. } dL^u_t = L^u_t u^*_t\,dB_t,\ L^u_0 = 1,$$

where $B$ is a $\mathbb{P}$-Brownian motion. The necessary optimality condition (67) yields

$$\hat u_t = q^*_t, \quad \mathbb{P}\text{-a.s.},\ \text{a.e. } t \in [0,T]. \tag{68}$$

Matching the diffusion coefficients gives us the optimal control,

$$\hat u_t = -\sigma(X_t)(X_t - x_T), \quad \mathbb{P}\text{-a.s.},\ \text{a.e. } t \in [0,T]. \tag{69}$$

The corresponding likelihood process solves

$$dL^{\hat u}_t = -L^{\hat u}_t (X_t - x_T)^* \sigma(X_t)\,dB_t, \quad L^{\hat u}_0 = 1,$$

and under $\mathbb{P}^{\hat u}$, the optimally controlled path distribution, the coordinate process solves

$$\begin{aligned} dX_t &= a(X_t)\,dt + \sigma(X_t)\,dB_t \\ &= a(X_t)\,dt + \sigma(X_t)\left(-\sigma(X_t)(X_t - x_T)\,dt + dB^{\hat u}_t\right) \\ &= \left(a(X_t) - \sigma(X_t)(X_t - x_T)\right) dt + \sigma(X_t)\,dB^{\hat u}_t. \end{aligned} \tag{70}$$

We have used the fact that $\pi^2 = \pi = \pi^*$, which holds since $\pi$ is an orthogonal projection.

5.1.2 A mean-field example

Consider now on some admissible domain $D \subset \mathbb{R}^d$ the mean-field type optimal control problem

$$\min_{u \in \mathcal{U}} \mathbb{E}\left[\int_0^T L^u_t\,\tfrac{1}{2}|u_t|^2\,dt + L^u_T\,\tfrac{1}{2}\left|X_T - \mathbb{E}[L^u_T X_T]\right|^2\right], \quad \text{s.t. } dL^u_t = L^u_t u^*_t\,dB_t,\ L^u_0 = 1.$$

As before, $B$ is a $\mathbb{P}$-Brownian motion, where $\mathbb{P}$ is a probability measure on the path space under which the coordinate process solves (19). Then $\mathbb{E}^{\hat u}[\nabla_y g^{\hat u}_t] = 0$, so (since $r_g(x) = x$ here) $\nabla_x \phi(t, X_t) = \left(X_t - \mathbb{E}^{\hat u}[X_t]\right)^*$, and (67) yields $\hat u_t = -\sigma(X_t)(X_t - \mathbb{E}^{\hat u}[X_t])$ $\mathbb{P}$-a.s. for almost every $t \in [0,T]$. Under $\mathbb{P}^{\hat u}$ the coordinate process solves

$$dX_t = \left(a(X_t) - \sigma(X_t)\left(X_t - \mathbb{E}^{\hat u}[X_t]\right)\right) dt + \sigma(X_t)\,dB^{\hat u}_t.$$

Experimental studies have been conducted on the impact of proximity to walls on pedestrian speed. Pedestrian speed profiles depend heavily on circumstances like location, weather, and congestion. In this section, we will replicate two scenarios of unidirectional motion in a confined domain with the proposed mean-field type optimal control model. In particular, we are interested in how the proposed model behaves on the boundary and whether boundary movement characteristics can be influenced through the running cost $f$. Sticky boundaries and boundary diffusion grant our pedestrians controlled movement at the boundary. By altering the internal parameters of these effects, we are able to shape the mean speed profile at the boundary.

Zanlungo et al. [49] observe that in a tunnel connecting a shopping center with a railway station in Osaka, Japan, pedestrians tend to lower their walking speed when walking close to the walls. The authors obtain a concave cross-section average speed profile from their experiment, with its maximum approximately at the center of the corridor.
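A cross-section average speed profile of this kind can be estimated from recorded or simulated trajectories by binning positions across the corridor width. The following Python sketch is illustrative only: the function name `speed_profile`, the array layout, the finite-difference speed estimate, and the wall positions used in the synthetic check are assumptions made for this example, not the procedure used in [49] or in this paper.

```python
import numpy as np


def speed_profile(paths, dt, y_min, y_max, n_segments):
    """Mean speed per lateral corridor segment (hypothetical helper).

    paths has shape (n_paths, n_steps + 1, 2); the last axis holds (x, y).
    Returns an array of length n_segments; segments never visited are NaN.
    """
    paths = np.asarray(paths, dtype=float)
    # Finite-difference speed of every step of every path.
    steps = np.diff(paths, axis=1)
    speeds = np.linalg.norm(steps, axis=2) / dt
    # Assign each step to a segment by the midpoint of its y-coordinate.
    y_mid = 0.5 * (paths[:, :-1, 1] + paths[:, 1:, 1])
    edges = np.linspace(y_min, y_max, n_segments + 1)
    idx = np.clip(np.digitize(y_mid, edges) - 1, 0, n_segments - 1)
    # Sum speeds and count steps per segment, then average.
    sums = np.bincount(idx.ravel(), weights=speeds.ravel(), minlength=n_segments)
    counts = np.bincount(idx.ravel(), minlength=n_segments)
    profile = np.full(n_segments, np.nan)
    visited = counts > 0
    profile[visited] = sums[visited] / counts[visited]
    return profile


# Synthetic check (assumed values): one walker in the middle of the
# corridor and one slower walker close to the upper wall.
dt = 0.01
t = np.arange(0, 101) * dt
center = np.stack([1.0 * t, np.zeros_like(t)], axis=1)
near_wall = np.stack([0.5 * t, np.full_like(t, 0.09)], axis=1)
profile = speed_profile(np.stack([center, near_wall]), dt, -0.1, 0.1, 3)
print(profile)
```

In this synthetic check the middle segment has mean speed close to 1.0, the upper segment close to 0.5, and the unvisited lower segment is NaN, which is a crude concave profile. For simulated diffusion paths the finite-difference speed is dominated by the noise increment when `dt` is small, so averaging the drift magnitude per segment may be a more informative estimator.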
The average speed at the center of the corridor is about 10% higher than that of near-wall walkers.

Daamen and Hoogendoorn [19], on the other hand, observe (in a controlled environment) pedestrian speeds that are higher at the boundary than in the interior of the domain. In their experiment, a unidirectional stream of pedestrians walks in a wide corridor that at a certain point, a bottleneck, shrinks into a tight corridor. Upstream from the bottleneck, pedestrians close to the corridor walls move more freely due to less congestion, compared to those at the center of the corridor. The experiment results in a cross-section speed profile with more than twice as high average pedestrian speed in the low-density regions along the corridor walls compared to the center of the corridor.

By modeling congestion with simple mean-dependent effects, we can replicate the overall shape of the average speed profiles of both [49] and [19] (but not the density profile; to achieve this one needs a more sophisticated mean-field model). Our reason for implementing only mean-dependent effects, and not nonlocal distribution-dependent effects like those considered in for example [4], is solely to simplify the analysis.

Consider a long narrow corridor with walls parallel to the $x$-axis at $y = -0.1$ and $y = 0.1$. Our analysis requires $D$ to be smooth, so the effective corridor (the corridor perceived by the pedestrians) has rounded corners. However, the corners will not have any substantial effect on the simulation results since the crowd is initiated so far away from the target that, under the chosen coefficient values, the pedestrians will not reach it ahead of the time horizon $T = 1$. On this domain, crowd behavior is modeled with the following optimal control problem

$$\min_{u_\cdot \in \mathcal{U}} \mathbb{E}\left[\int_0^1 L^u_t f(t, X_\cdot, \mathbb{E}[L^u_t r_f(X_t)], u_t)\,dt + L^u_T\,\tfrac{1}{2}|X_T - x_T|^2\right], \quad \text{s.t. } dL^u_t = L^u_t u^*_t\,dB_t,\ L^u_0 = 1, \tag{71}$$

where $B$ is a Brownian motion under $\mathbb{P}$, the probability measure under which $X_\cdot$ solves (19) with $\gamma = 0.5$, and $x_T$ is the location of an exit at the end of the corridor. The choice of $\gamma$ is made so that the plots below are visually comparable. The running cost $f$ is of congestion-type,

$$f(t, X_\cdot, \mathbb{E}[L^u_t r_f(X_t)], u_t) = \tfrac{1}{2}\,C(X_t)\left(c_f + h(t, X_\cdot, \mathbb{E}^u[r_f(X_t)])\right)|u_t|^2,$$

where $c_f|u|^2$, $c_f > 0$, is the cost of moving in free space, and $h|u|^2$ the additional cost to move in congested areas. The coefficient $C(X_t) := c_\Gamma 1_\Gamma(X_t) + 1_D(X_t)$, $c_\Gamma > 0$, is used to monitor $f$ (though it is not our control process) on the boundary $\Gamma$. The cost of moving on the boundary is increasing with $c_\Gamma$, so for high $c_\Gamma$ we expect lower speed on the boundary. We know from (66)-(67) that

$$q^*_t = C(X_t)\left(c_f + h\left(t, X_\cdot, \mathbb{E}^{\hat u}[r_f(X_t)]\right)\right)\hat u_t, \qquad q_t = -(X_t - x_T)^*\sigma(X_t). \tag{72}$$

Matching the expressions in (72) yields the optimal control

$$\hat u_t = -\frac{\sigma(X_t)(X_t - x_T)}{C(X_t)\left(c_f + h(t, X_\cdot, \mathbb{E}^{\hat u}[r_f(X_t)])\right)}.$$

It implements the following strategy: move towards the target location $x_T$, but scale the speed according to the local congestion. Consider the two congestion penalties

$$h_1 := \left|X_2(t) - \mathbb{E}^{\hat u}[X_2(t)]\right|, \qquad h_2 := \frac{1}{\left|X_2(t) - \mathbb{E}^{\hat u}[X_2(t)]\right|}, \tag{73}$$

where $X_2(t)$ is the second (the $y$-)component of the coordinate process, i.e. the component in the direction perpendicular to the corridor walls. Stickiness is set to $\gamma = 0.5$. The choice of $h$ in (73) means that we have set $r_f(X_t) = X_2(t)$.

The corridor is split into 9 segments parallel with the corridor walls. The mean speed is estimated in each segment for four different values of $c_\Gamma$, and the results corresponding to congestion penalty $h_1$ and $h_2$ are presented in Figures 1 and 2, respectively. The profiles plotted in Figure 1 attain the concave shape observed by [49], mimicking the fast track in the middle of the lane. In Figure 2 the profiles follow the convex shape observed by [19], taking into account that movement in the crowded center (mean of the group) is costly. When $c_\Gamma$ is small, the pedestrians can travel further on the boundary for the same cost. Heuristically, the higher $\gamma$ is, the longer it takes for the pedestrian to re-enter $D$, and therefore a high $\gamma$ combined with a small $c_\Gamma$ yields the highest boundary speed. This effect is evident in the figures, where smaller values of $c_\Gamma$ result in higher mean speed at the boundary. We note that we are able to shape the mean speed at the boundary by our choice of model parameters.

Figure 1: Mean speed in 9 segments of the corridor when $h = h_1$, estimated from 4000 realizations of the controlled coordinate process. (Axes: corridor segment versus speed; one curve per value of $c_\Gamma$.)

Figure 2: Mean speed in 9 segments of the corridor when $h = h_2$, estimated from 4000 realizations of the controlled coordinate process. (Axes: corridor segment versus speed; one curve per value of $c_\Gamma$.)

In this paper, we propose a variation of the mean-field approach to crowd modeling based on sticky reflected SDEs which to the best of our knowledge is new. The proposed model accounts for pedestrians that spend some time at the boundary and that have the possibility to choose a new direction of motion.

We provide conditions for the proposed dynamics to admit a unique weak solution, which is the best we can hope for (cf. [22]). Then, we consider mean-field type optimal control of the proposed dynamic model and give necessary conditions for optimality with a Pontryagin-type stochastic maximum principle. There is a microscopic interpretation of the model even on the boundary of the domain and thus it has the potential to approximate optimal/equilibrium behavior of a pedestrian crowd on a microscopic (individual) level. We verify a propagation of chaos result in the uncontrolled case.

Pedestrians do often see and react to walls at a distance. This has been studied empirically; experiments are mentioned in the introduction. Force-based models can implement repulsive potential forces spiking to infinity at boundaries to keep the pedestrians away from the walls and inside the domain, effectively making it impossible for any pedestrian to reach a wall. A ranged, nonlocal, interaction with walls will have a smoothing effect on pedestrian density, just like nonlocal pedestrian-to-pedestrian interaction has, as is noted in [4]. Nonlocal interaction is an important aspect of pedestrian crowd modeling, but cannot give an answer to what will happen whenever a pedestrian actually reaches a wall. Interaction with walls at a distance can be included in our proposed model either in the drift, as is the case in force-based models, or through the cost functional, as in agent-based models.

An extension of the proposed framework would be to let the pedestrian control its stickiness, i.e. its motion in the normal direction of the boundary at the boundary. Stickiness is not necessarily a physical feature of the domain, but the time spent on the boundary may be subject to the pedestrian's preference. This aspect cannot be described by the proposed model, since the Girsanov change of measure does not affect stickiness (cf. Remark 2). Another extension would be to consider the controlled diffusion case mentioned in the introduction.

References

[1]
Y. Achdou, M. Bardi, and M. Cirant, Mean field games models of segregation, Mathematical Models and Methods in Applied Sciences, 27 (2017), pp. 75–113.
[2] G. Albi, Y.-P. Choi, M. Fornasier, and D. Kalise, Mean field control hierarchy, Applied Mathematics & Optimization, 76 (2017), pp. 93–135.
[3] M. Amir, Sticky Brownian motion as the strong limit of a sequence of random walks, Stochastic Processes and their Applications, 39 (1991), pp. 221–237.
[4] A. Aurell and B. Djehiche, Mean-field type modeling of nonlocal crowd aversion in pedestrian crowd dynamics, SIAM Journal on Control and Optimization, 56 (2018), pp. 434–455.
[5] M. Bardi and M. Cirant, Uniqueness of solutions in mean field games with several populations and Neumann conditions, in PDE Models for Multi-Agent Phenomena, Springer, 2018, pp. 1–20.
[6] N. Bellomo and L. Gibelli, Toward a mathematical theory of behavioral-social dynamics for pedestrian crowds, Mathematical Models and Methods in Applied Sciences, 25 (2015), pp. 2417–2437.
[7] N. Bellomo and L. Gibelli, Behavioral crowds: Modeling and Monte Carlo simulations toward validation, Computers & Fluids, 141 (2016), pp. 13–21.
[8] R. Buckdahn, B. Djehiche, and J. Li, A general stochastic maximum principle for SDEs of mean-field type, Applied Mathematics & Optimization, 64 (2011), pp. 197–216.
[9] R. Buckdahn, J. Li, and J. Ma, A stochastic maximum principle for general mean-field systems, Applied Mathematics & Optimization, 74 (2016), pp. 507–534.
[10] M. Burger, M. Di Francesco, P. Markowich, and M.-T. Wolfram, Mean field games with nonlinear mobilities in pedestrian dynamics, Discrete & Continuous Dynamical Systems-B, 19 (2014), pp. 1311–1333.
[11] M. Burger, M. Di Francesco, P. A. Markowich, and M.-T. Wolfram, On a mean field game optimal control approach modeling fast exit scenarios in human crowds, in 52nd IEEE Conference on Decision and Control, IEEE, 2013, pp. 3128–3133.
[12] M. Burger, M. Di Francesco, P. A. Markowich, and M.-T. Wolfram, Mean field games with nonlinear mobilities in pedestrian dynamics, Discrete and Continuous Dynamical Systems-Series B, 19 (2014), pp. 1311–1333.
[13] M. Burger, S. Hittmeir, H. Ranetbauer, and M.-T. Wolfram, Lane formation by side-stepping, SIAM Journal on Mathematical Analysis, 48 (2016), pp. 981–1005.
[14] M. Burger, P. Markowich, and J.-F. Pietschmann, Continuous limit of a crowd motion and herding model: analysis and numerical simulations, Kinet. Relat. Models, 4 (2011), pp. 1025–1047.
[15] R. Chitashvili, On the nonexistence of a strong solution in the boundary problem for a sticky Brownian motion, CWI, Centrum voor Wiskunde en Informatica = Centre for Mathematics and . . . , 1989.
[16] S. E. Choutri, B. Djehiche, and H. Tembine, Optimal control and zero-sum games for Markov chains of mean-field type, Mathematical Control and Related Fields, (2018).
[17] M. Cirant, Multi-population mean field games systems with Neumann boundary conditions, Journal de Mathématiques Pures et Appliquées, 103 (2015), pp. 1294–1315.
[18] E. Cristiani, F. S. Priuli, and A. Tosin, Modeling rationality to control self-organization of crowds: an environmental approach, SIAM Journal on Applied Mathematics, 75 (2015), pp. 605–629.
[19] W. Daamen and S. P. Hoogendoorn, Flow-density relations for pedestrian traffic, in Traffic and Granular Flow'05, Springer, 2007, pp. 315–322.
[20] B. Djehiche and S. Hamadène, Optimal control and zero-sum stochastic differential game problems of mean-field type, Applied Mathematics & Optimization, (2018), pp. 1–28.
[21] B. Djehiche, A. Tcheukam, and H. Tembine, A Mean-Field Game of Evacuation in Multilevel Building, IEEE Transactions on Automatic Control, 62 (2017), pp. 5154–5169.
[22] H.-J. Engelbert and G. Peskir, Stochastic differential equations for sticky Brownian motion, Stochastics An International Journal of Probability and Stochastic Processes, 86 (2014), pp. 993–1021.
[23] W. Feller, The parabolic differential equations and the associated semi-groups of transformations, Annals of Mathematics, (1952), pp. 468–519.
[24] W. Feller, Diffusion processes in one dimension, Transactions of the American Mathematical Society, 77 (1954), pp. 1–31.
[25] W. Feller et al., Generalized second order differential operators and their lateral conditions, Illinois Journal of Mathematics, 1 (1957), pp. 459–504.
[26] C. Graham, The martingale problem with sticky reflection conditions, and a system of particles interacting at the boundary, Annales de l'IHP Probabilités et statistiques, 24 (1988), pp. 45–72.
[27] M. Grothaus and R. Voßhall, Strong Feller property of sticky reflected distorted Brownian motion, Journal of Theoretical Probability, 31 (2018), pp. 827–852.
[28] M. Grothaus, R. Voßhall, et al., Stochastic differential equations with sticky reflection and boundary diffusion, Electronic Journal of Probability, 22 (2017).
[29] J. M. Harrison and A. J. Lemoine, Sticky Brownian motion as the limit of storage processes, Journal of Applied Probability, 18 (1981), pp. 216–226.
[30] D. Helbing and P. Molnar, Social force model for pedestrian dynamics, Physical Review E, 51 (1995), p. 4282.
[31] J. J. A. Hosking, A stochastic maximum principle for a stochastic differential game of a mean-field type, Applied Mathematics & Optimization, 66 (2012), pp. 415–454.
[32] E. P. Hsu, Stochastic analysis on manifolds, vol. 38, American Mathematical Soc., 2002.
[33] R. Hughes, The flow of large crowds of pedestrians, Mathematics and Computers in Simulation, 53 (2000), pp. 367–370.
[34] K. Itô, H. P. McKean, et al., Brownian motions on a half line, Illinois Journal of Mathematics, 7 (1963), pp. 181–231.
[35] O. Kallenberg, Foundations of modern probability, Springer, New York, 1997.
[36] I. Karatzas and S. E. Shreve, Brownian Motion and Stochastic Calculus, Springer, 1988.
[37] A. Kirchner and A. Schadschneider, Simulation of evacuation processes using a bionics-inspired cellular automaton model for pedestrian dynamics, Physica A: Statistical Mechanics and its Applications, 312 (2002), pp. 260–276.
[38] A. Lachapelle and M.-T. Wolfram, On a mean field game approach modeling congestion and aversion in pedestrian crowds, Transportation Research Part B: Methodological, 45 (2011), pp. 1572–1589.
[39] D. Lacker et al., On a strong form of propagation of chaos for McKean-Vlasov equations, Electronic Communications in Probability, 23 (2018).
[40] J. Ma, W.-g. Song, Z.-m. Fang, S.-m. Lo, and G.-x. Liao, Experimental study on microscopic moving characteristics of pedestrians in built corridor based on digital image processing, Building and Environment, 45 (2010), pp. 2160–2169.
[41] S. Peng, A general stochastic maximum principle for optimal control problems, SIAM Journal on Control and Optimization, 28 (1990), pp. 966–979.
[42] H. M. Soner, N. Touzi, and J. Zhang, Wellposedness of second order backward SDEs, Probability Theory and Related Fields, 153 (2012), pp. 149–190.
[43] A.-S. Sznitman, Topics in propagation of chaos, in Ecole d'été de probabilités de Saint-Flour XIX–1989, Springer, 1991, pp. 165–251.
[44] A. D. Venttsel', On boundary conditions for multidimensional diffusion processes, Theory of Probability & Its Applications, 4 (1959), pp. 164–177.
[45] J. Warren, Branching processes, the Ray-Knight theorem, and sticky Brownian motion, in Séminaire de Probabilités XXXI, Springer, 1997, pp. 1–15.
[46] S. Watanabe and N. Ikeda, Stochastic differential equations and diffusion processes, Elsevier Science, 1981.
[47] K. Yamada, Reflecting or sticky Markov processes with Levy generators as the limit of storage processes, Stochastic Processes and their Applications, 52 (1994), pp. 135–164.
[48] J. Yong and X. Y. Zhou, Stochastic controls: Hamiltonian systems and HJB equations, vol. 43, Springer Science & Business Media, 1999.
[49] F. Zanlungo, T. Ikeda, and T. Kanda, A microscopic "social norm" model to obtain realistic macroscopic velocity and density pedestrian distributions, PloS one, 7 (2012), p. e50720.