An Inattention Model for Traveler Behavior with e-Coupons
Han Qiu
January 17, 2019
Abstract
In this study, we consider traveler coupon redemption behavior from the perspective of an urban mobility service. Assuming traveler behavior is in accordance with the principle of utility maximization, we first formulate a baseline dynamical model for the traveler’s expected future trip sequence under the framework of Markov decision processes, from which we derive approximations of the optimal coupon redemption policy. However, we find that this baseline model cannot perfectly explain the observed coupon redemption behavior of travelers for a car-sharing service.

To resolve this deviation from utility-maximizing behavior, we suggest a hypothesis that travelers may not be aware of all coupons available to them. Based on this hypothesis, we formulate an inattention model on unawareness, which is complementary to the existing models of inattention, and incorporate it into the baseline model. Estimation results show that the proposed model better explains the coupon redemption dataset than the baseline model. We also conduct a simulation experiment to quantify the negative impact of unawareness on coupons’ promotional effects. These results can be used by mobility service operators to design effective coupon distribution schemes in practice.
Keywords: inattention, Markov decision process, utility maximization
Recently, incentive-based demand management methods have become popular in urban mobility businesses. The most well-known approach is dynamic pricing. On one hand, service operators can mitigate the shortage of supply in real time with surge pricing [9], which was first deployed by Uber on a large scale and is now commonly used by ride-sharing platforms. On the other hand, operators can use real-time price discounts to attract users and compete with alternative mobility services. Another commonly used tool is the coupon, a voucher that guarantees the right to a fare reduction in the consumption of products or services. Because coupons can change the fare only in the negative direction, their usage is usually limited to promotions. However, coupons are able to inform travelers about the possibility of fare reduction before they submit trip requests, while a traveler can be aware of any real-time price discount only after the service operator provides an offer. Therefore, coupons are a good complement to pricing for urban mobility service operators.

Nevertheless, the impact of coupons on traveler behavior is not immediately obvious. Unlike pricing, coupons usually have long life cycles and can have complicated redemption rules. A traveler may then develop a sophisticated coupon redemption policy to optimize her aggregate utility from trips. For example, when the fare of the current trip is much lower than the face value of a coupon, a traveler will defer the redemption of this coupon to a later trip. The situation becomes increasingly complex when the traveler is presented with various coupons whose face values, expiration dates, and redemption rules differ from one another.
For example, the traveler can have one coupon that reduces the fare of one trip by up to 5 dollars and another that reduces the fare of one trip by 20%.

Moreover, in a typical urban mobility service setting, coupons are involved in several different decision problems, such as the travel mode selection and the coupon selection for payment. Because the service operator does not have complete information about the traveler, structural estimation of coupon impact on some decision problems is not practical. For instance, the operator of one mobility service is not able to distinguish between the promotional effect in market share from her own coupon distribution strategy and the one from coupon distribution strategies of alternative mobility services. In this case, the operator can only assess the impact of coupons on the travel mode selection behavior with direct experiments on coupon distribution strategies, and the result is generally not stable with respect to changes in operations of alternative mobility services.

Recognizing these difficulties, in this study, we focus on travelers’ coupon selection behavior. Coupon selection is a relatively simple decision problem compared with others because the decision mostly depends on the traveler’s evaluation of coupons. Moreover, for this problem, we are able to acquire adequate data for model estimation.

Specifically, in this study, we consider a setting in which an operator provides an urban mobility service via a mobile app and occasionally distributes electronic coupons (e-coupons) to attract travelers and to boost revenue. Moreover, to focus on the traveler behavior, we assume that travelers’ decisions are independent of the operator’s coupon distribution strategy. The e-coupon is in a simple format: each e-coupon can be represented by its face value and expiration date.
These e-coupons are stored in an online electronic wallet (e-wallet) within the app and are visually accessible only when travelers open their e-wallet. After a trip, the traveler can select up to one coupon for redemption, and a coupon reduces the payment by at most its face value.

In selecting a coupon for redemption, the traveler evaluates for each coupon both its immediate redemption value and the future redemption value of the rest of the coupons. To infer these evaluations, we first notice that the immediate redemption value can be obtained directly from the definition of a coupon. Then, we develop a dynamical model of trip sequences to estimate the values of future coupon redemption, because these values depend on the trips that happen during the life cycle of the coupon set. In particular, we view the trip generation and realization process (as expected by the traveler) as a Markov decision process (MDP) and derive value approximations by assuming utility-maximizing behavior of travelers and solving for the optimal value function of the MDP. The resulting optimal policy also dictates the coupon selection behavior of a utility-maximizing traveler.

However, we find that the observed coupon selection behavior of travelers for a car-sharing service deviates from the above model. To explain this finding, we discuss several possibilities and arrive at the hypothesis that travelers may not be aware of all available coupons. We then provide a mathematical formulation for such patterns of unawareness and incorporate it into the aforementioned MDP.

Subsequent estimation results show that the proposed unawareness model better explains the coupon redemption dataset than baseline models. A simulation experiment further shows that if such unawareness exists, the reduction rate on coupons’ promotional effects can be as great as 10%.

The contribution of this work is twofold.
First, the proposed unawareness model is complementary to the existing models of inattention, and the two have completely different behavioral implications. Therefore, when existing inattention models fail to explain some dataset, one can consider the unawareness model. Second, our estimation results can be used by mobility service operators in designing effective coupon distribution schemes in practice.

The rest of this paper is organized as follows. In Section 2, we summarize related works on coupons, estimation of dynamic behavior, and limited attention. In Section 3, we formulate the dynamical model for the trip sequences and derive the optimal coupon redemption policy and the corresponding evaluation of available coupons. In Section 4, we describe the dataset from a car-sharing service and show the discrepancies between the observed coupon redemption behavior and decisions dictated by the optimal policy. In Section 5, we extend the baseline dynamical model with a mathematical formulation of unawareness. In Section 6, we summarize the estimation results of the proposed model. In Section 7, we conduct a simulation experiment to assess the impact of unawareness on the promotional effect of coupons. Finally, in Section 8, we conclude our work and suggest directions for future research.

In later discussions, we use the terms “inattention” and “unawareness” interchangeably. Moreover, general inattention behavior that is not explicable by unawareness is denoted as “deliberate attention”.

2 Literature Review
As a major type of price promotion, coupons have been studied in the marketing literature for decades, resulting in numerous theoretical and empirical papers. Here, we provide only a brief summary of the major focus of this literature and highlight the novelty of our study.

Decades ago, merchants usually designed coupons with simple redemption rules. For example, a grocery store can issue coupons for a specific product with the same face value and valid period. Given this simplicity, the optimal coupon redemption strategy is trivial to compute. Therefore, research on customer behavior at that time mainly focused on customers’ coupon proneness, or the (latent) intention of obtaining and redeeming coupons. Such coupon proneness was usually estimated from socio-economic factors [5, 22, 25]. Simple redemption rules also lead to aggregate coupon redemption patterns that can be described by some elementary functions. For example, Ward and Davis [38] suggested that after coupon issuance, the coupon redemption rate declines exponentially as time passes. Inman and McAlister [21] further considered the impacts of expiration dates and extended the above model with a hyperbolic function.

With the rising popularity of online shopping, merchants now prefer to use e-coupons over traditional paper coupons in price promotions. This change leads to two patterns. First, because e-coupons can be distributed at low cost and on a large scale, merchants can now reach out to a large number of customers and each customer may receive coupons from a variety of merchants. Second, because the redemption of e-coupons is processed by computers rather than human beings, merchants are now able to develop complicated redemption rules. Under these patterns, the structural estimation of customer behavior with coupons becomes considerably more difficult. Therefore, recent research has mostly focused on reduced-form models or even data-driven approaches.
For example, Reimers and Xie [32] proposed reduced-form models for coupons’ market expansion and revenue cannibalization effects. The authors estimated their models using restaurant coupon data from Groupon. Zhang et al. [39] investigated the short- and long-term effects of coupon distributions on customer behavior by conducting randomized field experiments on the Taobao Marketplace, the largest online C2C platform in China. The authors applied linear models to explain their experimental findings at an aggregate level.

Our work differs from the ones mentioned above in that we consider a structural model under general coupon redemption schemes. We point out that the proposed dynamical structural model is similar to the ones used in the estimation of multi-stage household consumption [8, 15]. We briefly discuss this connection in the next subsection.
When people model real-world human behavior, they usually assume that humans behave according to the utility-maximization principle. Then, the inference problem can be reduced to a model estimation problem for the utility function. However, when the observations come from some sequential decision process, the corresponding utility-maximizing policy needs to consider not only the utility from immediate actions but also that from future actions. In this case, the connection between the (single-step) utility function and the (multiple-step) utility-maximizing policy is not immediately obvious.

In the field of econometrics, such estimation problems are generally framed as dynamic discrete choice (DDC) problems. The first practical estimation algorithm for DDC models, the nested fixed point algorithm, was proposed by Rust [33] in the 1980s to describe the engine replacement behavior of a bus company. This estimation algorithm refers to a two-stage optimization process: first, find the optimal value function $V$ corresponding to the utility function determined by $\theta$; then, do local searches for a better $\theta$ according to the estimates $V$. The huge computational burden makes this algorithm intractable for more general use, and since then more computationally efficient algorithms, such as Hotz-Miller’s conditional choice probability method [20], have been suggested. Interested readers are referred to Aguirregabiria and Mira [3] and Heckman and Navarro [16] for more details on these recent developments.

The estimation algorithms of dynamic discrete choice models share many similarities with the reward learning methods for inverse reinforcement learning (IRL) problems. Compared with dynamic discrete choice models, these methods assume less structural knowledge of the utility (reward) function and allow for more freedom in the choice of the functional form. For instance, we can use deep neural networks to capture the complex relationship between the state and the reward.
However, in this case, the estimation problem is generally ill-posed [31, 40] and one needs to add other regularization or penalty terms to obtain meaningful reward functions. For example, Ziebart et al. [40] construct an entropy-regularized maximum likelihood estimator for IRL problems; Abbeel and Ng [2] consider an optimization problem to find the maximum margin hyperplane that separates the expert demonstrations from other non-optimal policies.

Reward learning methods have been the major research focus of IRL problems over the last decade; however, recently, there has been more interest in methods that train a policy from demonstrations directly. In particular, several papers [11, 12, 17] suggest that the optimal policy corresponding to the recovered reward function from a reward learning method can be viewed as the policy learned from behavioral cloning (supervised learning) of the observed behavior under the regularization condition uniquely determined by the same method. This interpretation is then used to develop several generative-adversarial-network (GAN) based IRL methods for simultaneous policy learning and reward learning, and these methods achieve better learning performance compared with state-of-the-art baselines.

In this paper, we avoid the aforementioned difficulties in utility estimation by direct approximations. Specifically, we assume that the utility is additively separable: the total utility from each trip can be decomposed as the sum of the utility from coupon redemption, which is in the monetary unit and needs no estimation, and the unknown utility from the trip itself. Then, we show by mathematical derivation that the unknown utility from the trip can be safely ignored under certain regularity conditions.
Under these assumptions, we can approximate the optimal value function of the sequential decision problem with a value determined only by the values of immediate or future coupon redemption, and the computation can be done once before model estimation.

One novelty of our work is that we explicitly model the impact of unawareness in travelers’ decision dynamics. In this subsection, we review the related literature on discrete choice problems under limited attention.

Discrete choice problems under limited attention can generally be described by a three-stage decision process [35]. First, an awareness subset is drawn from the whole choice set by chance. Then the decision maker (DM) deliberately limits her attention to a consideration subset of the awareness set. Finally, the DM makes a choice within the consideration set. Most works in the literature focus only on the second stage, i.e., the generation of a consideration set, possibly limited by available data or by problems with identification. Recently, theoretical works, including that of Masatlioglu et al. [30], introduced frameworks that view the awareness set and the consideration set as a single object. However, as illustrated later in this paper, these two terms should not be used interchangeably in a dynamic decision model because they are generated according to different mechanisms. In particular, the generation of an awareness set is an action taken by nature and needs to be explicitly modeled within the dynamical model, whereas the generation of a consideration set is an action taken by the DM and is implicitly modeled in a class of decision strategies (policies). In this study, we specifically focus on the modeling of the former.

The modeling of discrete choice problems under limited attention started in the 1970s [27]. At that time, researchers were interested in theoretically attractive extensions of classic discrete choice models, e.g., the multinomial logit (MNL) model.
As an initial work, Manski [27] introduced a random set model, in which each choice is independently considered for attention. The computational complexity of this model scales exponentially with the choice set size because the consideration set can be any subset of the whole choice set. This model is referred to as the “Manski model” in the discussion below. Later, Swait and Ben-Akiva [37] developed the parameterized logit captivity (PLC) model, in which the consideration set can either be the whole choice set or contain a single choice option. Empirical works [6, 37] showed that these models had better explanatory power than the pure MNL model.

Over the last two decades, customers have been able to browse and purchase an increasing number of products either online or via mobile apps, thanks to the development of information technology. With very large choice sets, consumers exhibit decision patterns that deviate considerably from rationality but can possibly be explained by limited attention. Consequently, there has been rising interest in understanding and exploiting such behavior, and we have witnessed a burgeoning number of empirical works on consideration sets. For example, Chiang et al. [10] estimated a random-parameter extension of the Manski model using data on households’ choices among four ketchup brands. Goeree [14] estimated the Manski model using data on customer choices among personal computer products in the US. Honka [18] applied the concept of search costs to develop an attention model and estimated it using a dataset of customer choices among automobile insurance products in the US. Honka et al. [19] further extended the above model by including the Manski model for the awareness set generation.
The authors estimated the resulting model using data on customer choices among bank accounts in the US. All of these works claimed that the inclusion of the set consideration stage leads to better specifications and estimation results.

At the same time, other scholars kept progressing in the theoretical development of the limited attention mechanism. Manzini and Mariotti [28], Masatlioglu et al. [30], and Abaluck and Adams [1] focused on the axiomatic formulation of the consideration set, aiming at extending the current preference theory. Sims [36], Kim et al. [23], and Gabaix [13] considered information costs in the DM’s search for choice options and developed models of rational inattention. Masatlioglu and Nakajima [29] and Seiler [34] further extended these optimization models of option searching into dynamic decision processes. For a comprehensive summary of recent theoretical developments, interested readers are referred to the recent work by Masatlioglu et al. [30].

Restricted by application scenarios, the modeling of consideration sets has only been applied to traveler mode choice or location choice behavior in the transportation literature. For example, Swait and Ben-Akiva [37] estimated the PLC model using travel mode choice data in São Paulo, Brazil. Başar and Bhat [4] estimated the Manski model using data on passenger choices among four airports in the San Francisco Bay Area. Mahmoud et al. [26] estimated the PLC model with a dataset of travel mode choices in the city of Toronto.

Our inattention model is distinct from the above models in two aspects. First, we consider behavior under limited attention in a dynamic decision process. Second, we pay more attention to the generation of awareness sets than to the generation of consideration sets. Simulation results in later sections show that such modeling differences actually lead to important practical implications.
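To make the random set model of Manski [27] reviewed in this section more concrete, the following is a minimal Python sketch of how a choice probability could be computed when each option is independently considered. Several ingredients are assumptions for illustration only and are not taken from this paper: the function names, the use of an MNL rule within each consideration set, and the renormalization that excludes the empty consideration set. The double loop over subsets shows why the cost grows exponentially with the choice set size.

```python
import math
from itertools import combinations


def mnl_probability(i, subset, utilities):
    """Multinomial logit probability of option i within a consideration set."""
    denom = sum(math.exp(utilities[k]) for k in subset)
    return math.exp(utilities[i]) / denom


def random_set_choice_probability(i, utilities, attention):
    """Probability of choosing option i when each option k enters the
    consideration set independently with probability attention[k].

    Assumption (not from the paper): the empty set is excluded by
    renormalizing with the probability of a nonempty set.
    """
    options = range(len(utilities))
    p_nonempty = 1.0 - math.prod(1.0 - attention[k] for k in options)
    if p_nonempty == 0.0:
        raise ValueError("at least one option needs positive attention")
    total = 0.0
    # Enumerate every nonempty subset: 2**n - 1 candidates in total.
    for size in range(1, len(utilities) + 1):
        for subset in combinations(options, size):
            if i not in subset:
                continue
            p_set = math.prod(
                attention[k] if k in subset else 1.0 - attention[k]
                for k in options
            )
            total += p_set * mnl_probability(i, subset, utilities)
    return total / p_nonempty
```

With all attention probabilities equal to one, the sketch reduces to the plain MNL model, which is a convenient sanity check.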
In this section, we formulate a dynamical model for trip sequences to derive the utility-maximizing coupon redemption policy and the corresponding evaluation of available coupons. Without loss of generality, in the following discussions we always assume that utilities are in the monetary unit. The model formulation is decomposed into two parts. First, we specify the temporal correlation between consecutive trips. In particular, we model the trip generation process as a discrete time dynamical system with the available coupon set being the only state variable that keeps changing across trips. Second, we construct a structural model for individual trips. Specifically, we view an individual trip as a combination of two stages: the travel mode selection stage and the coupon selection stage. The overall structure of the dynamical model is described in Figure 1. One should notice that not all elements in this figure are observable from our perspective; for instance, a trip will be observed only if the target mobility service is chosen.

[Diagram labels: time steps n, n+1, n+2, n+3; trips 1, 2, 3 linked by state transitions; estimated trip details X and travel mode selection at the start of the trip; if the traveler chooses the target service, realized trip details X' and coupon selection at the end of the trip.]
Figure 1: Structure of the dynamical model for trip sequences

Next, we summarize several general assumptions about the dynamical model that were mentioned explicitly earlier in this paper.
Assumption 1. (a) Each coupon is represented by its face value and expiration date and can be used freely before expiration. However, for each payment at most one coupon can be redeemed, and the final trip fare must be nonnegative (that is, the reduction in fare cannot exceed the fare itself).
(b) The traveler only considers trip events for the future; therefore, the traveler has no expectation on future coupon arrivals.
Later in Section 4, we will show that both assumptions are reasonable for our dataset.
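The redemption rule in Assumption 1(a) can be illustrated with a minimal Python sketch. The function names `redeemed_amount` and `final_fare` are hypothetical and introduced only for this example; the sketch simply caps the fare reduction at the fare itself so that the final fare stays nonnegative.

```python
def redeemed_amount(fare: float, face_value: float) -> float:
    """Fare reduction from redeeming one coupon on one payment.

    Under Assumption 1(a) the final fare must stay nonnegative, so the
    reduction is capped by the fare itself.
    """
    if fare < 0 or face_value < 0:
        raise ValueError("fare and face_value must be nonnegative")
    return min(face_value, fare)


def final_fare(fare: float, face_value: float = 0.0) -> float:
    """Fare paid after redeeming at most one coupon.

    A face_value of 0 stands for redeeming no coupon (hypothetical default).
    """
    return fare - redeemed_amount(fare, face_value)
```

For instance, a 5-dollar coupon applied to a 3-dollar fare reduces the payment by 3 dollars only, which is why a traveler may prefer to keep that coupon for a more expensive trip.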
Before proceeding to the model formulation, we first discuss the notation for coupons. A coupon $\tilde{c}$ consists of a face value $v$ and the remaining time to expiration $T$, whereas an available coupon set includes many coupons. However, directly modeling the coupon set $C$ as a set of coupons $\{\tilde{c}_1, \cdots, \tilde{c}_m\}$ does not always work here, because a mathematical “set” requires every element to be distinct, but a traveler usually has several coupons with the same $v$ and $T$.

To overcome this difficulty, we model coupons in groups: $c = \langle v, T, n \rangle \in \mathbb{R} \times \mathbb{N} \times \mathbb{N}_+$, where $n$ is the number of coupons in the group. A coupon set $C$ can then be defined as a finite set of coupon groups $C = \{c_0, \cdots, c_m\} \subset \mathbb{R} \times \mathbb{N} \times \mathbb{N}_+$, subject to the restriction that any two coupon groups $c_i, c_j$ in $C$ cannot have the same characteristics $(v, T)$. Moreover, the coupon set $C$ should always include the option of selecting no coupon; here, this default option is represented by a coupon group $c_0$ of zero-valued coupons. We further define $C_0$ as the default set $\{c_0\}$. We use $\mathcal{C}$ to represent the set of all possible coupon sets.

Now, a natural subset $C^a$ of the coupon set $C$ is not necessarily a mathematical subset of $C$. However, given $C = \{c_0, \cdots, c_m\}$ and $c_i = \langle v_i, T_i, n_i \rangle$, we can characterize $C^a$ as follows:

$$C^a = \{c^a_i \mid i \in I^a\} \in \mathcal{C},$$

where the index set $I^a \subset \{0, 1, \cdots, m\}$ and the coupon group $c^a_i = \langle v_i, T_i, n^a_i \rangle$ with $0 < n^a_i \le n_i$. We use $A(C)$ to denote the set of all possible subsets $C^a$ of $C$.

A simple way to describe a trip sequence is to discretize time into steps according to a unit $t$ and allocate each trip to a time step. For example, we can generate the trip for traveler $j$ in a time step according to a Bernoulli distribution $B(1, \lambda_j)$ (the probability of having a trip demand in each time step is $\lambda_j$), assuming that trips from different time steps are generated independently.
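As a rough illustration of the notation above, the following Python sketch represents coupon groups $\langle v, T, n \rangle$, checks the natural-subset condition $0 < n^a_i \le n_i$, and draws independent Bernoulli trip indicators. All class and function names are hypothetical, the dictionary keyed by $(v, T)$ is just one possible encoding of a coupon set, and the default no-coupon group is left out for brevity.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CouponGroup:
    v: float  # face value
    T: int    # remaining time steps before expiration
    n: int    # number of identical coupons in the group


def group_coupons(coupons):
    """Collapse individual (v, T) coupons into groups keyed by (v, T)."""
    counts = {}
    for v, T in coupons:
        counts[(v, T)] = counts.get((v, T), 0) + 1
    return {key: CouponGroup(key[0], key[1], n) for key, n in counts.items()}


def is_natural_subset(sub, full):
    """Check that every group in `sub` exists in `full` with 0 < n_a <= n."""
    return all(
        key in full and 0 < group.n <= full[key].n
        for key, group in sub.items()
    )


def simulate_trip_demand(lam, steps, seed=0):
    """Draw one independent Bernoulli(lam) trip indicator per time step."""
    rng = random.Random(seed)
    return [rng.random() < lam for _ in range(steps)]
```

Keying the groups by $(v, T)$ enforces the restriction that no two groups share the same characteristics, which is the reason for grouping in the first place.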
However, the selection of an appropriate time unit $t$ is not trivial in general: on one hand, when $t$ is large, e.g., $t = 1$ week, it is unlikely that a traveler has no more than one trip within a time step; on the other hand, when $t$ is small, e.g., $t = 1$ minute, a trip can last for many time steps and the temporal correlation between consecutive time steps can be strong. For instance, a traveler who just finished a trip 10 minutes ago is less likely to submit a trip request now. Also, the computational complexity of modeling trip sequences within a specific time range is inversely proportional to $t$ and can lead to practical problems when $t$ is very small.

A more natural model of trip sequences is to consider a continuous time setting and use stochastic processes for trip generation. For example, Poisson processes generalize the independent Bernoulli sampling process in the discrete time setting. However, this model also suffers from strong temporal correlations among trips and huge computational complexity. In fact, this model is closely related to discrete time models with very small $t$, as shown in Appendix A.

Because the ultimate purpose in this section is to derive practical estimates of the future redemption value of coupons, here we choose to use a larger time unit $t$ to reduce the modeling and computational complexities. In particular, we want to select a large enough $t$ such that the following assumption holds:

Assumption 2.
For any mode $i$, the realized trip time $t'_{xi}$ is always upper bounded by the time unit $t$:

$$t'_{xi} \le t. \tag{1}$$

This assumption says that a trip cannot last more than a time step. With this assumption, we avoid the difficulty in modeling heterogeneous state transitions across time. For urban mobility services, a $t$ greater than three hours is generally adequate for Assumption 2 to hold. In subsequent discussion, we always assume $t = 1$ day and write $t = 1$ for short.

One cost of selecting a large $t$ is that we will underestimate the total number of trips and, therefore, the value of future coupon redemption. Luckily, as will be shown later in Section 4, most of the travelers in our dataset are low-frequency users of the car-sharing service; therefore, the underestimation of future coupon redemption values is not a severe problem.

Next we discuss the trip generation process in each time step. First, define $\lambda_{j,t}$ as the probability of traveler $j$ having a trip demand in time step $t$. In general, $\lambda_{j,t}$ depends on all the past information $(\lambda_{j,t-1}, \cdots, \lambda_{j,1}, S_{j,t-1}, \cdots, S_{j,1})$, in which $S_{j,t}$ denotes the state variables for $j$ at time $t$. However, including this past information in the trip generation process leads to several practical issues: trips served by alternative modes are not observed, so the past information for $\lambda_{j,t}$ is always incomplete; even if we had complete information, the dependency can be highly nonlinear, which leads to a difficult estimation procedure and unstable predictions into the future. Therefore, we make the following assumption for simplification:

Assumption 3.
Trip generation rate $\lambda_{j,t}$ does not rely on any past trip of traveler $j$ and is fixed as $\lambda_j$.

In practice, we can let $\lambda_j$ be traveler $j$’s average trip generation rate. Intuitively, the estimation of coupon redemption value from this approximation is reasonable when the real $\lambda_{j,t}$ does not differ much from $\lambda_j$ and the time range $T$ in consideration is large.

In summary, in this study we consider a discrete time setting for trip generation, in which the time unit $t$ is 1 day and the probability of having a trip demand in each time step is a traveler-specific constant $\lambda_j$.

Finally, we discuss the state transition of the available coupon set $C$ between consecutive trips. $C$ is updated in two steps: first, if there is a coupon redemption $c \ne c_0$, the used coupon $\tilde{c}$ is removed from the set $C$; second, every remaining coupon $\tilde{c}'$ in the set $C$ becomes one time unit closer to expiration. This updating procedure is described by the following state transition function $f$ and coupon group transition function $f_c$:

$$
\begin{aligned}
f(C, c) &= \{f_c(c') \mid c' \in C \setminus \{c\}\} \cup \{f_c(\langle v, T, n-1 \rangle)\}, \\
f_c(\langle v, T, n \rangle) &= \begin{cases} \langle v, T-1, n \rangle & v, n > 0,\ T \ge 1, \\ c_0 & \text{otherwise,} \end{cases}
\end{aligned} \tag{2}
$$

where we assume that $c = \langle v, T, n \rangle$. Notice that in general the update of the time to expiration $T$ depends on $t_x$; here the homogeneity is ensured by Assumption 2.

When there is no coupon selection, the state transition is described by $f(C, c_0)$. For simplicity, we use $f(C)$ to represent this default state transition $f(C, c_0)$.

In the following formulation, we limit our discussion to a specific traveler $j$.

First, we simplify the interaction between the traveler and the target mobility service in an individual trip as follows: at the beginning of a trip, the traveler decides which travel mode to use and whether to cancel the trip. If the traveler selects a mode, she sticks with her choice until she arrives at the destination.
Upon arrival, the traveler proceeds to payment and, if she selected the target mobility service, she can decide which coupon to use. The trip ends after the payment. In other words, we decompose an individual trip into two stages: travel mode selection and coupon selection.

In the stage of travel mode selection, the details of the trip demand $X$, including the trip distance and the current traffic situation, are revealed to the traveler. Without loss of generality, we use a traveler-specific distribution $P_j(\cdot)$ to describe the generation of $X$. Moreover, following the same logic as the discussion on $\lambda_j$ in the last subsection, we make the following assumption:

Assumption 4.
Distribution $P_j(\cdot)$ does not rely on any past trip of traveler $j$.

Given $X$, the traveler selects a travel mode or cancels the trip according to a traveler-specific policy $\pi^x_j$ and mode-specific information such as the available coupons for different mobility services. Without loss of generality, let us assume that the potential travel modes are indexed as $0, 1, \cdots, n_j$, where 0 corresponds to trip cancellation and 1 corresponds to the target mobility service. The probability of selecting mode $i$ can then be described as

$$P(i) = \pi^x_j(i \mid X, I_0, \cdots, I_{n_j}), \tag{3}$$

where $I_i$ captures all private information about mode $i$ of traveler $j$.

Because the specific selection probabilities $P(0), P(2), \cdots, P(n_j)$ are not available to us, our discussion on the travel mode selection is restricted to the event $\{i = 1\}$ (whether the target mobility service is selected). Moreover, because we do not have the information $I_0, I_2, \cdots, I_{n_j}$, we need the following assumption for subsequent developments:

Assumption 5.
Information on the alternatives $I_0, I_2, \cdots, I_{n_j}$ is invariant among trips for any traveler $j$.

We also make an assumption to simplify the form of $I_1$:

Assumption 6.
Information on the target mobility service $I_1$ can be fully described by the available coupon set $C$; other factors, such as the service quality, are assumed to be invariant among trips for any traveler $j$.

Later in Section 4, we will show that both Assumptions 5 and 6 can be partly justified for our dataset.

Next, we make an assumption to simplify the form of the traveler’s decision policy $\pi^x_j$:

Assumption 7. In selecting a travel mode, the traveler evaluates the utility $u_{ij}(X, I_i) \in \mathbb{R}$ for each mode $i$ and her decision depends exclusively on these utilities; that is, the policy $\pi^x_j$ has the following form:

$$\pi^x_j(i \mid X, I_0, \cdots, I_{n_j}) = \tilde{\pi}^x_j(i \mid u_{0j}(X, I_0), \cdots, u_{n_j j}(X, I_{n_j})). \tag{4}$$

The above three assumptions immediately lead us to the form

$$\pi^x_j(i \mid X, C) = \tilde{\pi}^x_j(i \mid u_{0j}(X), u_{1j}(X, C), \cdots, u_{n_j j}(X)); \tag{5}$$

to simplify notation, in the following discussion we use $u^x_{ij}$ to denote $u_{ij}(X)$ and $u^x_j$ to denote $\{u^x_{0j}, \cdots, u^x_{n_j j}\}$.

The next assumption is crucial for the computational tractability of our subsequent analysis.

Assumption 8.
The utility from taking the target mobility service $u_{1j}(X, C)$ is the sum of the utility from the service itself $u_{x1j}$ and the utility from potential coupon redemption $u(p_x, C)$, where $p_x \in \mathbb{R}_+$ is the estimated trip fare with the target mobility service.

In summary, with the above simplifications, the travel mode selection policy now reduces to
$$P(i = 1) = \pi_{xj}(p_x, u_{xj}, C). \tag{6}$$

Next, we consider the stage of coupon selection. Assumption 1(a) restricts the action space of the traveler, and the traveler's decision can now be interpreted as a probability distribution over the set of available coupons. Without loss of generality, the coupon selection probability can be expressed as $P(\tilde c) = \pi_{\tilde c}(\tilde c \mid X', C)$, where $\pi_{\tilde c}$ is the coupon selection policy and $X'$ captures the realized trip details. In subsequent discussion we use the distribution $P'_j$ to describe the generation of $X'$ from $X$: $X' \sim P'_j(\cdot \mid X)$. Because the traveler cannot distinguish among coupons in the same group $c$, we make the following assumption to simplify the form of $\pi_{\tilde c}$:

Assumption 9.
The traveler makes the selection in two steps. First, she selects one coupon group $c$ from her available coupon set $C$ according to policy $\pi_c$: $P(c) = \pi_c(c \mid X', C)$. Then, she chooses a coupon $\tilde c$ from this group $c$.

Next we simplify the form of $X'$ in $\pi_c$. To make an optimal decision, the traveler needs to evaluate both the utility from immediate redemption and the utility from future redemption. On one hand, by definition the value $r$ of redeeming a coupon in the group $\langle v, T, n \rangle$ given the realized trip fare $p'_x$ is $r(p'_x, \langle v, T, n \rangle) = \min(v, p'_x)$. On the other hand, because of Assumptions 3 and 4 the utility of future redemption does not depend on the details $X'$ of the current trip. Therefore, we can make the following assumption:

Assumption 10.
In the coupon selection stage the traveler only considers the realized trip fare $p'_x$ and the coupon set $C$ for her decision; that is, $\pi_c(c \mid X', C) = \pi_c(c \mid p'_x, C)$.

In this subsection, we derive the optimal policy and value function of the above dynamical model and develop practical approximations to characterize travelers' coupon redemption behavior.

Suppose that the mode choice policy $\pi_{xj}$ and the coupon redemption policy $\pi_c$ are given. Since our model includes several stages, to make subsequent discussions clearer, we introduce the following definitions: $U^\pi(C)$ is the expected utility gain from the target mobility service and coupon set $C$ at the beginning of the time step and is called the "ex ante utility"; $U^\pi_{xj}(p_x, u_{xj}, C)$ is the expected utility gain after the trip demand details are revealed and is called the "interim utility"; $U^\pi_c(p'_x, C)$ is the expected utility gain at the end of the trip realization stage and is called the "ex post utility".

First, depending on whether there is a trip in the time step, the realized ex ante utility $U^\pi(C)$ equals either the ex ante utility at the next time step $U^\pi(f(C))$, or the expected interim utility $E_{X|P_j} U^\pi_{xj}(p_x, u_{xj}, C)$:
$$U^\pi(C) = (1 - \lambda_j)\gamma U^\pi(f(C)) + \lambda_j E_{X|P_j} U^\pi_{xj}(p_x, u_{xj}, C), \tag{7}$$
where $\gamma$ is the time discount factor for a time step.

Next, depending on whether the target mobility service is selected, the realized interim utility $U^\pi_{xj}(p_x, u_{xj}, C)$ equals either the sum of the utility from alternatives and the ex ante utility at the next time step $U^\pi(f(C))$, or the sum of the utility from the target mobility service $u_{x1j}$ and the expected ex post utility $E_{X'|X,P'_j} U^\pi_c(p'_x, C)$:
$$\begin{aligned}
U^\pi_{xj}(p_x, u_{xj}, C) &= \sum_{i \neq 1} P(i)\,[u_{xij} + \gamma U^\pi(f(C))] + P(i = 1)\,[u_{x1j} + E_{X'|X,P'_j} U^\pi_c(p'_x, C)] \\
&= (1 - \pi_{xj}(p_x, u_{xj}, C))\,[u_{x\tilde 1 j} + \gamma U^\pi(f(C))] + \pi_{xj}(p_x, u_{xj}, C)\,[u_{x1j} + E_{X'|X,P'_j} U^\pi_c(p'_x, C)],
\end{aligned} \tag{8}$$
where the value $u_{x\tilde 1 j}$ is equal to $\frac{1}{1 - P(1)} \sum_{i \neq 1} P(i)\, u_{xij}$ and can be interpreted as the expected utility from taking alternative mobility services. Since we do not have any specific information on each alternative $i$, we assume that the value of $u_{x\tilde 1 j}$ is independent of the selection probability of the target mobility service $P(1)$. Now, if we define the utility gain from taking the target mobility service as $u_{xj} = u_{x1j} - u_{x\tilde 1 j}$, we have
$$U^\pi_{xj}(p_x, u_{xj}, C) = (1 - \pi_{xj}(p_x, u_{xj}, C))\,\gamma U^\pi(f(C)) + \pi_{xj}(p_x, u_{xj}, C)\,[u_{xj} + E_{X'|X,P'_j} U^\pi_c(p'_x, C)] + u_{x\tilde 1 j}. \tag{9}$$

Finally, the ex post utility $U^\pi_c(p'_x, C)$ depends on the ex ante utility at the next time step, which in turn depends on which coupon the traveler selects to redeem:
$$U^\pi_c(p'_x, C) = \sum_{c \in C} \pi_c(c \mid p'_x, C)\,[r(p'_x, c) + \gamma U^\pi(f(C, c))]. \tag{10}$$
For notational simplicity, in the following discussion we use $E_X$ to replace $E_{X|P_j}$ and $E_{X'}$ to replace $E_{X'|X,P'_j}$.

We next show that the above formulation leads to a technical problem and cannot be applied directly. In fact, we can derive the following corollary by simply replacing the coupon set $C$ in equations (7), (9) and (10) with the default set $C_0$:

Corollary 1.
We have
$$U^\pi(C_0) = \frac{1}{1 - \gamma}\, \lambda_j E_X[u_{x\tilde 1 j} + \pi_{xj}(p_x, u_{xj}, C_0)\, u_{xj}]. \tag{11}$$

The detailed proof is provided in Appendix B. This corollary says that when the time discount factor $\gamma$ is close to 1, the ex ante utility $U^\pi(C_0)$ depends critically on $\gamma$ and can be very large. Such values of $\gamma$ are common in real-world contexts: for example, if we let the yearly discount factor be 0.9, the discount factor for a day is then 0.9997. Because the utility gain contributed by coupon redemption is bounded by finite numbers, e.g., the sum of values of all coupons in the set, the estimation of coupon impacts, or the difference among $U^\pi(C)$ with different $C$, can be numerically unstable.

To achieve regularity in formulations and to reduce numerical instability, we simply subtract the value of $U^\pi(C_0)$ from every $U^\pi(C)$, as illustrated in the following proposition:

Proposition 1.
If we define $V^\pi(C)$ as $U^\pi(C) - U^\pi(C_0)$ and $V^\pi_c(p'_x, C)$ as $U^\pi_c(p'_x, C) - U^\pi_c(p'_x, C_0)$, we have
$$\begin{aligned}
V^\pi(C) &= \gamma V^\pi(f(C)) + \lambda_j E_X\{\pi_{xj}(p_x, u_{xj}, C)\,[u_{xj} + E_{X'|X} V^\pi_c(p'_x, C) - \gamma V^\pi(f(C))] - \pi_{xj}(p_x, u_{xj}, C_0)\, u_{xj}\}, \\
V^\pi_c(p'_x, C) &= \sum_{c \in C} \pi_c(c \mid p'_x, C)\,[r(p'_x, c) + \gamma V^\pi(f(C, c))], \\
V^\pi(C_0) &= 0.
\end{aligned} \tag{12}$$

The proof of Proposition 1 can be found in Appendix B. The new variable $V^\pi(C)$ can be interpreted as the net utility gain from the coupon set. In the following discussion, we call $V^\pi(C)$ the "ex ante value function" and $V^\pi_c(p'_x, C)$ the "ex post value function". As Proposition 1 already resolves the regularity problem, in the following exposition we always assume $\gamma = 1$.

From Equations (7), (9), (10), and (12) it is not hard to derive the form of the optimal policies $\pi^*_c$ and $\pi^*_{xj}$
$$\begin{aligned}
\pi^*_c(c \mid p'_x, C) &= I\big(c = \arg\max_{c \in C} \{r(p'_x, c) + V^*(f(C, c))\}\big), \\
\pi^*_{xj}(p_x, u_{xj}, C) &= I\big(u_{xj} + E_{X'|X} V^*_c(p'_x, C) - V^*(f(C)) \geq 0\big),
\end{aligned} \tag{13}$$
and the corresponding optimal value functions $V^*$ and $V^*_c$
$$\begin{aligned}
V^*_c(p'_x, C) &= \max_{c \in C} \{r(p'_x, c) + V^*(f(C, c))\}, \\
V^*(C) &= V^*(f(C)) + \lambda_j E_X\big[\max\big(u_{xj} + E_{X'|X} V^*_c(p'_x, C) - V^*(f(C)),\, 0\big) - \max(u_{xj}, 0)\big], \\
V^*(C_0) &= 0,
\end{aligned} \tag{14}$$
where $I(\cdot)$ is the indicator function: $I(X) = 1$ if and only if the statement $X$ is true.

One more technical problem remains. To compute $V^*$ exactly, we need to know the parameter $\lambda_j$ and the distributions $P_j$, $P'_j$ accurately. This is not possible for observers like us; fortunately, we can derive lower and upper bounds of $V^*$ as follows.

Proposition 2.
Consider
$$\begin{aligned}
V^L(C) &= V^L(f(C)) + E_{x', p'_x | C_0}\, I(x' = 1)\,[V^L_c(p'_x, C) - V^L(f(C))], \\
V^L_c(p'_x, C) &= \max_{c \in C}\,[r(p'_x, c) + V^L(f(C, c))]; \\
V^U(C) &= V^U(f(C)) + E_{x', p'_x | C}\, I(x' = 1)\,[V^U_c(p'_x, C) - V^U(f(C))], \\
V^U_c(p'_x, C) &= \max_{c \in C}\,[r(p'_x, c) + V^U(f(C, c))],
\end{aligned} \tag{15}$$
with $V^L(C_0) = V^U(C_0) = 0$ and $x'$ being the binary indicator of whether there is a trip served by the target mobility service within the time step. We have $V^L(C) \leq V^*(C) \leq V^U(C)$ for all $C \in \mathcal{C}$.

The proof of Proposition 2 can be found in Appendix B. One important property of these approximations is that $\lambda_j$, $P_j$, and $P'_j$ do not show up explicitly in Equation (15). Therefore, we can approximate $V^L$ and $V^U$ by estimating the joint distribution of $x'$ and $p'_x$ from data. This leads to the following approximation of $V^*$:
$$\begin{aligned}
\hat V(C) &= \hat V(f(C)) + \hat\lambda_j E_{p'_x | \hat P_j}\,[\hat V_c(p'_x, C) - \hat V(f(C))], \\
\hat V_c(p'_x, C) &= \max_{c \in C}\,[r(p'_x, c) + \hat V(f(C, c))], \\
\hat V(C_0) &= 0,
\end{aligned} \tag{16}$$
where $\hat\lambda_j$ is the estimated service selection probability $P(x' = 1)$ and $\hat P_j$ is the estimated marginal distribution of fare $p'_x$.

We point out that the formulation (16) is the most natural way to estimate the long-term value of coupons given only information on $\hat\lambda_j$ and $\hat P_j$. Therefore, the arduous deductions above mostly provide a sanity check that the straightforward modeling approach (16) is indeed valid under certain regularity conditions.

Next, we use a simple example to illustrate how we can practically compute $\hat V$ with equation (16). This method can be generalized to the computation of value functions in equations (14) and (15).

Example 1. Consider a setting in which the service selection rate is $\hat\lambda_j = 0.05$ and the fare distribution $\hat P_j$ can be described as $\log(p'_x) \sim N(3.\ ,\ .\ )$. Assume now the traveler has two available coupons $C = \{\langle\ ,\ ,\ \rangle, \langle\ ,\ ,\ \rangle\}$.

First, we enumerate all coupon sets that are possible in the future. Under the current simplified setting, the complete list can be given explicitly: $C_0$, $\{c_0, \langle\ ,\ ,\ \rangle\}$, $\{c_0, \langle\ ,\ ,\ \rangle\}$, $\{c_0, \langle\ ,\ ,\ \rangle\}$, $\{c_0, \langle\ ,\ ,\ \rangle, \langle\ ,\ ,\ \rangle\}$.

Next, we compute $\hat V(C')$ for each possible $C'$. We begin with $C' = \{c_0, \langle\ ,\ ,\ \rangle\}$; since $f(C') = C_0$, with equation (16) we have
$$\begin{aligned}
\hat V_c(p'_x, C') &= \max_{c \in C'} r(p'_x, c) = \min(p'_x,\ ), \\
\hat V(C') &= \hat\lambda_j E_{p'_x | \hat P_j} \hat V_c(p'_x, C') = 0.05\, E_{p'_x \sim N(3.\ ,\ .\ )} \min(p'_x,\ ),
\end{aligned} \tag{17}$$
where the expectation in the second equation can then be computed numerically by sampling methods. Usually, direct sampling with 1,000 samples is enough for good accuracy, but one can use more advanced methods such as importance sampling to improve computational efficiency. Similarly, we can compute $\hat V(C')$ for $C' = \{c_0, \langle\ ,\ ,\ \rangle\}$ directly.

Then, we illustrate how to compute $\hat V(C')$ for $C' = \{c_0, \langle\ ,\ ,\ \rangle\}$. We first notice that the value $\hat V(f(C')) = \hat V(\{c_0, \langle\ ,\ ,\ \rangle\})$ is already available according to the above computations. We now apply equation (16) again and have
$$\begin{aligned}
\hat V_c(p'_x, C') &= \max(\min(p'_x,\ ),\, \hat V(f(C'))), \\
\hat V(C') &= \hat V(f(C')) + \hat\lambda_j E_{p'_x | \hat P_j}\,[\hat V_c(p'_x, C') - \hat V(f(C'))] \\
&= \hat V(f(C')) + 0.05\, E_{p'_x \sim N(3.\ ,\ .\ )} \max(\min(p'_x,\ ) - \hat V(f(C')),\, 0),
\end{aligned} \tag{18}$$
which can then be evaluated by sampling methods.

Finally, one can notice that by using the same approach, we can compute the value of $V(C)$ by sampling methods when we know the value $V(f(C, c))$ for every $c \in C$. Since the transition $f$ is unidirectional in time, this method is always feasible by backward induction. $\square$

The next example shows that when coupons have small promotional effects on the service selection rate $P(x' = 1)$ (e.g., the utility gain from coupons $E_{X'|X} V^*_c(p'_x, C) - V^*(f(C))$ is much smaller than the utility gain from trips $u_{xj}$), the bounds $V^L$ and $V^U$ are close to each other and therefore $\hat V$ is a fairly accurate approximation.

Example 2.
Consider a setting in which the default service selection rate $P(x' = 1 \mid C_0) = 0.\ $, $\frac{\partial}{\partial V} P(x' = 1) = 0.\ $, and $\log(p'_x) \sim N(3.\ ,\ .\ )$. Suppose now the traveler has a set of coupons $C = \{c_0, \langle\ ,\ ,\ \rangle, \langle\ ,\ ,\ \rangle, \langle\ ,\ ,\ \rangle, \langle\ ,\ ,\ \rangle\}$.

Figure 2 shows the estimated value functions: subfigure (a) shows the traveler's value functions $V^U(f^{(T)}(C))$ and $V^L(f^{(T)}(C))$ with respect to time $T$ and the coupon set $C$; subfigure (b) shows the differences in value functions $\Delta V = V(f^{(T)}(C)) - V(f^{(T-1)}(f(C, c)))$ with respect to the coupon group $c = \langle\ ,\ ,\ \rangle$ for both $V^L$ and $V^U$. From both subfigures, we can see that the gap between the upper bound and the lower bound is small. $\square$

[Figure 2: Value function approximations. Panels: (a) $V$, (b) $\Delta V$; curves: $V^U$, $V^L$; x-axis: $T$; y-axis: value.]

Before ending this subsection we point out another implication of equation (15) on the error structure in estimations of the traveler's coupon selection behavior. If we assume rationality from the traveler, the coupon choice probability $P(c \mid p'_x, C)$, from the perspective of the observer, is
$$P(c \mid p'_x, C) = E_{V|D}\, I\big(c = \arg\max_{c \in C} \{r(p'_x, c) + V(f(C, c))\}\big), \tag{19}$$
where $V$ is an estimate of the optimal value function $V^*$ given data $D$. In specifying the error structure of $V$, people usually assume additive separability $V = \hat V + \varepsilon_V$, and that $\varepsilon_V$ is normally distributed (probit model) or follows a Gumbel (type-I extreme value) distribution (logit model). The major reason for such choices is to obtain tractable analytic forms of the choice probability $P(c \mid p'_x, C)$. However, because in our model the value $V$ has both an upper bound $V^U$ and a lower bound $V^L$, our specification of $\varepsilon_V$ should have bounded support. Specifications with probit or logit models can lead to erroneous results.

Our data comes from a car-sharing service in Shanghai, China and was collected during the period from January 2017 to July 2017. The dataset includes activities from 0.16 million registered travelers and contains more than 1.5 million trip records. Figure 3 shows that both the daily order volume and the number of registered travelers increased steadily during the period, but the daily order volume per traveler exhibited significant changes in its pattern on April 1st and June 10th. Such changes are due to external events including the announcement of a new pricing scheme and summer holidays. Since these external events potentially result in changes of traveler behavior, to avoid estimation bias we focus on the stable period from April 2017 to June 2017 in the discussion below. We notice that the stability of operation during the concerned period indicates that there is no significant change in the operations of alternative travel modes and the traveler's preference for the target mobility service, and partly justifies our previous Assumptions 5 and 6.

[Figure 3: Daily order volume and number of registered travelers. Series: log(users/10), log(orders); x-axis: date, Mar 1st to Jul 1st.]
[Figure 4: Distribution of major statistics. Panels: (a) Trip quantities of users (x-axis: trip quantity in 180 days), (b) Trip fare (x-axis: trip fare); y-axis: cumulative probability; curves: observations, log normal, gamma, exponential.]

From part (a) of Figure 4 we can see that most of the registered travelers are infrequent users of the car-sharing service: more than 80% of them use the service less than once a week. Therefore, for most travelers, the temporal correlation of trips between consecutive days is weak, and our choice of time step size (1 day) is acceptable.

Next, we briefly introduce the coupon distribution mechanism adopted by this car-sharing service during the period from April 2017 to June 2017. Recall that we only model the impact of coupons in general forms in previous sections; however, the specific coupon distribution and redemption process can influence how travelers perceive the existence of coupons and therefore shape the distribution of the observed data.

First, when the operator wants to give a traveler $j$ a coupon $\tilde c$, she adds this coupon into the traveler's e-wallet at the server side. Later, when the traveler opens or refreshes her app, the app communicates with the remote server and updates the local e-wallet record. However, this update information does not pop up automatically on the main screen.

Secondly, during the trip requesting stage, the coupon information, such as how many coupons the traveler has currently, does not show up automatically, although the traveler can check her e-wallet at any time. The traveler is able to redeem a coupon only when she finishes a trip and is in the payment stage; however, coupon redemption is not made a default option, and to redeem a coupon the traveler has to open her e-wallet from the payment panel and make an explicit selection.
All these interaction designs make the coupon redemption process less obvious for the travelers.

During the concerned period, all the coupons distributed by the service operator are of the same type: each coupon can be used freely before expiration and can reduce the trip fare by at most its face value. This is in accordance with Assumption 1. Moreover, there are only four distinct face values: 5, 10, 20, and 30. The valid period of a coupon can be as short as 3 days or as long as 90 days.

Finally, the operator does not employ any sophisticated coupon distribution strategy during the concerned period. The operation team distributes coupons manually and the pattern is highly unpredictable due to a changing budget constraint. Moreover, this coupon distribution process does not differentiate between frequent and infrequent travelers. Therefore, the arrival of coupons can be viewed as a random exogenous event and we do not consider it in our model.

The raw dataset consists of order and coupon records: each order record consists of details of a trip completed via the car-sharing service, and each coupon record includes a coupon's face value, start date, expiration date, and corresponding traveler. Table 1 summarizes the data fields used in our study.

This raw dataset is processed for the subsequent coupon selection analysis.
The processed dataset consists of three categories of features: trip details, including realized fare $p'_x$ and redeemed coupon $\tilde c$ (along with its coupon group $c$); coupon details, including available coupon set $C$ and coupon attention state $I_a$ (which will be discussed in the next subsection); and traveler-specific details, including average daily trip frequency with the car-sharing service $\hat\lambda$, and average and standard deviation of the logarithm of the trip fares $\hat\mu_p$, $\hat\sigma_p$. The latter two are used to construct the empirical marginal distribution of trip fare $\hat P_j$: $\log(p'_x) \sim N(\hat\mu_{pj}, \hat\sigma_{pj})$. We choose the log-normal distribution because it adequately matches the empirical distribution of trip fares, as shown in part (b) of Figure 4 (the fare details are removed from this figure per request of the service operator). The available coupon set $C$ for each trip order is recovered from the coupon records. More specifically, we track the life cycle of each coupon $\tilde c$ and append it to the set $C$ when it is still alive.

Table 1: Data fields
Type     Fields
Order    order ID, (encrypted) traveler ID, trip start time, trip end time, fare, used coupon ID, payment
Coupon   coupon ID, (encrypted) traveler ID, coupon face value, start time, expire time

Because our major concern in this study is the coupon selection behavior, we remove records in which the coupon set is empty ($C = C_0$). The final dataset on coupon selection contains more than 0.36 million records and approximately 0.08 million of them have only one coupon in the set $C$. As a reference, the dataset constructed from all available data contains more than 0.9 million records and 0.23 million of them have only one coupon in $C$.

In this subsection, we discuss several observations from the coupon selection data and introduce a few hypotheses for explanations.

First, consider the case in which the traveler has only one coupon. On one hand, from equation (14) we know that the optimal value $V^*$ is upper bounded by $v$. So if $p'_x \geq v$ and there are temporal discounting effects, the expected value from future redemption is strictly smaller than the value of immediate redemption, $v$, and a utility-maximizing traveler should always redeem the coupon. On the other hand, the observed coupon redemption ratio when $p'_x \geq v$ is consistently lower than 1, as shown in Figure 5. In Figure 5, we group data with different $p'_x/v$ with a granularity of 0.2 and each point on the curve represents the statistics of the data from the group on its left: for example, the point at $p'_x/v = 1$ and $v = 5$ represents statistics for all data with $0.8 \leq p'_x/v < 1$, $v = 5$, and the point at $p'_x/v = 1.2$ and $v = 10$ summarizes statistics for all data with $1 \leq p'_x/v < 1.2$, $v = 10$, etc.

[Figure 5: Ratio of coupon redemption v.s. fare-value ratio, cases with only one coupon. x-axis: $p'_x/v$; y-axis: ratio of coupon redemption; curves: $v = 5$, $v = 10$, $v = 20$, $v = 30$.]

Next, we consider the case in which there is a coupon $\langle v, T \rangle$ in the available coupon set $C$ such that the trip fare $p'_x$ satisfies $p'_x \geq v$. By the same argument as above, we expect to observe that the coupon redemption ratio equals 1 in this case. However, this is not true for the observed data, as shown in Figure 6. Figure 6 shows the relationship between the coupon quantity and the coupon redemption ratio of the observed data.
From part (a), we can see that the coupon redemption ratio is consistently below 1. Nevertheless, this gap diminishes as the coupon quantity increases. From parts (b) and (c) we further observe that traveler-specific factors such as past experiences and trip frequencies also have strong impacts on the coupon redemption ratio.

[Figure 6: Ratio of coupon redemption v.s. coupon quantity, cases in which some coupon face values $v$ exceed fare $p'_x$. Panels: (a) Overall (observations and the curve $y = 1 - 0.4/x$), (b) By experience (groups $N_e = 1$, $N_e \in [2, 3]$, $N_e \in [4, 7]$, $N_e \in [8, 15]$, $N_e \in [16, 31]$), (c) By frequency (groups $\gamma \in [e^{-5}, e^{-4})$ through $\gamma \in [e^{-1}, e^{-0})$); y-axis: ratio of coupon redemption.]

Now, we introduce several hypotheses to explain the above observations. We limit our attention to approaches in two directions: either to relax the optimality assumption or to examine and redefine the decision problem that the traveler optimizes.

We point out that the deviation from optimality in the cases with only one coupon greatly limits our choices in the first direction. In situations in which the traveler has only one coupon, the traveler is confronted with a much easier decision problem than in general cases. Models based on the computational complexity of the decision problem, such as those on bounded rationality or deliberate attention, then imply that decisions in the cases with only one coupon are more likely to follow the optimal decision $\pi^*$. However, the above observations show that the deviation is the greatest in this setting.

One possible option in this direction is to consider near-optimal stochastic policies, such as the entropy-regularized optimal policy $\pi_H$ [40]
$$\pi_{H,c}(c \mid p'_x, C) \propto \exp[r(p'_x, c) + V^{\pi_H}(f(C, c))]. \tag{20}$$
Clearly, stochastic policies such as $\pi_H$ can explain a coupon redemption ratio below 1 when we have $p'_x \geq v$.
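To make equation (20) concrete for the one-coupon case, the following minimal sketch computes the single-coupon value $\hat V(v, T)$ by the backward recursion of equation (16) and then the redemption probability of a softmax policy when the fare equals the face value. It is an illustration under stated assumptions, not the procedure used in this study: the fare parameters $N(3.0, 0.5^2)$, the 30-day horizon, the unit softmax temperature, and the use of $\hat V$ in place of $V^{\pi_H}$ are hypothetical choices; the face values 5, 10, 20, 30 and the rate 0.05 are taken from the text.

```python
import math
import random


def single_coupon_values(v, horizon, lam, fares):
    """Return [V(v, 0), ..., V(v, horizon)] for one coupon of face value v.

    Recursion (equation (16) with a single coupon and gamma = 1):
        V(v, T) = V(v, T-1) + lam * E[max(min(p, v) - V(v, T-1), 0)],
    with V(v, 0) = 0 because an expired coupon is worth nothing.
    The expectation is replaced by an average over the sampled fares.
    """
    values = [0.0]
    for _ in range(horizon):
        prev = values[-1]
        gain = sum(max(min(p, v) - prev, 0.0) for p in fares) / len(fares)
        values.append(prev + lam * gain)
    return values


def softmax_redeem_prob(v, fare, keep_value):
    """Probability of redeeming under equation (20) with one coupon.

    Redeeming yields min(v, fare) plus the value of the empty set (zero);
    keeping the coupon yields keep_value, the coupon's value one day later.
    """
    redeem_value = min(v, fare)
    return 1.0 / (1.0 + math.exp(keep_value - redeem_value))


rng = random.Random(0)
# Hypothetical fare distribution (assumption): log(p') ~ N(3.0, 0.5^2).
fares = [math.exp(rng.gauss(3.0, 0.5)) for _ in range(1000)]
LAM = 0.05      # service selection rate, as in Example 1
HORIZON = 30    # remaining validity in days (assumption)

results = []
for v in (5, 10, 20, 30):
    keep = single_coupon_values(v, HORIZON - 1, LAM, fares)[-1]
    results.append((v, v - keep, softmax_redeem_prob(v, float(v), keep)))

for v, gap, prob in results:
    print(f"v={v:2d}  gap v - V_hat = {gap:6.3f}  P(redeem | fare = v) = {prob:.4f}")
```

Because the same fare samples are reused for every face value, the comparison across $v$ is not blurred by sampling noise.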
However, coupon redemption ratios from such policies are sensitive to the coupon value $v$, at least in the cases with only one coupon. We can illustrate the intuition of this claim by looking at equation (32): given the same $\hat\lambda_j$, $\hat P_j$ and $T$, one can show by induction on $T$ that the gap between $v$ and $\hat V(v, T)$ increases monotonically as $v$ increases. Therefore, the redemption ratio under the condition $p'_x/v \geq 1$ should increase with $v$. However, this is not fully consistent with our observations in Figure 5: the redemption ratio under $v = 30$ is not significantly greater than the one under $v = 20$.

We can follow the discussion above and keep searching for the "correct" policy; however, in this process we need to make more assumptions and the generalization power of our results diminishes.

Next, we consider approaches in the second direction: the decision problem faced by the traveler is different from the one in our model. For our specific problem of coupon selection, we further assume that such difference comes from the knowledge of the available coupon set $C$. As discussed in previous sections, the operator does not provide much information about the arrival of coupons, so it is possible that the traveler has only partial knowledge of $C$.

To identify variables that capture the traveler's knowledge of $C$, we recall that a traveler can be aware of a coupon $\tilde c$ only when she opens her e-wallet during the coupon's life cycle. Since one motivation of such action is to use a coupon for payment, we consider the following variable: the "activation record" $I_a(\tilde c) \in \{0, 1\}$ of coupon $\tilde c$, which equals 1 if and only if there is a past realized trip after which the traveler selected a coupon $\tilde c' \neq \tilde c$ for payment while coupon $\tilde c$ was in the e-wallet.

We now examine the explanatory power of $I_a$ on the aforementioned deviations.
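The activation record defined above can be derived from the order and coupon records of Table 1. The sketch below is one possible reading of that definition, not the authors' processing code: the record types `Coupon` and `Order`, their field names, the integer day index, and the inclusive validity window are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Coupon:
    coupon_id: str
    start_day: int    # first day the coupon is in the e-wallet
    expire_day: int   # last valid day (inclusive; an assumption)


@dataclass(frozen=True)
class Order:
    day: int
    used_coupon_id: Optional[str]  # None when the trip was paid without a coupon


def activation_record(coupon: Coupon, past_orders: Iterable[Order]) -> int:
    """Return I_a(coupon) for a coupon that is still unredeemed.

    The record is 1 if some past trip was paid with a different coupon
    while this coupon was already in the e-wallet, so the traveler must
    have opened the e-wallet and could have seen it; otherwise it is 0.
    The caller is expected to pass only orders before the current trip.
    """
    for order in past_orders:
        used_other = (
            order.used_coupon_id is not None
            and order.used_coupon_id != coupon.coupon_id
        )
        in_wallet = coupon.start_day <= order.day <= coupon.expire_day
        if used_other and in_wallet:
            return 1
    return 0


coupon = Coupon("a", start_day=10, expire_day=20)
history = [Order(day=5, used_coupon_id="b"), Order(day=12, used_coupon_id="b")]
print(activation_record(coupon, history))
```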
Figure 7 shows the relationship between $p'_x/v$ and the coupon redemption ratio in cases in which the traveler has only one coupon $\tilde c$ and the coupon satisfies $I_a(\tilde c) = 1$. Figure 8 shows the relationship between the coupon quantity and the coupon redemption ratio in cases in which there is a coupon $\tilde c = \langle v, T \rangle$ in the available coupon set $C$ such that the trip fare $p'_x$ satisfies $p'_x \geq v$ and $I_a(\tilde c) = 1$. From both figures, we can see that the coupon redemption ratio becomes much closer to 1 when we impose the restriction on $I_a$. Moreover, parts (b) and (c) of Figure 8 indicate that the impact of past experience and trip frequency diminishes under the same restriction.

[Figure 7: Ratio of coupon redemption v.s. fare-value ratio, cases with only one coupon, $I_a(\tilde c) = 1$. x-axis: $p'_x/v$; y-axis: ratio of coupon redemption; curves: $v = 5$, $v = 10$, $v = 20$, $v = 30$.]

[Figure 8: Ratio of coupon redemption, cases in which some coupon face values $v$ exceed fare $p'_x$ and $I_a(\tilde c) = 1$. Panels: (a) Overall (observations and the curve $y = 1 - 0.15/x$), (b) By experience (groups $N_e \in [2, 3]$, $N_e \in [4, 7]$, $N_e \in [8, 15]$, $N_e \in [16, 31]$), (c) By frequency (groups $\gamma \in [e^{-5}, e^{-4})$ through $\gamma \in [e^{-1}, e^{-0})$); y-axis: ratio of coupon redemption.]

In this section, we formalize the notion of unawareness. The basic idea is that travelers, influenced by external circumstances and internal limited memories, may make decisions with respect to a perceived (awareness) subset $C_a$ rather than the whole coupon set $C$.

If a traveler is only aware of the subset $C_a$ and makes decisions with respect to it, it is unlikely that the traveler is concerned about the unawareness itself. This suggests the following assumption:

Assumption 11.
The traveler does not consider her unawareness of the set of available coupons in evaluating future coupon redemption values. Therefore, this evaluation is the same as in the baseline model.
With this assumption, in the following discussion we can consider the same policies $\pi_{xj}$ and $\pi_c$ as those developed in Section 3.

Next, we discuss the impact of unawareness on decisions under the setting in which traveler $j$ has an available coupon set $C$. Because unawareness in different periods is correlated, we introduce a new state variable $S_a$ for it. After each trip with the target mobility service, this state $S_a$ is updated with the available coupon set $C$ and the selected coupon group $c$, according to a function $f_a$:
$$S'_a = f_a(S_a, C, c). \tag{21}$$
At the beginning of a trip, the awareness set $C_a$ is generated according to the state of attention $S_a$ and a distribution $P_a$: $P(C_a) = P_a(C_a \mid C, S_a)$. Given $X$ and $C_a$, the traveler selects a travel mode according to $\pi_{xj}$:
$$P(i = 1) = \pi_{xj}(p_x, u_{xj}, C_a). \tag{22}$$

If the traveler completes the trip via the mobility service, the detail of the realized trip $X'$ is revealed according to $X' \sim P'_j(\cdot \mid X)$ and the traveler selects a coupon for redemption given the realized trip fare $p'_x$. However, with unawareness, the coupon selection behavior is complicated and can be described in two steps. First, the traveler makes a choice $c$ with respect to the awareness set $C_a$ according to the probability $\pi_c(c \mid p'_x, C_a)$. Then, if there is no coupon redemption ($c = c_0$), the traveler proceeds to payment; otherwise ($c \neq c_0$), the traveler must open her e-wallet. In this case, she will find that she has the whole coupon set $C$; we call this situation attention recovery. She then re-evaluates and makes her final decision with respect to the set $C/C_0$.

Here, the default option $c_0$ is removed from the final consideration set for regularity. In fact, if the traveler is aware of all coupons, $C_a = C$, the re-evaluation should have no effect on the coupon selection probabilities.
However, if $c_0$ is included in the final consideration set, the probability of having no coupon redemption increases and differs from $\pi_c(c_0 \mid p'_x, C)$.

In summary, the probability of selecting coupon group $c$ is
$$P(c \mid p'_x, C_a) = \begin{cases} \pi_c(c_0 \mid p'_x, C_a) & c = c_0, \\ \pi_c(c \mid p'_x, C/C_0)\,(1 - \pi_c(c_0 \mid p'_x, C_a)) & \text{otherwise.} \end{cases} \tag{23}$$
Now, if the traveler follows the utility-maximizing policy $\pi^*$, the coupon selection probability from the perspective of the observer can be expressed as
$$\begin{aligned}
P(c \mid p'_x, C, S_a) &= \sum_{C_a \in \mathcal{A}(C)} P_a(C_a \mid C, S_a)\, P(c \mid p'_x, C_a), \\
P(c_0 \mid p'_x, C_a) &= E_{V|D}\, I\big(c_0 = \arg\max_{c \in C_a} \{r(p'_x, c) + V(f(C_a, c))\}\big), \\
P(c \mid p'_x, C_a) &= E_{V|D}\, I\big(c_0 \neq \arg\max_{c \in C_a} \{r(p'_x, c) + V(f(C_a, c))\}\big) \cdot I\big(c = \arg\max_{c \in C/C_0} \{r(p'_x, c) + V(f(C, c))\}\big).
\end{aligned} \tag{24}$$

In this subsection, we specify the form of the state of attention $S_a$ and the awareness set probability function $P_a$.

First, from the state transition of $S_a$ we see that $S_a$ can be interpreted as a function of the coupon set $C$. However, this function space is too large to be considered practically: we need to consider every possible correlation among coupon groups. Here, we restrict our attention to the coupon-group-level features. That is, we can interpret $S_a$ as a function from coupon groups $c$ to features $S_a(c)$ related only to that coupon group.

Next, we specify the awareness set probability $P_a(C_a \mid C, S_a)$. We first look at the special case in which there is only one available coupon, $C = \{c_0, c\}$ with $c = \langle v, T, 1 \rangle$.
In this case, the only difference between the two possible outcomes, $C$ and $C_0$, is whether the traveler notices the coupon $\tilde{c} = \langle v, T \rangle$. Thus we can say

$$P_a(C \mid C, S_a) = \sigma(h(S_a(c))), \qquad P_a(C_0 \mid C, S_a) = 1 - \sigma(h(S_a(c))), \tag{25}$$

where $\sigma$ is the sigmoid function $\sigma(x) = 1/(1 + e^{-x})$ and $h$ is a function. A natural choice of $h$ is a linear function $h(x) = \theta^T x + b$, where $\theta, b$ are parameters.

For the form of $P_a(C_a \mid C, S_a)$ in general, there are many more choices: in Subsection 2.2 we have mentioned several of them. Here, we adopt the Manski model [27], which assumes that the awareness of different elements in the set is independent. Nevertheless, the word “element” is still ambiguous: we can consider independence at either the coupon level or the coupon-group level. We now discuss this choice in detail. For coupon-group-level independence, the awareness set $C_a$ can only be a subset of $C$:

$$P_a(C_a \mid C, S_a) = \prod_{c \in C_a} \sigma(h(S_a(c))) \cdot \prod_{c \in C / C_a} \big(1 - \sigma(h(S_a(c)))\big); \tag{26}$$

while for coupon-level independence, the coupon groups in $C_a$ can be different from those in $C$ and the quantity in each coupon group is important. To further explain this, let us abuse the notation of set and consider $C = \{c_1, \cdots, c_m\}$ and $C_a = \{c_{ai} \mid i \in I_a\}$, with $c_i = \langle v_i, T_i, n_i \rangle$ and $c_{ai} = \langle v_i, T_i, n_{ai} \rangle$, $0 \le n_{ai} \le n_i$. (We say this is an abuse of notation since if $n_{ai} = 0$, $c_{ai}$ is not an element of $C_a$.) Now the coupon-level independence leads to

$$P_a(C_a \mid C, S_a) = \prod_{i=1}^{m} \binom{n_i}{n_{ai}} \big[\sigma(h(S_a(c_i)))\big]^{n_{ai}} \big[1 - \sigma(h(S_a(c_i)))\big]^{n_i - n_{ai}}. \tag{27}$$

At first sight, the coupon-group-level-independence formulation seems to be more straightforward. However, for it to make any sense, the quantity $n$ of a coupon group must become one of the features in $S_a$ and have a direct impact on the awareness level $h(S_a(c))$; otherwise, the awareness level remains the same even when we increase $n$ towards infinity. But then the specification of the relationship between $n$ and $h(S_a(c))$ becomes another problem. To avoid this problem, we choose the coupon-level-independence formulation instead, which provides a simple characterization of the impact of $n_i$.

At the end of this subsection, we discuss which features to include in $S_a(c)$. A natural selection is the variable $I_a(\tilde{c})$, which represents whether the traveler has previously seen coupon $\tilde{c}$. This function can be defined at the coupon-group level because coupons in the same group are distributed at the same time and should be seen at once. However, Figures 7 and 8 show that the feature $I_a(\tilde{c})$ cannot fully capture the inattention effect. Therefore, we add a new parameter $\theta_a$ to capture the remaining possibilities, such as that the traveler may forget the existence of coupons as time passes:

$$S_a(c) = I_a(c), \qquad h(S_a(c)) = \theta_a + \theta_{as} I_a(c), \tag{28}$$

where $\theta_a, \theta_{as}$ are parameters.

Finally, for completeness, we provide the formulation of the state transition function $f_a$:

$$f_a(S_a, C, c)(f_c(c')) = \begin{cases} 0 & c = c_0 \ \&\ f_c(c') \neq c_0 \ \&\ S_a(c') = 0, \\ 1 & \text{otherwise,} \end{cases} \qquad \forall c' \in C / \{c\} \cup \{\langle v, T, n - 1 \rangle\}. \tag{29}$$

In this section, we focus on the model estimations given travelers’ coupon selection behavior dataset $\{(c_1, p'_{x1}, C_1, S_{a1}, \hat{\lambda}_1, \hat{\mu}_{p1}, \hat{\sigma}_{p1}), \cdots, (c_N, p'_{xN}, C_N, S_{aN}, \hat{\lambda}_N, \hat{\mu}_{pN}, \hat{\sigma}_{pN})\}$. As mentioned in Section 4, we focus on the data obtained during the period from April 2017 to June 2017.
Estimations on the whole dataset are provided in Appendix D for reference.

Recall from Equation (24) that the coupon selection probability $P(c \mid p'_x, C, S_a)$ is a mixture

$$P(c \mid p'_x, C, S_a) = \sum_{C_a \in \mathcal{A}(C)} P_a(C_a \mid C, S_a)\, P(c \mid p'_x, C_a), \tag{30}$$

where the awareness set probability $P_a(C_a \mid C, S_a)$ is specified in Equations (27) and (28). If we are further given a parametric form of $P(c \mid p'_x, C_a)$, we can estimate the parameters $\theta$ in both models by maximizing the (average) log-likelihood of the model on the dataset:

$$\text{Log-Likelihood} = \frac{1}{N} \sum_{l=1}^{N} \log P(c_l \mid p'_{xl}, C_l, S_{al}). \tag{31}$$

Given the size of our dataset and the complexity of our models, we use TensorFlow to construct the computation graph of these models and the Adam algorithm [24] to optimize the log-likelihood. Other hyper-parameters for training are summarized in Table 2. In the estimation, the maximal training time for a model is less than an hour.

| Parameter | Value |
|---|---|
| learning rate | 0.001 |
| mini-batch size | 256 |
| training epochs | 50 |

Table 2: Hyper-parameters for training

Next, we discuss estimation results of various forms of $P(c \mid p'_x, C_a)$. We start with estimations in cases with only one coupon.

When there is only one coupon, the available coupon set $C$ can be expressed as $\{\langle v, T, 1 \rangle, c_0\}$, and we can simplify the notation of coupon $c = \langle v, T, 1 \rangle$ to $\langle v, T \rangle$ and that of value function $V(\{\langle v, T, 1 \rangle, c_0\})$ to $V(v, T)$. The value function $\hat{V}$ in Equation (16) can now be computed in a simpler way:

$$\begin{aligned}
\hat{V}(v, T) &= \hat{V}(v, T - 1) + \hat{\lambda}_j\, E_{p'_x \mid \hat{P}_j} \max\{\min(p'_x, v) - \hat{V}(v, T - 1),\ 0\}, \\
\hat{V}(v, T) &= 0, \quad \forall\, T < 0 \text{ or } v = 0.
\end{aligned} \tag{32}$$

Since in the cases with only one coupon the awareness set $C_a$ can only be either $C$ or $C_0$, it remains to specify the form of $P(c \mid p'_x, C)$. With some calculations we can show that $P(c \mid p'_x, C) = E_{V \mid D}\, I(\min(p'_x, v) \ge V(v, T - 1))$. As this probability depends only on $v$, $T$ and $p'_x$, we denote it as $P(v, T, p'_x)$ for simplicity.

In specifying $P(v, T, p'_x)$, we use the approximation $\hat{V}$ in Equation (32) as a feature for the optimal value function $V$. In particular, we consider an estimation with its error following the logistic distribution: $V(v, T) = \theta_V \hat{V}(v, T) + \varepsilon_V$, $\varepsilon_V \sim \text{Logistic}(0, 1/\theta_\varepsilon)$. With this specification, we have

$$P(v, T, p'_x) = \sigma\big(\theta_\varepsilon [\min(p'_x, v) - \theta_V \hat{V}(v, T - 1)]\big), \tag{33}$$

where $\sigma$ is the sigmoid function. One can notice that this form of selection probabilities resembles the one from the “near-optimal stochastic policies” in Equation (20). This specification is called the “basic specification” in the following discussion.

The basic specification can be extended in several directions. First, in the basic specification we assume that the error $\varepsilon_V$ is independent of the face value $v$. Therefore, we have the same estimation variance in the value of a coupon with $v = 30$ as in the value of a coupon with $v = 5$. We can introduce a correlation between $\varepsilon_V$ and $v$ by scaling $\varepsilon_V$ with $v$: $V(v, T) = \theta_V \hat{V}(v, T) + v \varepsilon_V$, $\varepsilon_V \sim \text{Logistic}(0, 1/\theta_\varepsilon)$. This extension is called “scaled” and leads to the following coupon selection probability:

$$P(v, T, p'_x) = \sigma\big(\theta_\varepsilon [\min(p'_x, v)/v - \theta_V \hat{V}(v, T - 1)/v]\big). \tag{34}$$

Second, when the traveler exhibits bounded rationality, $\hat{V}$ may not provide an accurate estimation. Therefore, we include the face value $v$ as an extra feature in our estimation: $V(v, T) = \theta_V \hat{V}(v, T) + \theta_v v + \varepsilon_V$, $\varepsilon_V \sim \text{Logistic}(0, 1/\theta_\varepsilon)$.
This extension is called “extra” and leads to the following coupon selection probability:

$$P(v, T, p'_x) = \sigma\big(\theta_\varepsilon [\min(p'_x, v) - \theta_V \hat{V}(v, T - 1) - \theta_v v]\big). \tag{35}$$

Finally, as discussed in Section 3, the true optimal value $V^*$ is always bounded by finite values. Therefore, we consider regularizing the estimate by clipping: $V(v, T) = \max\{0, \min\{v, \theta_V \hat{V}(v, T) + \varepsilon_V\}\}$, $\varepsilon_V \sim \text{Logistic}(0, 1/\theta_\varepsilon)$. This extension is called “clip” and leads to the following coupon selection probability:

$$P(v, T, p'_x) = \begin{cases} \sigma\big(\theta_\varepsilon [\min(p'_x, v) - \theta_V \hat{V}(v, T - 1)]\big) & p'_x < v, \\ 1 & \text{otherwise.} \end{cases} \tag{36}$$

These extensions are orthogonal and can be combined with one another. In total, we can obtain eight different models. Table 3 summarizes the estimated parameters and the performance of these models. In this table, the numbers are reported up to three digits after the decimal point, and 0/1 are used to denote False/True and indicate whether a condition is met. “LL” refers to the (average) log-likelihood. “Accuracy” refers to the forecasting accuracy and is computed as

$$\text{Accuracy} = \frac{1}{N} \sum_{l=1}^{N} I\Big(c_l = \arg\max_{c \in C_l} P(c \mid p'_{xl}, C_l, S_{al})\Big), \tag{37}$$

“MS” refers to the predicted aggregate coupon redemption ratio (the “market share” of coupon redemption) and is computed as

$$\text{MS} = 1 - \frac{1}{N} \sum_{l=1}^{N} P(c_0 \mid p'_{xl}, C_l, S_{al}); \tag{38}$$

for Table 3, the observed aggregate coupon redemption ratio is 0.719.

Column groups: Inattention (Unaware?, $\theta_a$, $\theta_{as}$); Utility Model and Extensions (Clip?, Extra?, Scaled?, $\theta_\varepsilon$, $\theta_V$, $\theta_v$); Evaluation (LL, Accuracy, MS).

| Unaware? | $\theta_a$ | $\theta_{as}$ | Clip? | Extra? | Scaled? | $\theta_\varepsilon$ | $\theta_V$ | $\theta_v$ | LL | Accuracy | MS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | N/A | N/A | 0 | 0 | 0 | 0.088 | 0.620 | N/A | -0.543 | 0.758 | 0.719 |
| 0 | N/A | N/A | 0 | 1 | 0 | 0.177 | 0.128 | 0.532 | -0.513 | 0.781 | 0.701 |
| 0 | N/A | N/A | 0 | 0 | 1 | 1.926 | 0.513 | N/A | -0.535 | 0.751 | 0.746 |
| 0 | N/A | N/A | 0 | 1 | 1 | 3.455 | 0.154 | 0.480 | -0.508 | 0.782 | 0.721 |
| 1 | 1.025 | 1.850 | 0 | 0 | 0 | 0.369 | 0.743 | N/A | -0.491 | 0.780 | 0.714 |
| 1 | 1.183 | 2.116 | 0 | 1 | 0 | 0.358 | 0.439 | 0.274 | -0.479 | 0.788 | 0.707 |
| 1 | 1.205 | 3.868 | 0 | 0 | 1 | 4.272 | 0.652 | N/A | -0.491 | 0.777 | 0.730 |
| 1 | 1.368 | 4.741 | 0 | 1 | 1 | 5.060 | 0.387 | 0.281 | -0.478 | 0.787 | 0.720 |
| 1 | 1.047 | 1.450 | 1 | 0 | 0 | 0.224 | 0.826 | N/A | -0.479 | 0.785 | 0.719 |
| 1 | 1.096 | 1.427 | 1 | 1 | 0 | 0.283 | 0.536 | 0.228 | -0.474 | 0.789 | 0.713 |
| 1 | 1.080 | 1.449 | 1 | 0 | 1 | 4.113 | 0.797 | N/A | -0.480 | 0.787 | 0.723 |
| 1 | 1.122 | 1.436 | 1 | 1 | 1 | 4.860 | 0.568 | 0.188 | -0.476 | 0.790 | 0.718 |
Table 3: Estimated parameters and performance in the case with only one coupon
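The one-coupon model and the evaluation metrics above can be illustrated with a minimal Python sketch. It assumes the basic specification of Equation (33) combined with the awareness probability of Equations (25) and (28), so that the observed redemption probability is the awareness probability times the aware redemption probability. The function names, the record layout, and the example fares are hypothetical and chosen only for illustration; this is not the TensorFlow implementation used for the estimation.

```python
import math


def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def redeem_prob_aware(fare, face_value, v_hat_prev, theta_eps, theta_v):
    """Basic specification: redemption probability of an aware traveler.

    v_hat_prev stands for the approximate continuation value V_hat(v, T - 1).
    """
    return sigmoid(theta_eps * (min(fare, face_value) - theta_v * v_hat_prev))


def redeem_prob(fare, face_value, v_hat_prev, seen_before,
                theta_eps, theta_v, theta_a, theta_as):
    """Mixture over the two awareness sets in the one-coupon case.

    With probability sigma(theta_a + theta_as * I_a) the traveler notices the
    coupon; otherwise only the default option is considered and nothing is
    redeemed.
    """
    p_aware = sigmoid(theta_a + theta_as * seen_before)
    return p_aware * redeem_prob_aware(fare, face_value, v_hat_prev,
                                       theta_eps, theta_v)


def evaluate(records, **params):
    """Return (average log-likelihood, accuracy, predicted redemption ratio).

    Each record is (redeemed, fare, face_value, v_hat_prev, seen_before),
    a hypothetical layout chosen for this sketch.
    """
    ll = acc = ms = 0.0
    for redeemed, fare, v, v_hat_prev, seen in records:
        p = redeem_prob(fare, v, v_hat_prev, seen, **params)
        ll += math.log(p if redeemed else 1.0 - p)
        # With two options, the arg max prediction is "redeem" when p > 0.5.
        acc += float((p > 0.5) == bool(redeemed))
        ms += p
    n = len(records)
    return ll / n, acc / n, ms / n


# Parameters taken from the first unawareness row of Table 3; the records
# themselves are made up for illustration.
params = dict(theta_eps=0.369, theta_v=0.743, theta_a=1.025, theta_as=1.850)
records = [(1, 25.0, 20.0, 8.0, 1), (0, 6.0, 20.0, 8.0, 0), (1, 18.0, 10.0, 3.0, 1)]
ll, acc, ms = evaluate(records, **params)
```

Because the unaware traveler never redeems in the one-coupon case, the mixture of Equation (30) collapses to a single product, which is why no explicit sum over awareness sets appears in the sketch.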
| Coupon face value | 5 | 10 | 20 | 30 |
|---|---|---|---|---|
| Occurrence | 2032 | 3319 | 50999 | 21425 |
Table 4: Occurrence of records with specific face value of the coupon, in the case with only one coupon

Direct estimation of models on the raw dataset can lead to bias, because our dataset is unbalanced with respect to the coupon face value $v$, as shown in Table 4. As $v$ is an exogenous variable and the model should not be biased toward any of its specific values, we use weights to rebalance the importance of each data record. In particular, we give weight $w = N / N(v)$ to records with coupon face value $v$, where $N(v)$ is the number of records whose coupon face value equals $v$. With four distinct face values, the weighted log-likelihood, accuracy, and aggregate coupon redemption ratio are respectively given as

$$\begin{aligned}
\text{Weighted Log-Likelihood} &= \sum_{l=1}^{N} \frac{1}{4 N(v_l)} \log P(c_l \mid p'_{xl}, C_l, S_{al}), \\
\text{Weighted Accuracy} &= \sum_{l=1}^{N} \frac{1}{4 N(v_l)}\, I\Big(c_l = \arg\max_{c \in C_l} P(c \mid p'_{xl}, C_l, S_{al})\Big), \\
\text{Weighted MS} &= \sum_{l=1}^{N} \frac{1}{4 N(v_l)} \big(1 - P(c_0 \mid p'_{xl}, C_l, S_{al})\big),
\end{aligned} \tag{39}$$

where $v_l$ is the face value of the only coupon in $C_l$. Table 5 summarizes the estimation results with weighting. For this table, the weighted observed aggregate coupon redemption ratio is 0.715.

Column groups: Inattention (Unaware?, $\theta_a$, $\theta_{as}$); Utility Model and Extensions (Clip?, Extra?, Scaled?, $\theta_\varepsilon$, $\theta_V$, $\theta_v$); Evaluation (LL, Accuracy, MS).

| Unaware? | $\theta_a$ | $\theta_{as}$ | Clip? | Extra? | Scaled? | $\theta_\varepsilon$ | $\theta_V$ | $\theta_v$ | LL | Accuracy | MS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | N/A | N/A | 0 | 0 | 0 | 0.094 | 0.494 | N/A | -0.588 | 0.739 | 0.680 |
| 0 | N/A | N/A | 0 | 1 | 0 | 0.196 | 0.034 | 0.576 | -0.558 | 0.761 | 0.672 |
| 0 | N/A | N/A | 0 | 0 | 1 | 1.902 | 0.454 | N/A | -0.555 | 0.735 | 0.744 |
| 0 | N/A | N/A | 0 | 1 | 1 | 2.654 | 0.195 | 0.376 | -0.541 | 0.759 | 0.729 |
| 1 | 0.711 | 1.632 | 0 | 0 | 0 | 0.973 | 0.679 | N/A | -0.515 | 0.761 | 0.714 |
| 1 | 0.804 | 1.766 | 0 | 1 | 0 | 0.780 | 0.442 | 0.212 | -0.508 | 0.766 | 0.706 |
| 1 | 0.888 | 2.814 | 0 | 0 | 1 | 5.557 | 0.662 | N/A | -0.495 | 0.766 | 0.731 |
| 1 | 0.983 | 3.134 | 0 | 1 | 1 | 6.415 | 0.396 | 0.291 | -0.487 | 0.772 | 0.724 |
| 1 | 0.711 | 1.594 | 1 | 0 | 0 | 0.267 | 0.810 | N/A | -0.493 | 0.770 | 0.726 |
| 1 | 0.736 | 1.584 | 1 | 1 | 0 | 0.311 | 0.566 | 0.194 | -0.491 | 0.773 | 0.723 |
| 1 | 0.743 | 1.591 | 1 | 0 | 1 | 3.991 | 0.775 | N/A | -0.496 | 0.773 | 0.728 |
| 1 | 0.760 | 1.587 | 1 | 1 | 1 | 4.505 | 0.602 | 0.148 | -0.494 | 0.773 | 0.726 |
Table 5: Estimated parameters and performance in the case with only one coupon and reweighting

According to Tables 3 and 5, we have several findings:

1. In general, models with inattention have much better log-likelihood than their counterparts. They also have slightly better prediction accuracies.
2. Regularizing the value function $V$ by clipping does not lead to significant improvement in the log-likelihood or the accuracy. However, the estimated parameters of the inattention model are more stable across different specifications of the utility model.
3. Including the extra feature $v$ in the value estimation leads to improvements in the log-likelihood and the accuracy, but the improvement diminishes as the model becomes more sophisticated. When both the inattention mechanism and the value function regularization by clipping are present in the model, the improvement is almost insignificant.
4. Contrary to our intuition, introducing scaling in the error structure does not lead to any significant improvement. Rather, it leads to instability in parameter estimations.
5. For all models, the estimated $\theta_V$ is smaller than 1. This outcome can possibly be explained by the existence of complementary behavioral mechanisms, such as temporal discounting.
6. The forecasted aggregate coupon redemption ratio is more stable and closer to the observed one under the unawareness models. However, the difference between models is not very large.

Next, we examine the estimation results in general cases to determine whether the above findings can be generalized.
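The rebalancing by coupon face value used for Table 5 can be sketched in Python as follows. The sketch assumes that the weights $N/N(v)$ are normalized so that they sum to one, which gives every face value the same total weight; the function names and the toy inputs are hypothetical and not taken from the study's code.

```python
from collections import Counter
import math


def balanced_weights(face_values):
    """Return per-record weights 1 / (K * N(v)), which sum to one.

    K is the number of distinct face values and N(v) is the number of
    records with face value v, so every face value carries total weight 1/K.
    """
    counts = Counter(face_values)
    k = len(counts)
    return [1.0 / (k * counts[v]) for v in face_values]


def weighted_metrics(face_values, redeemed, p_redeem):
    """Weighted log-likelihood, accuracy and redemption ratio.

    redeemed[l] is 1 if record l redeemed its coupon, and p_redeem[l] is the
    model's predicted redemption probability for that record.
    """
    w = balanced_weights(face_values)
    rows = list(zip(w, redeemed, p_redeem))
    ll = sum(wi * math.log(p if r else 1.0 - p) for wi, r, p in rows)
    acc = sum(wi * float((p > 0.5) == bool(r)) for wi, r, p in rows)
    ms = sum(wi * p for wi, r, p in rows)
    return ll, acc, ms


# Toy example: one record with face value 5 and three with face value 20.
weights = balanced_weights([5, 20, 20, 20])
ll, acc, ms = weighted_metrics([5, 20, 20, 20], [1, 1, 1, 0], [0.9, 0.8, 0.8, 0.8])
```

In the toy example the single record with face value 5 receives weight 1/2, while each of the three records with face value 20 receives weight 1/6, so the frequent face value no longer dominates the metrics.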
6.2 General cases

The coupon selection probability on the awareness set in general cases is given in Equation (24) as

$$\begin{aligned}
P(c_0 \mid p'_x, C_a) &= E_{V \mid D}\, I\Big(c_0 = \arg\max_{c' \in C_a} \{ r(p'_x, c') + V(f(C_a, c')) \}\Big), \\
P(c \mid p'_x, C_a) &= E_{V \mid D} \Big[ I\Big(c_0 \neq \arg\max_{c' \in C_a} \{ r(p'_x, c') + V(f(C_a, c')) \}\Big) \cdot I\Big(c = \arg\max_{c' \in C / C_0} \{ r(p'_x, c') + V(f(C, c')) \}\Big) \Big] \\
&\approx E_{V \mid D}\, I\Big(c_0 \neq \arg\max_{c' \in C_a} \{ r(p'_x, c') + V(f(C_a, c')) \}\Big) \cdot E_{V \mid D}\, I\Big(c = \arg\max_{c' \in C / C_0} \{ r(p'_x, c') + V(f(C, c')) \}\Big).
\end{aligned} \tag{40}$$

Again, we want to find specifications of $V$ with the approximation $\hat{V}$ from Equation (16) and an error term $\varepsilon_V$ such that the above probabilities have tractable forms. One straightforward solution is to consider

$$\varepsilon_V(C) = \tilde{\varepsilon}_V(C) - \tilde{\varepsilon}_V(C_0), \quad \tilde{\varepsilon}_V \sim \text{Gumbel}(0, 1/\theta_\varepsilon) \ \text{i.i.d.}; \qquad V(C) = \theta_V \hat{V}(C) + \varepsilon_V(C), \tag{41}$$

where we subtract $\tilde{\varepsilon}_V(C_0)$ for regularity: $V(C_0) \equiv 0$. Notice that this specification is consistent with the one in the case with only one coupon. Now, the coupon selection probability follows the multinomial logit model

$$E_{V \mid D}\, I\Big(c = \arg\max_{c' \in C} \{ r(p'_x, c') + V(f(C, c')) \}\Big) \propto \exp\big(\theta_\varepsilon [r(p'_x, c) + \theta_V \hat{V}(f(C, c))]\big). \tag{42}$$

Next, we examine the possible extensions of this basic specification. Among the three possibilities discussed in cases with only one coupon, only the one that includes face value $v$ as an extra feature for the estimation of $V$ can be extended directly. Scaling $\varepsilon_V(f(C, c))$ with face value $v$ is trivial to implement but lacks intuitive justification, because errors $\varepsilon_V$ for different coupon groups $c$ need to be scaled differently and the overall error structure is broken. Nevertheless, we include it here for comparison. Clipping the estimate of $V$ is difficult to implement because with value clipping the coupon selection probability does not have an analytic form. Moreover, direct computation by sampling methods is intractable given the magnitude of our dataset. In this study, we implement an approximation of clipping, which is consistent with the one in cases with only one coupon: clipping only affects the probability of choosing no coupon redemption $c_0$. In particular, if there exists a coupon group $c$ with face value $v \le p'_x$, we remove $c_0$ from the consideration set:

$$P(c \mid p'_x, C) \propto \begin{cases} 0 & c = c_0 \ \&\ \exists \langle v, T, n \rangle \in C \text{ such that } p'_x \ge v > 0, \\ \exp\big(r(p'_x, c) + \hat{V}(f(C, c))\big) & \text{otherwise.} \end{cases} \tag{43}$$

In addition to these three extensions, we consider a nontrivial modification of the error structure. In the current model, coupons that are very similar but not exactly the same are allocated to different groups. Therefore, behavior given coupon set $\{\langle v, T, 1 \rangle, \langle v + \epsilon, T, 1 \rangle, c_0\}$ will change abruptly as $\epsilon \to 0$. One way to restore the continuity in behavior is to view every coupon as a unique one and consider independent coupon-level estimation errors. That is, for any two coupons $\tilde{c}_1$ and $\tilde{c}_2$ in the same group $c$, the errors in the estimations $V(f(C, c))$ are independent. This modification leads to another multinomial logit model

$$E_{V \mid D}\, I\Big(c = \arg\max_{c' \in C} \{ r(p'_x, c') + V(f(C, c')) \}\Big) \propto n \cdot \exp\big(\theta_\varepsilon [r(p'_x, c) + \theta_V \hat{V}(f(C, c))]\big), \tag{44}$$

where $n$ is the number of coupons in group $c$. This extension is called “iid”.

Table 6 summarizes the estimated parameters and the performance of models with different combinations of the extensions above. As the computational complexity of the inattention model scales according to the size of $\mathcal{A}(C)$, we limit our attention to records where $|\mathcal{A}(C)| \le 64$. Such records constitute more than 90% of the records in the whole dataset. As a comparison, we also summarize the estimation results on the dataset where each record has $|\mathcal{A}(C)| \le 16$ in Table 7; by calculating the difference between these two tables we can evaluate the stability of the estimated parameters. The observed aggregate coupon redemption ratios for Tables 6 and 7 are 0.803 and 0.786, respectively.
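The multinomial logit forms of the basic specification and of the “iid” extension can be sketched in Python as follows. This is a minimal illustration under the assumption that the immediate rewards $r(p'_x, c)$ and the approximate continuation values $\hat{V}(f(C, c))$ have already been computed for each option; the function name and the argument layout are hypothetical.

```python
import math


def coupon_choice_probs(rewards, v_hat_next, counts, theta_eps, theta_v, iid=False):
    """Multinomial logit probabilities over the options in a coupon set.

    rewards[k]    : immediate reward r(p'_x, c_k) of option k
    v_hat_next[k] : approximate continuation value V_hat(f(C, c_k))
    counts[k]     : number of coupons n_k in group k (1 for the default option)
    iid           : if True, weight each group by n_k as in the "iid" extension
    """
    scores = [theta_eps * (r + theta_v * vh)
              for r, vh in zip(rewards, v_hat_next)]
    if iid:
        # Multiplying exp(score) by n_k equals adding log(n_k) to the score.
        scores = [s + math.log(n) for s, n in zip(scores, counts)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Two options with identical scores; the second group holds two coupons.
basic = coupon_choice_probs([5.0, 5.0], [2.0, 2.0], [1, 2], 0.4, 0.7)
iid = coupon_choice_probs([5.0, 5.0], [2.0, 2.0], [1, 2], 0.4, 0.7, iid=True)
```

With identical scores, the basic specification splits the probability evenly between the two options, whereas the “iid” extension assigns twice as much probability to the group that holds two coupons.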
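Column groups: Inattention (Unaware?, $\theta_a$, $\theta_{as}$); Utility Model and Extensions (Clip?, Extra?, Scaled?, iid?, $\theta_\varepsilon$, $\theta_V$, $\theta_v$); Evaluation (LL, Accuracy, MS).

| Unaware? | $\theta_a$ | $\theta_{as}$ | Clip? | Extra? | Scaled? | iid? | $\theta_\varepsilon$ | $\theta_V$ | $\theta_v$ | LL | Accuracy | MS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | N/A | N/A | 0 | 0 | 0 | 0 | 0.126 | 0.873 | N/A | -0.823 | 0.747 | 0.808 |
| 0 | N/A | N/A | 0 | 1 | 0 | 0 | 0.222 | 0.435 | 0.411 | -0.797 | 0.753 | 0.795 |
| 0 | N/A | N/A | 0 | 0 | 1 | 0 | 2.423 | 0.800 | N/A | -0.865 | 0.673 | 0.816 |
| 0 | N/A | N/A | 0 | 1 | 1 | 0 | 3.931 | 0.456 | 0.380 | -0.852 | 0.674 | 0.798 |
| 0 | N/A | N/A | 0 | 0 | 0 | 1 | 0.134 | 0.898 | N/A | -0.889 | 0.662 | 0.850 |
| 0 | N/A | N/A | 0 | 1 | 0 | 1 | 0.226 | 0.482 | 0.361 | -0.866 | 0.671 | 0.846 |
| 0 | N/A | N/A | 0 | 0 | 1 | 1 | 2.333 | 0.995 | N/A | -0.931 | 0.614 | 0.820 |
| 0 | N/A | N/A | 0 | 1 | 1 | 1 | 3.902 | 0.555 | 0.396 | -0.918 | 0.620 | 0.803 |
| 1 | -0.289 | 2.689 | 0 | 0 | 0 | 0 | 0.332 | 0.722 | N/A | -0.713 | 0.705 | 0.774 |
| 1 | -0.268 | 2.844 | 0 | 1 | 0 | 0 | 0.422 | 0.520 | 0.216 | -0.701 | 0.709 | 0.767 |
| 1 | -0.098 | 4.309 | 0 | 0 | 1 | 0 | 4.810 | 0.714 | N/A | -0.808 | 0.628 | 0.784 |
| 1 | -0.050 | 5.028 | 0 | 1 | 1 | 0 | 5.314 | 0.620 | 0.118 | -0.806 | 0.629 | 0.780 |
| 1 | -0.393 | 2.358 | 0 | 0 | 0 | 1 | 0.388 | 0.702 | N/A | -0.750 | 0.658 | 0.787 |
| 1 | -0.374 | 2.387 | 0 | 1 | 0 | 1 | 0.493 | 0.513 | 0.185 | -0.739 | 0.662 | 0.784 |
| 1 | -0.244 | 3.890 | 0 | 0 | 1 | 1 | 5.256 | 0.780 | N/A | -0.860 | 0.587 | 0.785 |
| 1 | -0.215 | 4.249 | 0 | 1 | 1 | 1 | 5.772 | 0.688 | 0.107 | -0.859 | 0.587 | 0.782 |
| 1 | -0.473 | 1.625 | 1 | 0 | 0 | 0 | 0.367 | 0.680 | N/A | -0.727 | 0.706 | 0.768 |
| 1 | -0.441 | 1.642 | 1 | 1 | 0 | 0 | 0.490 | 0.475 | 0.205 | -0.716 | 0.709 | 0.761 |
| 1 | -0.425 | 1.682 | 1 | 0 | 1 | 0 | 5.120 | 0.705 | N/A | -0.833 | 0.626 | 0.767 |
| 1 | -0.430 | 1.678 | 1 | 1 | 1 | 0 | 4.925 | 0.740 | 0.039 | -0.833 | 0.627 | 0.755 |
| 1 | -0.470 | 1.585 | 1 | 0 | 0 | 1 | 0.413 | 0.680 | N/A | -0.757 | 0.660 | 0.776 |
| 1 | -0.438 | 1.600 | 1 | 1 | 0 | 1 | 0.539 | 0.488 | 0.182 | -0.746 | 0.664 | 0.772 |
| 1 | -0.412 | 1.674 | 1 | 0 | 1 | 1 | 5.238 | 0.805 | N/A | -0.873 | 0.587 | 0.769 |
| 1 | -0.415 | 1.671 | 1 | 1 | 1 | 1 | 5.110 | 0.830 | 0.025 | -0.873 | 0.587 | 0.762 |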
Table 6: Estimated parameters and performance on subset $|\mathcal{A}(C)| \le 64$

Column groups: Inattention (Unaware?, $\theta_a$, $\theta_{as}$); Utility Model and Extensions (Clip?, Extra?, $\theta_\varepsilon$, $\theta_V$, $\theta_v$); Evaluation (LL, Accuracy, MS).

| Unaware? | $\theta_a$ | $\theta_{as}$ | Clip? | Extra? | $\theta_\varepsilon$ | $\theta_V$ | $\theta_v$ | LL | Accuracy | MS |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | N/A | N/A | 0 | 0 | 0.120 | 0.785 | N/A | -0.774 | 0.748 | 0.806 |
| 0 | N/A | N/A | 0 | 1 | 0.213 | 0.379 | 0.423 | -0.748 | 0.756 | 0.787 |
| 1 | -0.175 | 2.830 | 0 | 0 | 0.315 | 0.706 | N/A | -0.677 | 0.699 | 0.765 |
| 1 | -0.143 | 3.013 | 0 | 1 | 0.398 | 0.504 | 0.222 | -0.666 | 0.702 | 0.757 |
| 1 | -0.364 | 1.690 | 1 | 0 | 0.343 | 0.680 | N/A | -0.690 | 0.699 | 0.757 |
| 1 | -0.331 | 1.701 | 1 | 1 | 0.461 | 0.471 | 0.210 | -0.681 | 0.702 | 0.749 |

Table 7: Estimated parameters and performance on subset $|\mathcal{A}(C)| \le 16$

1. [...] $c$, this observation implies that our inattention model underestimates travelers’ ability to remember. This claim is also supported by the underestimation of the aggregate coupon redemption ratio by the unawareness models. Compared with results from cases with only one coupon, it seems that the prediction accuracy of our unawareness model drops as the size of the coupon set $C$ increases. This observation will be useful when we attempt to improve our model in the future.
2. Possibly because of a lack of justification, both scaling $\varepsilon_V$ with $v$ and the approximated clipping of $V$ lead to poor log-likelihood and low accuracy. We also observe no improvement in the stability of parameter estimation from value clipping.
3. The extension with independent coupon-level estimation errors is entirely ineffective. Recall that with this modification, we try to solve the discontinuity problem by introducing heterogeneity among coupons in a coupon group. In the future, we can go in the opposite direction and consider homogeneity between coupons from different coupon groups.
4. Including extra feature $v$ in the value estimation leads to significant improvements in both the log-likelihood and the prediction accuracy. This outcome can possibly be explained by the lack of an accurate inattention model or by travelers’ bounded rationality in facing difficult optimization problems.
5. The estimate of $\theta_V$ is again smaller than 1 for all models, suggesting the existence of complementary behavioral mechanisms.
6. Finally, estimates in Tables 6 and 7 are close to each other but different from those in Table 3. The difference in the parameters of the inattention model is especially large. This finding implies that, despite consistency in terms of mathematical forms, the models developed in this subsection are not natural extensions of the models in cases with only one coupon.

In this section, we evaluate the impact of unawareness on coupons’ promotional effects via simulation.

We first show why a qualitative analysis of such impact is difficult. Recall that the utility gain from choosing the mobility service under the optimal mode choice policy $\pi^*_{xj}$ is

$$u_{xj} + E_{X' \mid X} \max_{c \in C} \{ r(p'_x, c) + V^*(f(C, c)) - V^*(f(C)) \}. \tag{45}$$

When the traveler is only aware of a subset $C_a$ of available coupons instead of the whole set $C$, the opportunity cost $V^*(f(C_a)) - V^*(f(C_a, c))$ of using a coupon $\tilde{c}$ from group $c$ becomes larger. According to Equation (45), the traveler is then less likely to use the mobility service. However, unawareness also reduces the rate of coupon redemption, so the impact of the coupons can last longer. Moreover, with the attention recovery in the payment stage, the traveler may select a coupon $\tilde{c}$ costlier to the operator than any other in $C_a$.
Since all these factors push the promotional effects of coupons in different directions, the impact of inattention is hard to understand qualitatively.

Next, we simulate the traveler’s decision flow with the models developed in Sections 3 and 5, and summarize performance metrics including the total trip quantity $N_{trip}$, the total redeemed coupon value $V_{redeemed}$, and the promotional effect

$$\rho = \frac{\sum (N_{trip} - N_{trip,0})}{\sum V_{redeemed}}, \tag{46}$$

where $N_{trip,0}$ is the baseline total trip quantity.

In the simulation, we simplify the trip demand generation and the trip realization processes to reduce the computational burden, by assuming that $\lambda_j \equiv 1$, $\log(p'_x) \equiv \log(p_x) \sim N(\mu_{pj}, \sigma_{pj})$, and $u_{xj} \equiv u \in \mathbb{R}$. Under these assumptions, the optimal mode selection policy dictates

$$P(x' = 1 \mid C_a, \lambda_j, P_j) = I\Big(u + \max_{c \in C_a} \{ r(p_x, c) + V^*(f(C_a, c)) - V^*(f(C_a)) \} \ge 0\Big). \tag{47}$$

Again, since we do not know $V^*$ exactly, we follow the specifications in Section 6, using $\hat{V}$ from Equation (16) as an estimate and assuming a logistic estimation error. Now we have

$$P(x' = 1 \mid C_a, \lambda_j, P_j) = \sigma\Big(\beta \Big[u + \max_{c \in C_a} \{ r(p_x, c) + \hat{V}(f(C_a, c)) - \hat{V}(f(C_a)) \}\Big]\Big), \tag{48}$$

where $\sigma$ is the sigmoid function and $\beta$ is a parameter describing the sensitivity of the service selection probability to coupon values.

For the coupon selection probability $P(c \mid p_x, C_a)$, we specify the same form of $\pi_c(c \mid p'_x, C_a)$ as the basic specification in Section 6:

$$\pi_c(c \mid p'_x, C_a) \propto \exp\big(\theta_\varepsilon [r(p'_x, c) + \theta_V \hat{V}(f(C_a, c))]\big). \tag{49}$$

In the experiment, we consider the same setting as in Example 2, that is, its log-normal fare distribution for $p_x$ and its coupon set $C$, which contains the default option $c_0$ and four coupon groups. For a comprehensive evaluation, we consider various combinations of the default mode selection rate $\lambda_0 = \sigma(\beta u)$ and the coupon value sensitivity $\beta$. For parameters in the inattention and coupon selection models, we choose a positive $\theta_{as}$, a negative $\theta_a$, and $\theta_\varepsilon = 0.5$, which are close to the estimates in Section 6. We also assume that all coupons are activated; that is, $S_a(c) = 1$ for all $c \in C$.

For evaluation, we simulate each $(\lambda_0, \beta)$ case for $T_{max} = \max_c T = 30$ steps and repeat the simulation 250,000 times. Table 8 presents the simulation results and shows that models with inattention indeed lead to a lower promotional effect $\rho$ than their counterparts; in fact, the reduction in $\rho$ can be as great as 10%. In Table 8, the default trip quantity $N_{trip,0}$ can be calculated as $\lambda_0 T_{max}$.

| $(\lambda_0, \beta)$ case | $N_{trip,0}$ | Inattention | $N_{trip}$ mean | $N_{trip}$ std | $V_{redeemed}$ mean | $V_{redeemed}$ std | $\rho$ |
|---|---|---|---|---|---|---|---|
| 1 | 1.5 | 0 | 1.638 | 1.231 | 17.52 | 13.20 | 0.0079 |
| 1 | 1.5 | 1 | 1.619 | 1.223 | 16.43 | 12.78 | 0.0072 |
| 2 | 1.5 | 0 | 2.315 | 1.350 | 25.27 | 14.33 | 0.0322 |
| 2 | 1.5 | 1 | 2.189 | 1.335 | 22.91 | 13.98 | 0.0301 |
| 3 | 6 | 0 | 6.152 | 2.166 | 43.45 | 11.90 | 0.0035 |
| 3 | 6 | 1 | 6.119 | 2.172 | 40.28 | 12.78 | 0.0030 |
| 4 | 6 | 0 | 6.718 | 2.054 | 47.70 | 9.83 | 0.0151 |
| 4 | 6 | 1 | 6.588 | 2.102 | 44.06 | 11.41 | 0.0134 |

Table 8: Simulation results of coupons’ promotional effects
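As a consistency check on Table 8, the promotional effect of Equation (46) can be recomputed from the reported means. The following Python sketch assumes that every setting uses the same number of simulation runs, so the ratio of sums in Equation (46) equals the ratio of the reported means; the function and variable names are hypothetical.

```python
def promotional_effect(n_trip, n_trip_baseline, v_redeemed):
    """rho: additional trips generated per unit of redeemed coupon value."""
    return (n_trip - n_trip_baseline) / v_redeemed


# (baseline trips, inattention flag, mean trips, mean redeemed value, reported rho)
TABLE_8 = [
    (1.5, 0, 1.638, 17.52, 0.0079),
    (1.5, 1, 1.619, 16.43, 0.0072),
    (1.5, 0, 2.315, 25.27, 0.0322),
    (1.5, 1, 2.189, 22.91, 0.0301),
    (6.0, 0, 6.152, 43.45, 0.0035),
    (6.0, 1, 6.119, 40.28, 0.0030),
    (6.0, 0, 6.718, 47.70, 0.0151),
    (6.0, 1, 6.588, 44.06, 0.0134),
]

# Absolute gap between the recomputed and the reported rho for every row.
gaps = [abs(promotional_effect(n, base, v) - rho)
        for base, _, n, v, rho in TABLE_8]
```

The recomputed values agree with the reported $\rho$ up to the rounding of the table entries, and in every pair of rows the setting with inattention has the smaller $\rho$.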
In this paper, we proposed an inattention mechanism on unawareness to explain the observed deviation of traveler coupon redemption behavior from utility maximization. The estimation results in Section 6 show that our model indeed leads to a better fit of the dataset compared with baseline models. Moreover, our simulation experiment in Section 7 shows that if such unawareness exists, it can lead to a considerable reduction in the promotional effects of coupons. Therefore, a service operator should be aware of travelers’ unawareness and take necessary actions. For example, when distributing coupons, the operator should send notifications to travelers to ensure that they are properly incentivized. Moreover, when a traveler’s forgetfulness is unavoidable, the operator should include the probability of coupon unawareness in the design of coupon distribution strategies.

The model developed in this study has several limitations worthy of further exploration. First, we did not obtain consistent parameter estimates of the inattention model between the case with only one coupon and general cases. We speculate that the independent coupon-level inattention mechanism employed here is inappropriate. In addition, our consideration of the attention state $S_a$ and transition $f_a$ is restricted to the first activation event $I_a$. In the future, we can include more information in $S_a$, such as the time since the most recent activation of each coupon, and consider more complicated transition dynamics $f_a$, such as the Hawkes process or even recurrent neural networks.

Second, we focused on the impact of unawareness in this study, but travelers may also exhibit deliberate attention. In fact, our model of coupon groups dictates a nested consideration structure similar to the one in the classic nested logit model [7]. However, as mentioned in Section 6, this nested structure violates regularity because it imposes strong correlations among coupons in the same group but requires independence of coupons from different groups even when these coupons are very similar. In the future, we can employ existing works on consideration sets to develop models that are effective in capturing travelers’ perceived homogeneity among coupons.

Third, given the limitations in computational power, we failed to extend the value clipping regularization to general cases, and our simple approximation was shown to be ineffective. Further work on developing tractable estimation algorithms for this model is needed.

Finally, our estimation results show that even after we take the impact of unawareness into account, traveler behavior still deviates from utility-maximizing decisions. However, it is questionable whether there are alternative decision mechanisms that are both theoretically justifiable and computationally tractable.

Acknowledgement
The author thanks Shenhao Wang and Xiang Song for their insightful comments.
References

[1] Jason Abaluck and Abi Adams. Discrete choice models with consideration sets: Identification from asymmetric cross-derivatives, 2016.
[2] Pieter Abbeel and Andrew Y Ng. Apprenticeship learning via inverse reinforcement learning. In Proceedings of the twenty-first international conference on Machine learning, page 1. ACM, 2004.
[3] Victor Aguirregabiria and Pedro Mira. Dynamic discrete choice structural models: A survey. Journal of Econometrics, 156(1):38–67, 2010.
[4] Gözen Başar and Chandra Bhat. A parameterized consideration set model for airport choice: an application to the San Francisco Bay Area. Transportation Research Part B: Methodological, 38(10):889–904, 2004.
[5] Kapil Bawa, Srini S Srinivasan, and Rajendra K Srivastava. Coupon attractiveness and coupon proneness: A framework for modeling coupon redemption. Journal of Marketing Research, pages 517–525, 1997.
[6] Moshe Ben-Akiva and Bruno Boccara. Discrete choice models with latent choice sets. International Journal of Research in Marketing, 12(1):9–24, 1995.
[7] Moshe E. Ben-Akiva and Steven R. Lerman. Discrete choice analysis: theory and application to travel demand, volume 9. MIT Press, 1985.
[8] Robert Blattberg, Thomas Buesing, Peter Peacock, and Subrata Sen. Identifying the deal prone segment. In Perspectives On Promotion And Database Marketing: The Collected Works of Robert C Blattberg, pages 79–87. World Scientific, 2010.
[9] M Keith Chen and Michael Sheldon. Dynamic pricing in a labor market: Surge pricing and flexible work on the Uber platform. In EC, page 455, 2016.
[10] Jeongwen Chiang, Siddhartha Chib, and Chakravarthi Narasimhan. Markov chain Monte Carlo and models of consideration set and parameter heterogeneity. Journal of Econometrics, 89(1-2):223–248, 1998.
[11] Chelsea Finn, Paul Christiano, Pieter Abbeel, and Sergey Levine. A connection between generative adversarial networks, inverse reinforcement learning, and energy-based models. arXiv preprint arXiv:1611.03852, 2016.
[12] Justin Fu, Katie Luo, and Sergey Levine. Learning robust rewards with adversarial inverse reinforcement learning. arXiv preprint arXiv:1710.11248, 2017.
[13] Xavier Gabaix. A sparsity-based model of bounded rationality. The Quarterly Journal of Economics, 129(4):1661–1710, 2014.
[14] Michelle Sovinsky Goeree. Limited information and advertising in the US personal computer industry. Econometrica, 76(5):1017–1074, 2008.
[15] Füsun Gönül and Kannan Srinivasan. Estimating the impact of consumer expectations of coupons on purchase behavior: A dynamic structural model. Marketing Science, 15(3):262–279, 1996.
[16] James J Heckman and Salvador Navarro. Dynamic discrete choice and dynamic treatment effects. Journal of Econometrics, 136(2):341–396, 2007.
[17] Jonathan Ho and Stefano Ermon. Generative adversarial imitation learning. In Advances in Neural Information Processing Systems, pages 4565–4573, 2016.
[18] Elisabeth Honka. Quantifying search and switching costs in the US auto insurance industry. The RAND Journal of Economics, 45(4):847–884, 2014.
[19] Elisabeth Honka, Ali Hortaçsu, and Maria Ana Vitorino. Advertising, consumer awareness, and choice: Evidence from the US banking industry. The RAND Journal of Economics, 48(3):611–646, 2017.
[20] V Joseph Hotz and Robert A Miller. Conditional choice probabilities and the estimation of dynamic models. The Review of Economic Studies, 60(3):497–529, 1993.
[21] J Jeffrey Inman and Leigh McAlister. Do coupon expiration dates affect consumer behavior? Journal of Marketing Research, pages 423–428, 1994.
[22] Sudarsan Jayasingh and Uchenna Cyril Eze. An empirical analysis of consumer behavioural intention towards mobile coupons in Malaysia. International Journal of Business and Information, 4(2), 2015.
[23] Jun B Kim, Paulo Albuquerque, and Bart J Bronnenberg. Online demand under limited consumer search. Marketing Science, 29(6):1001–1023, 2010.
[24] Diederik Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[25] Donald R Lichtenstein, Richard G Netemeyer, and Scot Burton. Distinguishing coupon proneness from value consciousness: An acquisition-transaction utility theory perspective. The Journal of Marketing, pages 54–67, 1990.
[26] Mohamed Salah Mahmoud, Adam Weiss, and Khandker Nurul Habib. Myopic choice or rational decision making? An investigation into mode choice preference structures in competitive modal arrangements in a multimodal urban area, the city of Toronto. Canadian Journal of Civil Engineering, 43(5):420–428, 2016.
[27] Charles F Manski. The structure of random utility models. Theory and Decision, 8(3):229–254, 1977.
[28] Paola Manzini and Marco Mariotti. Stochastic choice and consideration sets. Econometrica, 82(3):1153–1176, 2014.
[29] Yusufcan Masatlioglu and Daisuke Nakajima. Choice by iterative search. Theoretical Economics, 8(3):701–728, 2013.
[30] Yusufcan Masatlioglu, Daisuke Nakajima, and Erkut Y Ozbay. Revealed attention. In Behavioral Economics of Preferences, Choices, and Happiness, pages 495–522. Springer, 2016.
[31] Andrew Y Ng, Stuart J Russell, et al. Algorithms for inverse reinforcement learning. In ICML, pages 663–670, 2000.
[32] Imke Reimers and Claire Xie. Do coupons expand or cannibalize revenue? Evidence from an e-market. Management Science, 2018.
[33] John Rust. Optimal replacement of GMC bus engines: An empirical model of Harold Zurcher. Econometrica: Journal of the Econometric Society, pages 999–1033, 1987.
[34] Stephan Seiler. The impact of search costs on consumer behavior: A dynamic approach. Quantitative Marketing and Economics, 11(2):155–203, 2013.
[35] Allan D Shocker, Moshe Ben-Akiva, Bruno Boccara, and Prakash Nedungadi. Consideration set influences on consumer decision-making and choice: Issues, models, and suggestions. Marketing Letters, 2(3):181–197, 1991.
[36] Christopher A Sims. Implications of rational inattention.
Journal of monetary Economics ,50(3):665–690, 2003.[37] Joffre Swait and Moshe Ben-Akiva. Empirical test of a constrained choice discrete model: modechoice in sao paulo, brazil.
Transportation Research Part B: Methodological , 21(2):103–115, 1987.[38] Ronald W Ward and James E Davis. Coupon redemption.
Journal of Advertising Research ,18(4):51–58, 1978.[39] Dennis J Zhang, Hengchen Dai, Lingxiu Dong, Fangfang Qi, Nannan Zhang, Xiaofei Liu, andZhongyi Liu. How does dynamic pricing affect customer behavior on retailing platforms? evidencefrom a large randomized experiment on alibaba. 2017.[40] Brian D Ziebart, Andrew L Maas, J Andrew Bagnell, and Anind K Dey. Maximum entropyinverse reinforcement learning. In
AAAI , volume 8, pages 1433–1438. Chicago, IL, USA, 2008.30
Appendix
A.1 Value functions in general cases
In general, Assumption 2 does not hold. In this section, we derive value functions for both the discrete-time setting with a general time unit $t$ and the continuous-time setting. We also show that the two are equivalent as the time unit $t$ diminishes to 0.

First, consider the unit-time trip generation rate $\bar{\lambda}_j$: in the discrete-time setting with time unit $t$, the trip generation probability in each time step is $\bar{\lambda}_j t$, while in the continuous-time setting, the time gap $\delta_t$ between consecutive trips follows the exponential distribution $\delta_t \sim \mathrm{Exp}(1/\bar{\lambda}_j)$.

When Assumption 2 does not hold, we need to consider state transitions of coupon sets over time periods of varying length. Therefore, we consider the following generalization of the coupon state transition function $f$:

\[
\begin{aligned}
f(C, c, t) &= f(f(C, t), c), \\
f(C, t) &= \{ f_c(c, t) \mid c \in C \}, \\
f_c(\langle v, T, n \rangle, t) &=
\begin{cases}
\langle v, T - t, n \rangle & v, n > 0,\ T \ge t, \\
c_0 & \text{otherwise},
\end{cases} \\
f(C, \langle v, T, n \rangle) &=
\begin{cases}
(C / \langle v, T, n \rangle) \cup \langle v, T, n - 1 \rangle & n > 1, \\
C / \langle v, T, n \rangle & \text{otherwise}.
\end{cases}
\end{aligned}
\tag{50}
\]

Because a coupon can now expire before a trip ends, we include the state transition $f_c$ within the formulation of individual trips. However, such a transition depends on the realized trip time and ultimately on the travel mode. Here, we use $t_x$ and $t'_x$ to represent the vectors of estimated and realized trip times, respectively. Because these new variables make the problem more complicated, we introduce the following assumptions for a simple characterization.

Assumption 12. (a) The time discount factor $\gamma$ equals 1 (there is no time discount effect). (b) $\sum_{i \neq 1} P(i) V(f(C, t'_{xi})) = (1 - P(1)) V(f(C, \bar{t}))$, where $\bar{t} \in \mathbb{R}_+$ can be interpreted as the expected trip time of the alternative modes.

Now, the value functions in the discrete-time setting with time unit $t$ can be given as follows.
\[
\begin{aligned}
V^{\pi}_{t}(C) ={}& (1 - \bar{\lambda}_j t)\, V^{\pi}_{t}(f(C, t)) + \bar{\lambda}_j t\, E_X \big\{ V^{\pi}_{t}(f(C, \lceil \bar{t}/t \rceil \cdot t)) \\
& + \pi_{xj}(p_x, u_{xj}, C) \big[ u_{xj} - V^{\pi}_{t}(f(C, \lceil \bar{t}/t \rceil \cdot t)) + E_{X'|X} V^{\pi}_{c,t}(p'_x, f(C, \lceil t'_x/t \rceil \cdot t)) \big] \\
& - \pi_{xj}(p_x, u_{xj}, C_0)\, u_{xj} \big\}, \\
V^{\pi}_{c,t}(p'_x, C) ={}& \sum_{c \in C} \pi_c(c \mid p'_x, C) \big[ r(p'_x, c) + V^{\pi}_{t}(f(C, c)) \big], \\
V^{\pi}_{t}(C_0) ={}& 0.
\end{aligned}
\tag{51}
\]

On the other hand, the value functions in the continuous-time setting can be given as follows.

\[
\begin{aligned}
\bar{V}^{\pi}(C) ={}& E_{\delta_t | \bar{\lambda}_j} \big\{ E_X \big\{ \bar{V}^{\pi}(f(C, \bar{t} + \delta_t)) \\
& + \pi_{xj}(p_x, u_{xj}, C) \big[ u_{xj} - \bar{V}^{\pi}(f(C, \bar{t} + \delta_t)) + E_{X'|X} \bar{V}^{\pi}_{c}(p'_x, f(C, t'_x + \delta_t)) \big] \\
& - \pi_{xj}(p_x, u_{xj}, C_0)\, u_{xj} \big\} \big\}, \\
\bar{V}^{\pi}_{c}(p'_x, C) ={}& \sum_{c \in C} \pi_c(c \mid p'_x, C) \big[ r(p'_x, c) + \bar{V}^{\pi}(f(C, c)) \big], \\
\bar{V}^{\pi}(C_0) ={}& 0.
\end{aligned}
\tag{52}
\]

The similarity between the above two sets of equations leads to the following equivalence result:

Proposition 3. The value function $V^{\pi}_{t}$ in the discrete-time setting converges to the value function $\bar{V}^{\pi}$ in the continuous-time setting as the time step size $t$ diminishes to 0:

\[
\bar{V}^{\pi}(C) = \lim_{t \to 0} V^{\pi}_{t}(C), \quad \forall C \in \mathcal{C}.
\tag{53}
\]

Proof of Proposition 3.
First, let $\bar{Q}^{\pi}(C, \delta_t)$ be

\[
\begin{aligned}
E_X \big\{ & \bar{V}^{\pi}(f(C, \bar{t} + \delta_t)) \\
& + \pi_{xj}(p_x, u_{xj}, C) \big[ u_{xj} - \bar{V}^{\pi}(f(C, \bar{t} + \delta_t)) + E_{X'|X} \bar{V}^{\pi}_{c}(p'_x, f(C, t'_x + \delta_t)) \big] \\
& - \pi_{xj}(p_x, u_{xj}, C_0)\, u_{xj} \big\}.
\end{aligned}
\tag{54}
\]

Then we have

\[
\begin{aligned}
-\frac{d\{\bar{V}^{\pi}(f(C, t))\}}{dt}
&= -\frac{d}{dt} \int_{0}^{\infty} \bar{\lambda}_j e^{-\bar{\lambda}_j \delta_t}\, \bar{Q}^{\pi}(C, t + \delta_t)\, d\delta_t \\
&= -\frac{d}{dt} \int_{t}^{\infty} \bar{\lambda}_j e^{-\bar{\lambda}_j (\delta_t - t)}\, \bar{Q}^{\pi}(C, \delta_t)\, d\delta_t \\
&= \bar{\lambda}_j \bar{Q}^{\pi}(C, t) - \int_{t}^{\infty} \bar{\lambda}_j^{2} e^{-\bar{\lambda}_j (\delta_t - t)}\, \bar{Q}^{\pi}(C, \delta_t)\, d\delta_t \\
&= \bar{\lambda}_j \big[ \bar{Q}^{\pi}(C, t) - \bar{V}^{\pi}(f(C, t)) \big].
\end{aligned}
\tag{55}
\]

If we let $t = 0$ in the above equations, we have

\[
-\frac{d\{\bar{V}^{\pi}(f(C, t))\}}{dt} \bigg|_{t=0} = \bar{\lambda}_j \big[ \bar{Q}^{\pi}(C, 0) - \bar{V}^{\pi}(C) \big].
\tag{56}
\]

Secondly, if we let $Q^{\pi}_{t}(C, \delta_t)$ be

\[
\begin{aligned}
E_X \big\{ & V^{\pi}_{t}(f(C, \delta_t + \lceil \bar{t}/t \rceil \cdot t)) \\
& + \pi_{xj}(p_x, u_{xj}, C) \big[ u_{xj} - V^{\pi}_{t}(f(C, \delta_t + \lceil \bar{t}/t \rceil \cdot t)) + E_{X'|X} V^{\pi}_{c,t}(p'_x, f(C, \delta_t + \lceil t'_x/t \rceil \cdot t)) \big] \\
& - \pi_{xj}(p_x, u_{xj}, C_0)\, u_{xj} \big\},
\end{aligned}
\tag{57}
\]

we have

\[
\begin{aligned}
-\frac{d\{\lim_{t \to 0} V^{\pi}_{t}(f(C, t))\}}{dt} \bigg|_{t=0}
&= \lim_{t \to 0} \frac{V^{\pi}_{t}(C) - V^{\pi}_{t}(f(C, t))}{t} \\
&= \lim_{t \to 0} \bar{\lambda}_j \big( Q^{\pi}_{t}(C, 0) - V^{\pi}_{t}(f(C, t)) \big) \\
&= \bar{\lambda}_j \big( \lim_{t \to 0} Q^{\pi}_{t}(C, 0) - \lim_{t \to 0} V^{\pi}_{t}(C) \big).
\end{aligned}
\tag{58}
\]

Because

\[
\begin{aligned}
\lim_{t \to 0} Q^{\pi}_{t}(C, 0) = E_X \big\{ & \lim_{t \to 0} V^{\pi}_{t}(f(C, \bar{t})) \\
& + \pi_{xj}(p_x, u_{xj}, C) \big[ u_{xj} - \lim_{t \to 0} V^{\pi}_{t}(f(C, \bar{t})) + E_{X'|X} \lim_{t \to 0} V^{\pi}_{c,t}(p'_x, f(C, t'_x)) \big] \\
& - \pi_{xj}(p_x, u_{xj}, C_0)\, u_{xj} \big\},
\end{aligned}
\tag{59}
\]

we can see that equations (56) and (58) refer to the same differential equation. Given the boundary condition $\lim_{t \to 0} V^{\pi}_{t}(C_0) = \bar{V}^{\pi}(C_0) = 0$, the solution to the differential equation is unique, and therefore $\lim_{t \to 0} V^{\pi}_{t}(C) = \bar{V}^{\pi}(C)$.

A.2 Proofs
A.2.1 Proof of Corollary 1
Proof of Corollary 1.
By replacing the coupon set $C$ in equations (7), (9) and (10) with the default set $C_0$, we have

\[
\begin{aligned}
U^{\pi}(C_0) &= (1 - \lambda_j) \gamma U^{\pi}(C_0) + \lambda_j E_X U^{\pi}_{xj}(p_x, u_{xj}, C_0), \\
U^{\pi}_{xj}(p_x, u_{xj}, C_0) &= (1 - \pi_{xj}(p_x, u_{xj}, C_0)) \gamma U^{\pi}(C_0) + \pi_{xj}(p_x, u_{xj}, C_0) \big[ u_{xj} + E_{X'|X} U^{\pi}_{c}(p'_x, C_0) \big] + u_{x\tilde{1}j}, \\
U^{\pi}_{c}(p'_x, C_0) &= \gamma U^{\pi}(C_0),
\end{aligned}
\tag{60}
\]

which we can simplify to

\[
\begin{aligned}
U^{\pi}(C_0) &= (1 - \lambda_j) \gamma U^{\pi}(C_0) + \lambda_j E_X \big\{ (1 - \pi_{xj}(p_x, u_{xj}, C_0)) \gamma U^{\pi}(C_0) + \pi_{xj}(p_x, u_{xj}, C_0) \big[ u_{xj} + \gamma U^{\pi}(C_0) \big] + u_{x\tilde{1}j} \big\} \\
&= \gamma U^{\pi}(C_0) + \lambda_j E_X \big[ u_{x\tilde{1}j} + \pi_{xj}(p_x, u_{xj}, C_0)\, u_{xj} \big] \\
&= \frac{1}{1 - \gamma} \lambda_j E_X \big[ u_{x\tilde{1}j} + \pi_{xj}(p_x, u_{xj}, C_0)\, u_{xj} \big].
\end{aligned}
\tag{61}
\]

A.2.2 Proof of Proposition 1
Proof of Proposition 1.
First, it is easy to see that $U^{\pi}_{c}(p'_x, C_0) = \gamma U^{\pi}(f(C_0)) = \gamma U^{\pi}(C_0)$. Now, by the definition of $V^{\pi}$ and $V^{\pi}_{c}$, and equations (7), (9) and (10), we have

\[
\begin{aligned}
V^{\pi}_{c}(p'_x, C) &= \sum_{c \in C} \pi_c(c \mid p'_x, C) \big[ r(p'_x, c) + \gamma U^{\pi}(f(C, c)) \big] - \gamma U^{\pi}(f(C_0)) \\
&= \sum_{c \in C} \pi_c(c \mid p'_x, C) \big[ r(p'_x, c) + \gamma V^{\pi}(f(C, c)) \big],
\end{aligned}
\tag{62}
\]

and

\[
\begin{aligned}
V^{\pi}(C) ={}& \gamma \big[ U^{\pi}(f(C)) - U^{\pi}(f(C_0)) \big] \\
& + \lambda_j E_X \big\{ \pi_{xj}(p_x, u_{xj}, C) \big[ u_{xj} + E_{X'|X} U^{\pi}_{c}(p'_x, C) - \gamma U^{\pi}(f(C)) \big] \\
& \qquad - \pi_{xj}(p_x, u_{xj}, C_0) \big[ u_{xj} + E_{X'|X} U^{\pi}_{c}(p'_x, C_0) - \gamma U^{\pi}(f(C_0)) \big] \big\} \\
={}& \gamma V^{\pi}(f(C)) + \lambda_j E_X \big\{ \pi_{xj}(p_x, u_{xj}, C) \big[ u_{xj} + E_{X'|X} V^{\pi}_{c}(p'_x, C) - \gamma V^{\pi}(f(C)) \big] \\
& \qquad - \pi_{xj}(p_x, u_{xj}, C_0)\, u_{xj} \big\}.
\end{aligned}
\tag{63}
\]

A.2.3 Proof of Proposition 2

Lemma 1. If $x \ge y$, then we have

\[
I(x \ge 0)(x - y) \ge \max(x, 0) - \max(y, 0) \ge I(y \ge 0)(x - y),
\tag{64}
\]

where $I(\cdot)$ is the indicator function.

Proof of Lemma 1. Since $x \ge y$, we have $y \ge 0 \Rightarrow x \ge 0$, and $x < 0 \Rightarrow y < 0$. Therefore

\[
\begin{aligned}
\max(x, 0) - \max(y, 0) &= I(y \ge 0) \big[ \max(x, 0) - \max(y, 0) \big] + I(y < 0) \big[ \max(x, 0) - \max(y, 0) \big] \\
&= I(y \ge 0)(x - y) + I(y < 0) \max(x, 0) \\
&\ge I(y \ge 0)(x - y).
\end{aligned}
\tag{65}
\]

Similarly, we have

\[
\begin{aligned}
\max(x, 0) - \max(y, 0) &= I(x \ge 0) \big[ \max(x, 0) - \max(y, 0) \big] + I(x < 0) \big[ \max(x, 0) - \max(y, 0) \big] \\
&= I(x \ge 0) \big[ x - \max(y, 0) \big] \\
&\le I(x \ge 0)(x - y).
\end{aligned}
\tag{66}
\]

Proof of Proposition 2.
First, recall that in the coupon selection stage, the default action $c_0$ is always available. Therefore the ex post value function $V^{*}_{c}(p'_x, C)$ has a lower bound

\[
V^{*}_{c}(p'_x, C) \ge r(p'_x, c_0) + V^{*}(f(C, c_0)) = V^{*}(f(C)).
\tag{67}
\]

This further leads to

\[
u_{xj} + E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \ge u_{xj}.
\tag{68}
\]

Now, applying Lemma 1 with $x = u_{xj} + E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C))$ and $y = u_{xj}$, we have

\[
\begin{aligned}
\max(u_{xj} + E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C)), 0) - \max(u_{xj}, 0) &\ge I(u_{xj} \ge 0) \big[ E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \big], \\
\max(u_{xj} + E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C)), 0) - \max(u_{xj}, 0) &\le I(u_{xj} + E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \ge 0) \\
&\quad \cdot \big[ E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \big].
\end{aligned}
\tag{69}
\]

Putting the above inequalities back into the ex ante value function in equation (14), we have

\[
\begin{aligned}
V^{*}(C) &\ge V^{*}(f(C)) + \lambda_j E_X I(u_{xj} \ge 0) \big[ E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \big], \\
V^{*}(C) &\le V^{*}(f(C)) + \lambda_j E_X I(u_{xj} + E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \ge 0) \big[ E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \big].
\end{aligned}
\tag{70}
\]

Since $\lambda_j E_X I(u_{xj} + E_{X'|X} V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \ge 0)$ is exactly the service selection probability $P(x' = 1 \mid C, \lambda_j, P_j)$, the inequalities above can be further simplified to

\[
\begin{aligned}
V^{*}(C) &\ge V^{*}(f(C)) + E_{x', p'_x | C_0} I(x' = 1) \big[ V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \big], \\
V^{*}(C) &\le V^{*}(f(C)) + E_{x', p'_x | C} I(x' = 1) \big[ V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \big].
\end{aligned}
\tag{71}
\]

Notice that in the inequalities above we hide the fact that $x'$ and $p'_x$ are conditional on $\lambda_j$, $P_j$ and $P'_j$. Now, given the boundary condition $V^{*}(C_0) = V^{L}(C_0)$, we can show by induction that for every $C \in \mathcal{C}$

\[
\begin{aligned}
V^{*}(C) &\ge V^{*}(f(C)) + E_{x', p'_x | C_0} I(x' = 1) \big[ V^{*}_{c}(p'_x, C) - V^{*}(f(C)) \big] \\
&= E_{x', p'_x | C_0} I(x' = 0) V^{*}(f(C)) + E_{x', p'_x | C_0} I(x' = 1) V^{*}_{c}(p'_x, C) \\
&\ge E_{x', p'_x | C_0} I(x' = 0) V^{L}(f(C)) + E_{x', p'_x | C_0} I(x' = 1) V^{L}_{c}(p'_x, C) \\
&= V^{L}(f(C)) + E_{x', p'_x | C_0} I(x' = 1) \big[ V^{L}_{c}(p'_x, C) - V^{L}(f(C)) \big] \\
&= V^{L}(C).
\end{aligned}
\tag{72}
\]

Similarly, we have $V^{*}(C) \le V^{U}(C), \forall C \in \mathcal{C}$, given $V^{*}(C_0) = V^{U}(C_0)$.

A.3 Notation table

| Group | Description | Symbol |
|---|---|---|
| coupon related | face value | $v$ |
| | time to expire | $T$ |
| | number of coupons in the group | $n$ |
| | coupon | $\tilde{c}$ |
| | coupon group | $c$ |
| | default (zero-valued) coupon group | $c_0$ |
| | coupon set | $C$ |
| | awareness coupon subset | $C^a$ |
| | set of all coupon sets | $\mathcal{C}$ |
| | set of all awareness subsets of set $C$ | $\mathcal{A}(C)$ |
| model related | trip demand generation rate of traveler $j$ | $\lambda_j$ |
| | mean of log fare $\log(p'_x)$ of traveler $j$ | $\mu_{pj}$ |
| | standard deviation of log fare $\log(p'_x)$ of traveler $j$ | $\sigma_{pj}$ |
| | estimated trip fare | $p_x$ |
| | vector of trip utilities from different travel modes | $u_{xj}$ |
| | relative utility gain in taking the target mobility service | $u_{xj}$ |
| | mode selection of travelers | $i$ |
| | whether there is a trip served by the target mobility service | $x'$ |
| | realized trip fare | $p'_x$ |
| | state of attention | $S^a$ |
| | coupon activation record function | $I^a(\cdot)$ |
| | coupon redemption utility function | $r(\cdot, \cdot)$ |
| | state transition functions of coupon set | $f(\cdot, \cdot)$ |
| | state transition functions of coupon | $f_c(\cdot)$ |
| | state transition function of attention | $f_a(\cdot, \cdot, \cdot)$ |
| | discount factor | $\gamma$ |
| value function related | mode selection policy of traveler $j$ | $\pi_{xj}(\cdot, \cdot, \cdot)$ |
| | coupon selection policy of travelers | $\pi_c(\cdot, \cdot)$ |
| | expected accumulated utility under policy $\pi$ | $U^{\pi}(\cdot)$ |
| | utility gain from coupon sets under policy $\pi$ | $V^{\pi}(\cdot)$ |
| | optimal utility gain from coupon sets | $V^{*}(\cdot)$ |
| | lower & upper bound of $V^{*}$ | $V^{L}(\cdot), V^{U}(\cdot)$ |
| | approximated utility gain from coupon sets | $\hat{V}(\cdot)$ |
| | estimation of utility gain from coupon sets (random variable) | $V(\cdot)$ |
| | estimation error (random variable) | $\varepsilon_V(\cdot)$ |
| others | dataset | $D$ |
| | model parameters | $\theta$ |
| | sigmoid function | $\sigma(\cdot)$ |
| | indicator function | $I(\cdot)$ |

Table 9: Summary of major notations

A.4 Estimation results with the whole dataset
Column groups: Inattention; Utility Model and Extensions; Evaluation.

| Unaware? | θ_a | θ_as | Clip? | Extra? | Scaled? | θ_ε | θ_V | θ_v | LL | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | N/A | N/A | 0 | 0 | 0 | 0.064 | 0.852 | N/A | -0.638 | 0.677 |
| 0 | N/A | N/A | 0 | 1 | 0 | 0.177 | 0.114 | 0.669 | -0.594 | 0.708 |
| 0 | N/A | N/A | 0 | 0 | 1 | 1.144 | 0.665 | N/A | -0.639 | 0.663 |
| 0 | N/A | N/A | 0 | 1 | 1 | 3.676 | 0.061 | 0.712 | -0.593 | 0.706 |
| 1 | 0.392 | 2.444 | 0 | 0 | 0 | 0.379 | 0.785 | N/A | -0.560 | 0.701 |
| 1 | 0.613 | 3.100 | 0 | 1 | 0 | 0.361 | 0.440 | 0.327 | -0.547 | 0.716 |
| 1 | 0.385 | 2.600 | 0 | 0 | 1 | 6.135 | 0.760 | N/A | -0.555 | 0.701 |
| 1 | 0.594 | 3.841 | 0 | 1 | 1 | 6.429 | 0.395 | 0.365 | -0.542 | 0.716 |
| 1 | 0.377 | 1.722 | 1 | 0 | 0 | 0.217 | 0.913 | N/A | -0.549 | 0.714 |
| 1 | 0.438 | 1.684 | 1 | 1 | 0 | 0.293 | 0.534 | 0.289 | -0.543 | 0.718 |
| 1 | 0.370 | 1.732 | 1 | 0 | 1 | 4.814 | 0.871 | N/A | -0.549 | 0.714 |
| 1 | 0.437 | 1.691 | 1 | 1 | 1 | 6.479 | 0.505 | 0.289 | -0.543 | 0.718 |

Table 10: Estimated parameters and performance in the case with only one coupon on the whole dataset
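Column groups: Inattention; Utility Model and Extensions; Evaluation.

| Unaware? | θ_a | θ_as | Clip? | Extra? | θ_ε | θ_V | θ_v | LL | Accuracy |
|---|---|---|---|---|---|---|---|---|---|
| 0 | N/A | N/A | 0 | 0 | 0.124 | 0.802 | N/A | -0.832 | 0.709 |
| 0 | N/A | N/A | 0 | 1 | 0.242 | 0.329 | 0.466 | -0.795 | 0.725 |
| 1 | -0.490 | 2.530 | 0 | 0 | 0.346 | 0.724 | N/A | -0.709 | 0.699 |
| 1 | -0.448 | 2.723 | 0 | 1 | 0.447 | 0.490 | 0.245 | -0.693 | 0.705 |
| 1 | -0.700 | 1.561 | 1 | 0 | 0.377 | 0.708 | N/A | -0.722 | 0.697 |
| 1 | -0.664 | 1.577 | 1 | 1 | 0.504 | 0.482 | 0.219 | -0.710 | 0.703 |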
Table 11: Estimated parameters and performance on subset |A ( C ) | ≤| ≤
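
As a small computational companion to the generalized coupon state transition in equation (50) of Appendix A.1, the following Python sketch shows one way the three maps could be coded. It is only an illustration under stated assumptions, not the implementation used in this study: the names `CouponGroup`, `C0`, `age_group`, `age_set`, `redeem` and `transition` are hypothetical, the default group $c_0$ is encoded as an all-zero tuple, redeeming $c_0$ is assumed to leave the set unchanged, and in $f(C, c, t)$ the redeemed group is assumed to be the aged copy of $c$.

```python
from typing import FrozenSet, NamedTuple


class CouponGroup(NamedTuple):
    v: float  # face value
    T: float  # time to expire
    n: int    # number of coupons in the group


# Hypothetical encoding of the default (zero-valued) coupon group c_0.
C0 = CouponGroup(0.0, 0.0, 0)


def age_group(c: CouponGroup, t: float) -> CouponGroup:
    """f_c(<v, T, n>, t): shorten the time to expire, or fall back to c_0."""
    if c.v > 0 and c.n > 0 and c.T >= t:
        return c._replace(T=c.T - t)
    return C0


def age_set(C: FrozenSet[CouponGroup], t: float) -> FrozenSet[CouponGroup]:
    """f(C, t): age every coupon group in the set by t."""
    return frozenset(age_group(c, t) for c in C)


def redeem(C: FrozenSet[CouponGroup], c: CouponGroup) -> FrozenSet[CouponGroup]:
    """f(C, <v, T, n>): use one coupon from group c."""
    if c == C0:
        # Assumption: choosing the default action does not change the set.
        return C
    rest = C - {c}
    if c.n > 1:
        return rest | {c._replace(n=c.n - 1)}
    return rest


def transition(C: FrozenSet[CouponGroup], c: CouponGroup, t: float) -> FrozenSet[CouponGroup]:
    """f(C, c, t) = f(f(C, t), c), redeeming the aged copy of c (assumption)."""
    return redeem(age_set(C, t), age_group(c, t))


if __name__ == "__main__":
    C = frozenset({CouponGroup(5.0, 3.0, 2), CouponGroup(2.0, 1.0, 1)})
    print(sorted(age_set(C, 2.0)))
    print(sorted(transition(C, CouponGroup(5.0, 3.0, 2), 2.0)))
```

In this sketch, a group whose time to expire is shorter than the elapsed time collapses to `C0`, which mirrors the "otherwise" branch of $f_c$; using a `frozenset` keeps the coupon set hashable, so it could serve as a state key in a tabulated value function.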