Finish Them!: Pricing Algorithms for Human Computation
Yihan Gao, University of Illinois (UIUC), Urbana, Illinois, [email protected]
Aditya Parameswaran, University of Illinois (UIUC), Urbana, Illinois, [email protected]
ABSTRACT
Given a batch of human computation tasks, a commonly ignored aspect is how the price (i.e., the reward paid to human workers) of these tasks must be set or varied in order to meet latency or cost constraints. Often, the price is set up-front and not modified, leading to either a much higher monetary cost than needed (if the price is set too high), or to a much larger latency than expected (if the price is set too low). Leveraging a pricing model from prior work, we develop algorithms to optimally set and then vary price over time in order to meet (a) a user-specified deadline while minimizing total monetary cost, or (b) a user-specified monetary budget constraint while minimizing total elapsed time. We leverage techniques from decision theory (specifically, Markov Decision Processes) for both these problems, and demonstrate that our techniques lead to up to 30% reduction in cost over schemes proposed in prior work. Furthermore, we develop techniques to speed up the computation, enabling users to leverage the price setting algorithms on-the-fly.
1. INTRODUCTION
Crowdsourcing is often used to process and reason about unstructured data such as images, videos, and text. The data thus generated is typically used as training data for machine learning algorithms in applications such as content moderation (i.e., determining if images are suitable to be viewed by a general audience), spam detection, search relevance estimation, information extraction, and entity resolution. In fact, all of the following companies employ crowdsourcing frequently at a large scale to repeatedly process unstructured data: Google [6], Ebay [2], Microsoft [5], LinkedIn [7], Facebook [7], Yahoo! [11], Twitter [1], Cisco [6], and Yelp [8]. Even though crowdsourcing is often used in industry and academia, and has been the subject of many academic papers studying trade-offs between cost, latency, and accuracy [18, 26, 37, 38, 39], there is little to no work on task pricing and its impact on overall cost and latency: that is, how the price of tasks (i.e., the monetary reward paid to workers on completion) must be set or varied in order to meet cost or latency constraints. Often, the price is set up-front and not modified, leading to either a much higher monetary cost than needed (if the price is set too high), or to a much larger latency
than expected (if the price is set too low). As a result, anecdotally, pricing is seen as somewhat of a “dark art”. In this paper, we wish to address the following question:
Given that we have n fixed tasks, how should we vary their price or reward over time so that they get completed by a certain deadline at the least cost possible? Intuitively, it seems that we may want to start with a low price initially, and then increase it gradually as it gets closer to the deadline. However, there has been no work demonstrating that such strategies will indeed yield good results in practice. Furthermore, there are a number of additional complications, even given this very simple scheme:
• What should we price tasks at initially?
• How can we adapt our price setting to the rate at which tasks are picked up? What if tasks get picked up very quickly at the initial price; should we lower the price, should we keep it the same, or should we increase it? What if the opposite happens — that is, tasks get picked up very slowly at the initial price?
• At what time points should we increase the price? Increasing it too frequently may lead to computationally more expensive decision making (as we will see subsequently), but increasing it too infrequently may result in much higher costs.
• At what granularities do we increase the price, and how much does this affect overall cost?
• Should we price all the tasks the same, or should we price tasks differently?
• What if we had a fixed budget, and instead wanted to reduce total latency? Would similar techniques apply then? Would varying price help at all?
• How do we ensure that our pricing schemes can be computed within a reasonable time, and how can we speed them up?
• How are our algorithms impacted by inaccuracies in estimates of the marketplace dynamics?
In prior work, Faridani et al. [17] develop a model for latency in crowdsourcing applications based on Non-Homogeneous Poisson Processes. They then use this model to describe a simple scheme based on binary search for pricing tasks to complete by a deadline. However, their scheme is not optimal, that is, it wastes far too much monetary cost.
In this paper, we leverage their model and instead focus on the optimization problem of minimizing cost while meeting the deadline with high probability. Overall, our techniques yield rich dividends — we get up to a 30% reduction in cost as compared to their scheme on realistic crowdsourcing workloads. This represents a significant reduction in cost, especially for users who run large crowdsourcing workloads with strict deadlines.
In this paper, we develop algorithms for two optimization problems, given a set of tasks: first, minimizing cost while meeting time requirements, and second, minimizing latency while meeting monetary budget requirements. For the first, we develop an algorithm based on decision theory that gives us near-optimal results. For the second, we develop a solution that uses linear programming, which can be shown to be optimal under some assumptions. A crucial concern for us is that the computation be as little as possible, and we propose various speed-up techniques for this purpose.
The contributions of this paper are as follows:
• We formally describe the two problems that we study in this paper in Section 2.
• We develop optimized pricing algorithms that meet a fixed time deadline in Section 3. Since these algorithms could be computationally expensive, we describe techniques to reduce the complexity of these algorithms.
• We develop optimized pricing algorithms that meet a fixed monetary cost budget in Section 4.
• We demonstrate that our pricing algorithms achieve a reduction in cost of up to 30% over prior work on simulations with real data from a crowdsourcing marketplace, as well as live experiments on the same marketplace, in Section 5. Furthermore, we demonstrate that the algorithms are remarkably robust to errors in the estimates of parameters of the tasks and the marketplace.
We cover related work in Section 7 and conclude in Section 8.
2. PRELIMINARIES
In this section, we describe the basic model that we will leverage to design optimized pricing algorithms. We operate on a crowdsourcing marketplace, such as Mechanical Turk [3]. In any crowdsourcing marketplace, users (or requesters) post tasks, often many at a time, and set a monetary price or reward for them. At any point, there are many tasks on offer in the marketplace. Human workers arrive at the marketplace at any time, and can leave at any time. When on the marketplace, workers can choose to work on any of the available tasks. They are allowed to work on a single task at a time. Once they complete a task, they receive the reward or price assigned for the task by the requester. In a marketplace, the reward of each task is positively correlated with the completion rate: the higher the reward, the shorter the completion time. However, in order to determine the best trade-off between cost and completion time, this relationship must be precisely quantified. For example, we must be able to answer questions like: if we adjust the reward per task from one value to another, how much do we gain in terms of task completion rate? To answer these questions, we need a formal model for reasoning about the crowdsourcing marketplace. In previous work, Faridani et al. [17] studied the problem of modeling crowdsourcing marketplace dynamics; the dynamics are modeled using two independent processes: a Non-Homogeneous Poisson Process is used to model worker arrivals in the market, and a
Discrete Choice Model is used to model how workers choose between tasks in the marketplace. We adopt the same model in this paper, and focus instead on the optimal pricing problem. To enable this paper to be self-contained, we describe the worker arrival model in Section 2.1, and the task choice model in Section 2.2. These mathematical models will be used to define the pricing problem formally in Section 2.3.
Faridani et al. [17] show that the arrival of workers in a crowdsourcing marketplace follows a Non-Homogeneous Poisson Process (NHPP). Note that the standard Poisson process is commonly used to characterize the counting process of stochastically occurring events. The Poisson process has a fixed rate λ. A NHPP is a generalization of the Poisson process, with a rate parameter λ(t), a function of time [44]. In a NHPP, the number of events that occur during any period of time [S, T] follows a Poisson distribution:

N[S, T] ∼ Pois(· | λ = ∫_{t=S}^{T} λ(t) dt)    (1)

where Pois(· | λ) refers to a Poisson distribution with mean λ. Estimating the arrival-rate function λ(t) of a NHPP is more difficult than that for a Homogeneous Poisson Process because of the infinite dimensionality of the arrival-rate parameter λ(t). Therefore, a common approach is to assume a parametric form for λ(t). For instance, Massey et al. [32] used a piecewise linear function to approximate the traffic of telecommunication systems. Figure 1 depicts the number of tasks completed over time (for a time range of 4 weeks) on Mechanical Turk. The figure shows that the variation of worker arrivals follows a process that approximately repeats every week. In this paper, we assume that the arrival-rate function λ(t) is periodic, and that the variations in the number of worker arrivals are all due to the randomness of the Poisson process. Given historical data, the arrival-rate function λ(t) can be estimated and used to predict arrival rates in the future. Faridani et al. [17] provide techniques for learning the λ(t) function, and demonstrate the accuracy of these techniques. In this paper, we leverage these techniques, and assume that λ(t) is known. As we will see in our experimental results in Section 5, our pricing strategies are not very sensitive to mistakes in the estimation of λ(t). Note that the NHPP models the arrival of workers to the entire marketplace, and does not capture whether those workers decide to work on our specific task.
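For intuition, such a non-homogeneous arrival process can be simulated by thinning a homogeneous Poisson process whose rate dominates λ(t); the periodic rate function below is a hypothetical illustration, not one fitted to marketplace data.

```python
import math
import random

def simulate_nhpp(lam, lam_max, t_start, t_end, rng):
    """Draw arrival times of a NHPP with rate lam(t) on [t_start, t_end)
    by thinning a homogeneous Poisson process of rate lam_max >= lam(t)."""
    arrivals, t = [], t_start
    while True:
        t += rng.expovariate(lam_max)          # candidate arrival
        if t >= t_end:
            return arrivals
        if rng.random() < lam(t) / lam_max:    # accept w.p. lam(t)/lam_max
            arrivals.append(t)

# A periodic rate function with one "week" normalized to 1.0 time units.
def lam(t):
    return 10.0 + 5.0 * math.sin(2.0 * math.pi * t)

rng = random.Random(0)
counts = [len(simulate_nhpp(lam, 15.0, 0.0, 1.0, rng)) for _ in range(2000)]
# Over one full period, the mean count approaches the integral of lam, i.e. 10.
print(sum(counts) / len(counts))
```

Averaged over many simulated periods, the arrival counts concentrate around the integral of λ(t), matching the Poisson distribution of Equation (1).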
An independent Bernoulli process can be used to model whether each worker (who arrives at the marketplace) will decide to work on our task. In other words, we assume that each arriving worker has an independent probability p of picking our task. Therefore, if in any period of time the number of workers that arrived at the marketplace is X, then, assuming there are adequate tasks on offer, the number of those workers who choose to work on our tasks will follow a Binomial distribution Bin(X, p). The value of p, the task acceptance probability, is not directly observable. We describe how it is related to the price or reward for the task, and how it can be estimated, in Section 2.2. The task completion process is then a composition of a NHPP and an independent Bernoulli process. In the statistics literature, such a process is called a Thinned Non-Homogeneous Poisson Process [44]. A thinned NHPP is also a NHPP, with a modified arrival-rate function λ′(t) = λ(t)p. Faridani et al. [17] used a
Discrete Choice Model to characterize how workers select tasks from the marketplace. In economics, Discrete Choice Models are used to estimate the probability of consumers choosing a specific product among a range of alternatives [33]. Discrete Choice Models can be explained by utility theory: each worker chooses the task in the marketplace that maximizes the utility (or net benefit) obtained. Workers may have different perceptions of their utility: it could depend on various factors such as hourly wage, number of tasks, task type, easiness of the tasks, or the knowledge gained during the process of finishing a task. Utility cannot be directly observed; the only aspect that can be observed is the worker’s behavior in the marketplace. Under this model, the task acceptance probability parameter p is simply the probability that the utility of our task exceeds the utility of every other task in the marketplace. Let U_i be the utility of task i in the marketplace based on some worker’s perception, and without loss of generality assume the utility of our task is U_1. Then p = Pr(U_1 > max_{i≠1} U_i). In the Conditional Logit Model [33, 17], the utility U_i of the i-th task has the following expression: U_i = β^T z_i + ε_i, where z_i contains all observable attributes that may affect the utility of the task, and ε_i accounts for all unobserved factors that may affect the utility. In the model, the utility U_i is assumed to be linearly correlated with all observed attributes with the shared coefficient vector β. The parameters ε_i are assumed to be independent of each other and to follow the Gumbel distribution. Based on these assumptions, it can be derived that the probability of choosing each task follows a Multinomial Logit Distribution:

p = Pr(U_1 > max_{i≠1} U_i) = exp(β^T z_1) / Σ_i exp(β^T z_i)

Now, if we are able to change our task reward c, then the attribute vector of our task z_1 and the task acceptance probability p will also change accordingly:

p(c) = exp(β^T z_1(c)) / (exp(β^T z_1(c)) + Σ_{i≠1} exp(β^T z_i))    (2)

Equation (2) captures how the task acceptance probability is related to the task price or reward. Faridani et al. [17] suggest using this equation directly in order to calculate the task acceptance probability, with the parameters β estimated from historical marketplace data using logistic regression. Another approach is to assume a parametric form of the task acceptance probability function, and estimate parameters during a separate training phase. If we assume that the utility of our task is a linear function of the task reward c, and that the sum of the exponentials of the utilities of the other tasks is a fixed constant, then Equation (2) can be rewritten as:

p(c) = exp{cs − b} / (exp{cs − b} + M)    (3)

Hence, if we have some training data (e.g., estimated values of p(c) for different task rewards c), then the parameters s, b, M can be estimated by statistical regression methods. However, note that the inference of the mapping function p(c) is not the focus of our paper. Here, we will assume that the expression for p(c) is already known. We will then use this expression to determine the optimal reward for each task in various scenarios.

Our goal is to design pricing algorithms for a batch of N identical crowdsourcing tasks.
The user may specify either a monetary budget restriction (that is, the algorithm must ensure that all tasks are completed within a certain expected cost), or a time deadline (that is, the algorithm must ensure that all tasks are completed within a certain time). The unconstrained variable (monetary cost or overall time) is minimized. Following our discussion in the previous section, we model worker arrivals to the marketplace as a Non-Homogeneous Poisson Process with a known arrival-rate parameter λ(t). Each worker will pick up our task and complete it with probability p(c), where the value of p depends on the reward c (typically in cents or dollars) for each task in our batch of tasks. The form of the mapping function p(c) from task reward c to task acceptance probability p is assumed to be known: thus, we expect our techniques to be leveraged when the user ends up repeating similar tasks many times over a long period, so that such history is available. This is not a drastic assumption to make: many companies, including Google, Ebay, Yahoo!, and Microsoft, repeatedly use human workers for tasks such as content moderation, categorization, spam detection, and search relevance. At any time, we can monitor the number of remaining uncompleted tasks n. The task reward c can be changed at any time, and the task acceptance probability p will change accordingly. Note that some marketplaces may impose a minimum time only after which the task reward may be changed, and our algorithms adapt to that scenario as well. Overall, at any time t, the completion of tasks follows a NHPP with rate λ(t)p(c), and for each completed task, c units of monetary compensation are paid based on the task reward at that time. Then, the problem is to determine and dynamically vary the rewards for each as yet unsolved task, such that the total monetary cost expended and the total time used for completing the N tasks are minimized.
We focus on two scenarios:
• Fixed Deadline Pricing (Section 3): In this scenario, the total time used to complete all N tasks must be less than a deadline T. The goal is then to minimize the expected total expenditure.
• Fixed Budget Pricing (Section 4): In this scenario, the total monetary budget B for tasks is fixed upfront. The goal is then to minimize the expected total time to complete all tasks.
In Section 6, we describe a number of straightforward generalizations, including optimizing combinations of deadline and budget, capturing multiple task types, and incorporating accuracy and difficulty.
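To make the model concrete, the following sketch simulates one pricing episode under the logit acceptance function of Equation (3), discretizing time into intervals as the algorithms in Section 3 will do. All numeric parameters (s, b, M, the per-interval arrival means, and the price schedule) are illustrative assumptions, not fitted values.

```python
import math
import random

def acceptance_probability(c, s=0.5, b=2.0, M=50.0):
    """p(c) from Equation (3); s, b, M are placeholder parameters."""
    u = math.exp(c * s - b)
    return u / (u + M)

def sample_poisson(lam, rng):
    """Poisson draw by CDF inversion (adequate for moderate means)."""
    k, p, u = 0, math.exp(-lam), rng.random()
    acc = p
    while u > acc:
        k += 1
        p *= lam / k
        acc += p
    return k

def simulate_episode(n_tasks, interval_means, schedule, rng):
    """In interval t, completions ~ Pois(interval_means[t] * p(schedule[t])),
    capped at the remaining tasks; returns (tasks left, total reward paid)."""
    remaining, spent = n_tasks, 0.0
    for mean_workers, c in zip(interval_means, schedule):
        done = min(sample_poisson(mean_workers * acceptance_probability(c), rng),
                   remaining)
        remaining -= done
        spent += done * c
        if remaining == 0:
            break
    return remaining, spent
```

Running such episodes under different schedules gives a direct estimate of the cost/latency trade-off the two scenarios above optimize.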
3. FIXED DEADLINE PRICING STRATEGY
It is common for task requesters in a crowdsourcing marketplace to require their tasks to be completed before a certain deadline. Under this scenario, the reward for each task in the batch of tasks should be as low as possible while making sure that all tasks can be completed before the deadline. In Faridani et al.'s work [17], a binary search process is used to find the smallest fixed task reward such that the total expected completion time is before the deadline. However, as implied by the NHPP worker-arrival model and as also demonstrated in Figure 1, the task completion process is highly non-deterministic. Therefore, a dynamic pricing strategy should perform much better in terms of overall cost in this scenario: if the rate at which tasks are picked up by workers is faster than expected, we could decrease the reward for the remaining tasks to save money; on the other hand, if the tasks are picked up slower than expected, we could increase the reward to attract more workers to our tasks. In this section, we design a pricing algorithm to determine how to set the reward for each task at each time point to minimize the expected total monetary cost, while meeting time constraints. We begin by modeling our decision process as a Markov process and use the model to present our basic pricing algorithm in Section 3.1. Since these algorithms may be expensive to compute, we present techniques that can help speed up the computation in Section 3.2. Lastly, we consider different objectives in Section 3.3.
Discretization:
Although in principle we may be able to change the task reward c at any time, utilizing this freedom while designing pricing strategies would result in an intractable number of time points at which decisions need to be made. Instead, we discretize the total time before the deadline (i.e., the time between when the tasks were submitted to the marketplace and the deadline) into a number of equal-sized intervals. As we will see later on, beyond a point, discretization does not help, and therefore restricting our pricing algorithms to make decisions only at discrete time intervals does not affect the overall monetary cost, while significantly reducing the computation involved. We partition all available time [0, T] (from t = 0, i.e., the start time, to t = T, i.e., the deadline) into N_T small intervals: [0, T/N_T), [T/N_T, 2T/N_T), . . . , [T − T/N_T, T), and further enforce that the reward c for tasks may only be changed at the start of an interval.

State Space:
After discretization, we can represent the state of processing of the batch of tasks at any time interval using a finite Markov chain. The states in this Markov chain are represented by a pair (n, t), where n is the number of remaining unsolved tasks and t is the index of the current time interval. The initial state is (N, 0), and all states of the form (n, N_T) are final states (recall that N_T is the total number of time intervals). An illustration of the state diagram is shown in Figure 2. The states are represented on a grid, where the number of unsolved tasks increases along the y-axis, and the number of time intervals elapsed increases along the x-axis. Our goal is then to set the prices c_{n,t} up-front for all n, t, such that we have as few unsolved tasks as possible when t = N_T.

Transitions:
Based on Equation (1), X_t, the number of tasks completed during the t-th time interval, follows a Poisson distribution: X_t ∼ Pois(· | λ = λ_t p(c_t)), where c_t is the task reward in the t-th time interval, and λ_t is the total expected number of workers who arrive at the marketplace during the t-th time interval:

λ_t = ∫_{s=(t−1)T/N_T}^{tT/N_T} λ(s) ds    (4)

At state (n, t), say the task reward is set to be c_{n,t}; then, the transition probability between states is:

Pr{(n, t) → (n − s, t + 1) | c_{n,t}} = Pois(s | λ = λ_t p(c_{n,t}))    (5)
  = e^{−λ_t p(c_{n,t})} (λ_t p(c_{n,t}))^s / s!    (6)

where λ_t is defined in Equation (4) and p(c_{n,t}) is the task acceptance probability for the task reward c_{n,t}. The transition probability is slightly different when we are close to completion:

Pr{(n, t) → (0, t + 1) | c_{n,t}} = Pr(Pois(· | λ = λ_t p(c_{n,t})) ≥ n)

Figure 2: State diagram of the Markov Decision Process. Some possible transitions are omitted in the figure for clarity.
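These transition probabilities, including the aggregated tail mass for the jump to the final state (0, t + 1), can be computed directly; the sketch below takes λ_t and p(c_{n,t}) as precomputed numbers.

```python
import math

def transition_probs(n, lam_t, p_accept):
    """Distribution over next states (n - s, t + 1) per Equations (5)-(6):
    Poisson probabilities for s < n, with the tail mass Pr(X >= n) assigned
    to the state where all n tasks complete."""
    mean = lam_t * p_accept
    probs = [math.exp(-mean) * mean ** s / math.factorial(s) for s in range(n)]
    probs.append(max(0.0, 1.0 - sum(probs)))  # Pr{(n,t) -> (0,t+1)}
    return probs  # probs[s] = Pr{(n,t) -> (n-s,t+1)}
```

For example, `transition_probs(5, 10.0, 0.2)` returns six probabilities (for s = 0, …, 5) that sum to one.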
Costs:
In our problem, the transition cost between states is the total reward paid for tasks completed in each time interval:

cost{(n, t) → (n − s, t + 1) | c_{n,t}} = s · c_{n,t}    (7)

For the final states (n, N_T), we assign a fixed penalty for each of the remaining unsolved tasks: cost{(n, N_T)} = n × Penalty, where the value of the parameter Penalty could be based on the actual expenses needed to complete the remaining tasks post-deadline (possibly by the task requester themselves), or simply be set large enough to ensure that with high probability no task will remain uncompleted.
Markov Decision Processes:
The problem of determining the optimal task reward c_{n,t} in the t-th time interval for state (n, t) can be viewed as a Markov Decision Process (MDP). MDPs are commonly used to model optimization and decision-making problems in a discrete-time stochastic environment. The goal of MDP optimization is to determine a policy for every state that minimizes the expected overall cost (in our problem, this corresponds to determining the optimal task reward c_{n,t} for each state).

Dynamic Programming: The above MDP optimization problem can be solved by Dynamic Programming (DP). Let
Opt(n, t) denote the minimum expected total cost for all remaining n tasks from state (n, t), and Price(n, t) denote the corresponding optimal reward for each task. Then Opt(n, t) and Price(n, t) satisfy the following equations:

Opt(n, t) = min_c Σ_{s=0}^{n} [Opt(n − s, t + 1) + s·c] × Pr{(n, t) → (n − s, t + 1) | c}

Price(n, t) = argmin_c Σ_{s=0}^{n} [Opt(n − s, t + 1) + s·c] × Pr{(n, t) → (n − s, t + 1) | c}

The values of
Opt(n, t) and Price(n, t) can be determined sequentially. That is, we start at (·, N_T), and work our way backwards using the equations above. Once we have computed the optimal Opt and Price for all (·, t + 1), we can use the equations above to compute them for all (·, t) — the optimal c_{n,t} can be found by considering all possible price values, since it needs to be an integral multiple of a minimal unit of price (in Amazon Mechanical Turk, 1 cent). Algorithm 1 gives the pseudocode of this DP algorithm.

Algorithm 1
Simple Dynamic Programming

function FindOptimalPriceForState(n, t, L, U)
    Opt(n, t) ← ∞
    for c = L to U do
        Cost ← 0, Pr ← 0
        AcceptRate ← p(c)
        for i = 0 to n − 1 do
            p ← Pois(i | λ_t × AcceptRate)
            Cost ← Cost + p × (i·c + Opt(n − i, t + 1))
            Pr ← Pr + p
        end for
        Cost ← Cost + (1 − Pr) × n·c
        if Cost < Opt(n, t) then
            Opt(n, t) ← Cost
            Price(n, t) ← c
        end if
    end for
end function

function SimpleDP
    for i = 0 to N do
        Opt(i, N_T) ← i × Penalty
    end for
    for t = N_T − 1 down to 0 do
        for i = 0 to N do
            FindOptimalPriceForState(i, t, 0, C)
        end for
    end for
end function

The DP algorithm has a time complexity of O(N²·N_T·C), where C is the number of price choices we want to consider, which is intractable when N is large or when N_T or C are fine-grained. Here we discuss some techniques to speed up the algorithm.

Poisson Distribution Truncation:
Notice that while making pricing decisions, the DP algorithm enumerates all possible numbers of tasks s that can be picked up by workers during each time interval. However, for large s, the probability that s or more tasks are completed in one time interval,

Pr(Pois(· | λ) ≥ s) = Σ_{k≥s} e^{−λ} λ^k / k!  ≤  (e^{−λ} λ^s / s!) · s/(s − λ),

becomes negligible, and thus the contribution of those terms in the DP update formulas will also become negligible. In practice, we can set a threshold ε for the probability Pr(Pois(· | λ) ≥ s). If for some s₀, Pr(Pois(· | λ) ≥ s₀) is less than the threshold ε, all the terms with s > s₀ can be ignored safely. Table 1 shows the value of s₀ for ε = 10⁻⁹ and different values of λ.
Threshold ε    Poisson mean λ    s₀
10⁻⁹           10                35
10⁻⁹           20                53
10⁻⁹           50                99

Table 1: The value of s₀ for threshold ε = 10⁻⁹ and different Poisson distribution means.

The next theorem provides an upper bound on the error produced by Poisson Distribution Truncation:

THEOREM 1. The exact optimal total cost Opt(n, t), the estimated optimal total cost Est_trunc(n, t) computed using Poisson Distribution Truncation, and the exact total cost Cost_trunc(n, t) of the policy obtained using Poisson Distribution Truncation satisfy the following inequality:

Est_trunc(n, t) ≤ Opt(n, t) ≤ Cost_trunc(n, t) ≤ Est_trunc(n, t) + ε·n·(N_T − t)·C

where C is the upper bound of the task reward in any state. In particular,

|Opt(N, 0) − Cost_trunc(N, 0)| ≤ ε·N·N_T·C

PROOF. The former inequalities can be proved by induction in a straightforward manner. The last inequality, involving the state (N, 0), is a direct implication of the former inequalities.
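Combining Algorithm 1 with Poisson distribution truncation, a runnable sketch might look as follows. The arrival means lam_t, acceptance function p_of_c, candidate prices, and penalty are caller-supplied assumptions rather than values from the paper.

```python
import math

def truncation_point(lam, eps):
    """Smallest s0 with Pr(Pois(lam) >= s0) < eps, by accumulating the CDF."""
    pmf = math.exp(-lam)
    cdf, s = pmf, 0
    while 1.0 - cdf >= eps:
        s += 1
        pmf *= lam / s
        cdf += pmf
    return s + 1

def truncated_dp(N, NT, lam_t, p_of_c, prices, penalty, eps=1e-9):
    """Backward DP over states (n, t); inner sums stop at the truncation
    point s0, dropping terms whose total probability is below eps."""
    opt = [[0.0] * (NT + 1) for _ in range(N + 1)]
    price = [[None] * NT for _ in range(N + 1)]
    for n in range(N + 1):
        opt[n][NT] = n * penalty                 # terminal penalty
    for t in range(NT - 1, -1, -1):
        for n in range(1, N + 1):                # opt[0][t] stays 0
            best, best_c = math.inf, None
            for c in prices:
                mean = lam_t(t) * p_of_c(c)
                limit = min(n, truncation_point(mean, eps))
                cost, mass, pmf = 0.0, 0.0, math.exp(-mean)
                for s in range(limit):           # s < n tasks completed
                    cost += pmf * (s * c + opt[n - s][t + 1])
                    mass += pmf
                    pmf *= mean / (s + 1)
                if limit == n:                   # genuine tail: all n complete
                    cost += (1.0 - mass) * n * c
                if cost < best:
                    best, best_c = cost, c
            opt[n][t] = best
            price[n][t] = best_c
    return opt, price
```

As a sanity check, `truncation_point(10.0, 1e-9)` returns 35, matching the first row of Table 1.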
Another speed-up technique relies on the following natural conjecture:

CONJECTURE 1. The optimal reward Price(n, t) for each task is non-decreasing with respect to n, for any fixed value of t.

Intuitively, this conjecture says that with a fixed deadline, the more remaining tasks we have, the higher the reward we should set for each task. Over repeated trials with many different values of λ, N, N_T, we generated optimal strategies (using the basic DP algorithm described in the previous section), and the optimal strategies never violated the preceding conjecture. If we assume this conjecture to be correct, then the following can be used to speed up the DP process. The main idea is to reduce the search range of the optimal reward c for each state: suppose Price(a, t) and Price(c, t) are already known; then for any a < b < c, Price(b, t) lies in the range [Price(a, t), Price(c, t)]. Figure 3 illustrates the idea of this algorithm. For time interval t, we first search for the optimal reward for state (N/2, t), then states (N/4, t) and (3N/4, t), then states (kN/8, t) for k = 1, 3, 5, 7. This process continues until the optimal reward for every state has been found. Thus, the optimal reward search process can be represented using a binary tree, where each node represents the optimal reward search range of a certain state, and the search range of the optimal reward is bounded by the optimal rewards already found at upper-level nodes. Further, the search ranges of the nodes at each level sum up to C, the pre-specified upper bound of the task reward, while the number of levels is bounded by O(log N). Therefore, the algorithm (Algorithm 2) has a time complexity of O(N_T·N·(N + C·log N)). Finally, although it does not improve the time complexity, the monotonicity of the task rewards Price(n, t) with respect to t for fixed n (i.e., when the number of remaining tasks is fixed, the reward increases as we get closer to the deadline) can also be used to improve algorithm efficiency by reducing the optimal reward search range.

Algorithm 2
Efficient Dynamic Programming

function FindOptimalPriceForTime(t, l, r, L, R)
    m ← ⌊(l + r)/2⌋
    FindOptimalPriceForState(m, t, L, R)
    pm ← Price(m, t)
    if l < m then
        FindOptimalPriceForTime(t, l, m − 1, L, pm)
    end if
    if m < r then
        FindOptimalPriceForTime(t, m + 1, r, pm, R)
    end if
end function

function ImprovedDP
    for i = 0 to N do
        Opt(i, N_T) ← i × Penalty
    end for
    for t = N_T − 1 down to 0 do
        FindOptimalPriceForTime(t, 0, N, 0, C)
    end for
end function

In our MDP formulation, the penalties for the final states, cost{(n, N_T)}, are proportional to the number of tasks left unsolved. Therefore, the MDP is optimizing a linear combination of the total reward paid for the tasks completed before the deadline and the number of tasks remaining after the deadline:

Q = E(transition cost) + E(remaining tasks) × Penalty
The parameter Penalty controls the trade-off between these two quantities: a higher value of Penalty results in a higher average reward for each task and fewer remaining tasks after the deadline on average. Sometimes, it may be more convenient to directly optimize the expected total expenditure on the crowdsourcing marketplace, with a constraint on the expected number of tasks remaining uncompleted after the deadline:

Minimize E(transition cost)   s.t.   E(remaining tasks) ≤ Bound
Theorem 2 shows that the two formulations are closely related.

THEOREM 2. For every value of the parameter Penalty, there exists a corresponding value of the parameter Bound such that the two formulations above result in the same optimal solution.

PROOF. For any fixed value of the Penalty parameter, assume the optimal solution for the original MDP formulation is Opt. Let Bound be the expected number of unsolved tasks in Opt. For any other solution Sol, if the expected number of unsolved tasks in Sol is less than or equal to Bound, then the expected transition cost of Sol must be no less than Opt's (otherwise the optimality of Opt is violated). Therefore, Opt is also optimal in the second formulation.

Figure 3: A graphical illustration of the efficient algorithm (Algorithm 2); states are represented as nodes in the tree, and the search range of each node can be bounded by the optimal price of nodes at lower depth.

Therefore, for any fixed value of the parameter Bound, we can perform binary search for a value of the parameter Penalty such that the solution to the former formulation is also a solution to the latter formulation. The original final-state penalty can be extended as follows:

cost{(n, N_T)} = (n + α) × Penalty  if n > 0,  and 0 if n = 0,

which enforces an extra penalty on the existence of remaining tasks. This formulation may be more suitable for cases where any remaining task would be problematic, but the number of remaining tasks does not really matter. Just as in the scenario above, there is a correspondence between this MDP and the following formulation:

Minimize E(transition cost)   s.t.   E(remaining tasks) + α × Pr(remaining tasks > 0) ≤ Bound

Thus, the extended penalty setting would not only bound the average number of unsolved tasks but also bound the probability that there exists at least one remaining task.
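The binary search over Penalty described above can be sketched as follows; here expected_unsolved stands in for a full MDP solve at a given Penalty and is only assumed to be non-increasing in it (both the function name and its toy form below are hypothetical).

```python
def calibrate_penalty(expected_unsolved, bound, lo=0.0, hi=1e6, iters=60):
    """Binary search for a Penalty whose optimal policy leaves at most
    `bound` tasks unsolved in expectation; assumes expected_unsolved is
    non-increasing in the penalty and that the constraint is feasible at hi."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if expected_unsolved(mid) > bound:
            lo = mid          # penalty too small: too many tasks left over
        else:
            hi = mid
    return hi
```

For a toy monotone curve expected_unsolved(p) = 100/(1 + p) and bound 1, the search converges to a penalty of about 99, the smallest value satisfying the constraint.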
4. FIXED BUDGET PRICING STRATEGY
In this section, we focus on another version of the pricing problem: given a total monetary budget for all tasks, our objective is to minimize the expected total time until all tasks are completed. Although, as in the previous section, we may still change the task reward dynamically, we will demonstrate that exercising this freedom does not help much in this scenario. In fact, we will prove that a static pricing strategy is nearly optimal.
We first define what we mean by a Static Pricing Strategy:

Definition 1. A static pricing strategy assigns a reward to each of the N tasks up-front (i.e., at the time the tasks are submitted to the marketplace), and then does not change these prices subsequently. Note that the rewards need not be the same for all tasks.

Even though, for a static pricing strategy, tasks are submitted to the marketplace at the beginning with possibly different rewards, at any time only the tasks with the highest reward will be picked up by workers. Thus, the rate at which tasks are picked up by workers depends solely on the highest reward among all remaining tasks (this property follows from the utility theory of Section 2.2). Later on, when the tasks with the highest reward are exhausted, workers start to pick up tasks with the next lower reward; as a result, the task acceptance rate drops accordingly.

Note that static pricing strategies are a strict restriction of general dynamic pricing strategies. To see this, observe that for every static pricing strategy, there is an equivalent dynamic pricing strategy that changes the reward of all tasks right after each task is completed. Therefore, the optimal static pricing strategy cannot have a lower total latency than the optimal dynamic pricing strategy. However, we will show that, in fact, the former can have as low an expected total latency as the latter.
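The highest-reward-first pickup dynamic described above can be illustrated with a small simulation; the price-to-acceptance mapping p(c) = c/(c + 10) used here is an arbitrary stand-in, not the model fitted later in the paper.

```python
import random

def simulate_static(rewards, p_of, rng):
    """Workers always consider a highest-reward remaining task; each
    arriving worker accepts it with probability p_of(that reward).
    Returns the reward active at each task completion, in order."""
    remaining = sorted(rewards, reverse=True)  # highest rewards first
    history = []
    while remaining:
        top = remaining[0]
        if rng.random() < p_of(top):
            history.append(remaining.pop(0))  # task picked up
    return history

rng = random.Random(3)
picked = simulate_static([10] * 5 + [5] * 5, lambda c: c / (c + 10.0), rng)
# The pickup order is monotonically non-increasing in reward: the 10-cent
# tasks are exhausted first, and the acceptance rate then drops.
```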
We now show that the optimal static pricing strategy has the minimum expected total latency for completing a given batch of tasks among all possible pricing strategies. Our main result is Theorem 3, described in Section 4.2.1. Subsequent sections focus on the proof and describe the algorithms.
Recall from Section 2 that workers arrive at the marketplace following a NHPP, and decide whether to work on our task following an independent Bernoulli process. Let T be the random variable denoting the total time elapsed before all tasks are completed, and W be the random variable denoting the total number of workers that have arrived at the marketplace before all the tasks are completed. Based on our model, the distribution of T conditioned on W depends only on the arrival-rate parameter λ(t), and is independent of the pricing strategy. Suppose we use a pricing strategy S; then the expected value of T can be expressed as:

E[T | S] = ∫_W E[T | W] Pr(W | S) dW

Therefore, our goal is to choose the optimal pricing strategy such that its induced distribution Pr(W | S) minimizes E[T | S]. Now, if E[T | W] is linear in W, say E[T | W] = kW, then we have:

E[T | S] = ∫_W kW Pr(W | S) dW = k E[W | S]

which means that minimizing E[T | S] is equivalent to minimizing E[W | S]. Minimizing the latter quantity is much more straightforward, as we show in the next few sections. The justification of this linearity assumption is given in Section 4.2.2.

The next theorem states that a static pricing strategy is optimal in terms of minimizing the expected number of worker arrivals E[W | S], and therefore the expected latency E[T | S]. We prove the theorem in the next section.

THEOREM 3. There exists a static pricing strategy S that minimizes the expected number of total worker arrivals E[W | S], and therefore minimizes the expected total latency E[T | S], among all possible pricing strategies.

We next justify the linearity assumption that E[T | W] = kW.
First, notice that T has the following conditional distribution function given W:

F_{T|W}(t) = Pr(T ≤ t | W) = Pr(N(t) ≥ W)

where N(t) is the random variable denoting the number of workers who arrive at the marketplace between time 0 and time t. Based on the NHPP model, N(t) follows a Poisson distribution:

N(t) ∼ Pois( · | λ = ∫_0^t λ(s) ds )   (8)

As shown in Figure 1, λ(t) varies periodically and is relatively stable over a long period. Thus ∫_0^T λ(t) dt is approximately proportional to T:

Λ(T) = ∫_0^T λ(t) dt ≈ λ̄T

where λ̄ is the average worker-arrival rate in the marketplace. Substituting into Equation (8), we have:

Pr(N(t) ≥ W) ≈ 1 − Σ_{k=0}^{W−1} Pois(k | λ = λ̄t) = 1 − e^{−λ̄t} Σ_{k=0}^{W−1} (λ̄t)^k / k!

Therefore, using the fact that for any non-negative random variable X, E(X) = ∫_0^∞ Pr(X > t) dt,

E(T | W) = ∫_0^∞ (1 − F_{T|W}(t)) dt ≈ ∫_0^∞ e^{−λ̄t} Σ_{k=0}^{W−1} (λ̄t)^k / k! dt = Σ_{k=0}^{W−1} ∫_0^∞ e^{−λ̄t} (λ̄t)^k / k! dt = W / λ̄

which justifies the linearity assumption (with k = 1/λ̄).
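As a numerical sanity check on this derivation, we can simulate a homogeneous Poisson process with rate λ̄ and verify that the time until the W-th arrival has mean close to W/λ̄. The rate, W, and trial count below are made-up illustrative values.

```python
import random

def time_to_wth_arrival(lam, w, rng):
    """Time until the w-th arrival of a rate-`lam` Poisson process:
    the sum of w independent Exponential(lam) inter-arrival times."""
    return sum(rng.expovariate(lam) for _ in range(w))

rng = random.Random(42)
lam, w, trials = 3.0, 12, 20000
mean_t = sum(time_to_wth_arrival(lam, w, rng) for _ in range(trials)) / trials
# The derivation predicts E[T | W = w] = w / lam = 4.0 here.
```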
Semi-Static Pricing Strategies serve as a bridge connecting static pricing strategies and dynamic pricing strategies in the proof of Theorem 3.

Definition 2. A Semi-Static Pricing Strategy generates a sequence of prices c_1, c_2, ..., c_N at the time the tasks are posted to the marketplace. The strategy starts off by assigning c_1 to all tasks; once one task is picked up by a worker, the price of all remaining tasks changes to c_2, and so on, until all the tasks are picked up by workers and completed. Unlike in a static pricing strategy, the sequence of c_i's need not be monotonically decreasing.

We next show that the best dynamic pricing strategy is only as good (i.e., has as low an expected completion time, or latency) as the best semi-static pricing strategy.

THEOREM 4. The optimal dynamic pricing strategy that minimizes the expected number of worker arrivals E[W] is in the form of a semi-static pricing strategy.

Intuitively, the proof uses decision theory to demonstrate that, for a dynamic strategy, only the decisions made when a task gets completed matter; otherwise, the state of the Markov process stays the same, and the price need not be changed.

PROOF. The optimal dynamic pricing strategy that minimizes the expected number of worker arrivals E[W] can be obtained by solving the corresponding Markov Decision Process. Since we count worker arrivals as cost, each MDP state can be represented by a tuple (n, b) denoting the number of remaining tasks and the total budget left, with transitions:

Pr{(n, b) → (n − 1, b − c)} = p(c)
Pr{(n, b) → (n, b)} = 1 − p(c)

and corresponding costs:

Cost{(n, b) → (n − 1, b − c)} = 1
Cost{(n, b) → (n, b)} = 1

where the two state transitions above indicate whether or not the next arriving worker accepts the task at reward c.

A special property of this MDP is that each state has only one outgoing transition edge that changes the state, corresponding to the event that some worker accepts one of our tasks and completes it. Therefore, the MDP formulation implies that, under the optimal pricing strategy, the task reward remains unchanged until some task is completed (since otherwise the state stays the same). In other words, the optimal pricing strategy is in the form of a semi-static pricing strategy.

The next theorem states that the effectiveness of a semi-static pricing strategy is not affected by the order of the c_i.

THEOREM 5. For any semi-static pricing strategy S with price sequence c_1, c_2, ..., c_N, the expected number of worker arrivals E[W] equals Σ_{i=1}^N 1/p(c_i).

PROOF.
Let w_i denote the number of worker arrivals between the completion times of the (i − 1)-th and the i-th task (w_1 denotes the number of worker arrivals before the completion of the 1st task). Based on the model assumptions in Section 2, w_i follows a geometric distribution:

Pr[w_i = k] = (1 − p(c_i))^k p(c_i)

where p(c_i) is the task acceptance probability at reward c_i. The total number of worker arrivals W can be expressed as the sum of the w_i, plus the N workers who actually picked up tasks:

W = Σ_{i=1}^N w_i + N

Taking expectations on both sides, we get:

E[W] = Σ_{i=1}^N E[w_i] + N = Σ_{i=1}^N (1 − p(c_i))/p(c_i) + N = Σ_{i=1}^N 1/p(c_i)

which finishes the proof.

Thus, by reordering the prices of a semi-static strategy (to ensure a descending order), we can convert it into a static strategy with the same expected total completion time, or latency. This result, together with Theorem 4, demonstrates that static pricing strategies are near-optimal.

We now address the problem of finding the optimal static pricing strategy. Suppose that in the optimal static pricing strategy, the rewards for tasks are c_1, c_2, ..., c_N. Using Theorem 5, we know that the expected total number of worker arrivals E[W] equals the sum of the 1/p(c_i) (since any static pricing strategy is also a semi-static pricing strategy whose reward sequence is monotonically non-increasing):

E[W] = Σ_{i=1}^N 1/p(c_i)   (9)

Let n_c be the number of tasks with reward c, i.e., n_c = |{i : c_i = c}|.
Then, Equation (9) can be rewritten as E[W] = Σ_c n_c / p(c). The n_c values satisfy the following constraints:

Σ_c n_c = N;   Σ_c n_c × c ≤ B;   n_c ≥ 0, n_c ∈ N   (10)

where the first constraint concerns the total number of tasks, and the second concerns the total monetary budget (B denotes the total budget for all tasks).

Figure 4: Illustration of Theorem 7, which implies that c_1 and c_2 can only lie on the convex hull.

Our objective is to find values of n_c that minimize E[W] while simultaneously satisfying Constraints (10). For arbitrary functions p(c), it is easy to show that this optimization problem is NP-HARD. Nevertheless, the optimal static pricing strategy can be generated by a dynamic-programming-based pseudo-polynomial time algorithm:

THEOREM 6. The c_i for the optimal static pricing strategy can be discovered in PTIME(B, N).

In short, the idea is to consider all optimal allocations of up to B to the first i tasks, for all i ∈ 1 ... N.

Our approach will instead be to approximately solve the optimization problem. We begin by casting the problem as an Integer Program (IP). Then, we relax the IP to a Linear Program (LP) in which the variables no longer have to be integers, i.e., the n_c ∈ N constraints are excluded. Lastly, we round up the variables in the LP solution to make them integers. The relaxed LP version of the problem is as follows:

Minimize Σ_c n_c / p(c)   s.t.   Σ_c n_c = N;   Σ_c n_c × c ≤ B;   n_c ≥ 0

Instead of applying an LP solver and then performing rounding, we next describe an even faster approach that leverages a special property of the LP above:

THEOREM 7. There exists an optimal solution to the LP above which satisfies the following:
• ∃ c_1 < c_2 such that n_c = 0 for all c ∉ {c_1, c_2};
• ∀ c = t·c_1 + (1 − t)·c_2, t ∈ R: 1/p(c) ≥ t/p(c_1) + (1 − t)/p(c_2).

Theorem 7 can be intuitively explained using Figure 4. We first plot all the pairs (c, 1/p(c)) in the plane. The first property of Theorem 7 states that there are at most two prices c_1, c_2 with non-zero n_c; i.e., tasks are set at no more than two distinct prices. The second property states that no point (c, 1/p(c)) lies below the straight line connecting (c_1, 1/p(c_1)) and (c_2, 1/p(c_2)). In other words, (c_1, 1/p(c_1)) and (c_2, 1/p(c_2)) can only be endpoints of a segment on the lower convex hull of the points (c_i, 1/p(c_i)).

The key idea is to show that any optimal solution can be transformed to satisfy the first property while maintaining its optimality. The second property can then be derived from the first property and the Karush-Kuhn-Tucker conditions [12] of the LP.

PROOF OF THEOREM 7. Suppose n* is an optimal solution to the above LP. Let c_1 = min{c : n*_c > 0} and c_2 = max{c : n*_c > 0} be the smallest and largest prices with non-zero components of n*, respectively. We show that the following solution is also optimal:

n'_{c_1} = Σ_{c_1 ≤ c ≤ c_2} n*_c (c_2 − c)/(c_2 − c_1)
n'_{c_2} = Σ_{c_1 ≤ c ≤ c_2} n*_c (c − c_1)/(c_2 − c_1)
n'_c = 0 for all c ∉ {c_1, c_2}

(Note that n' is feasible: for each c the two weights sum to 1, so Σ_c n'_c = N, and c_1 n'_{c_1} + c_2 n'_{c_2} = Σ_c n*_c × c ≤ B.) To prove the claim, we need to show that:

Σ_c n*_c (c_2 − c)/(c_2 − c_1) × 1/p(c_1) + Σ_c n*_c (c − c_1)/(c_2 − c_1) × 1/p(c_2) ≤ Σ_c n*_c / p(c)

It suffices to prove that:

∀c: 1/p(c) ≥ (c_2 − c) / ((c_2 − c_1) p(c_1)) + (c − c_1) / ((c_2 − c_1) p(c_2))   (11)

To prove Equation (11), we examine the Karush-Kuhn-Tucker conditions [12] of this LP:

∀c: 1/p(c) = µ_c + λ_N − c µ_B,   µ_c ≥ 0,   µ_c n*_c = 0

where the µ_c, λ_N, µ_B are KKT multipliers. Since µ_c ≥ 0, it follows that

∀c: 1/p(c) ≥ λ_N − c µ_B   (12)

with equality at c = c_1 and c = c_2 (since n*_{c_1}, n*_{c_2} > 0 imply µ_{c_1} = µ_{c_2} = 0). Substituting Equation (12) into Equation (11) completes the proof of the first part; the second claim is a direct implication of Equation (12), with equality holding at c = c_1, c_2.

Using Theorem 7, we can derive an algorithm (Algorithm 3) that finds a nearly optimal static pricing strategy. The algorithm generates the convex hull over all possible prices, and then picks the two most suitable prices to assign to tasks.

Algorithm 3 Find Optimal Static Pricing Strategy
function FINDOPTIMALSTATICSTRATEGY
    for c = 0 to C do
        Calculate the task acceptance probability p(c)
    end for
    CH ← lower convex hull of the points (c, 1/p(c))
    c_1 ← max{c ∈ CH : c ≤ B/N}
    c_2 ← min{c ∈ CH : c > B/N}
    n_1 ← ⌈(c_2 N − B)/(c_2 − c_1)⌉;  n_2 ← N − n_1
    return n_1 tasks priced at reward c_1; n_2 tasks priced at reward c_2
end function

Theorem 8 provides an upper bound on the difference between the rounded-LP solution (i.e., the solution produced by Algorithm 3) and the optimal solution of the original IP.

THEOREM 8. Let {n*} denote the optimal solution that minimizes E[W] under Constraints (10), and let {n̂} denote the rounded-LP solution from Algorithm 3. Then the expected total latency difference between the two solutions is bounded by:

Σ_c n̂_c / p(c) ≤ Σ_c n*_c / p(c) + (1/p(c_1) − 1/p(c_2))

PROOF. Since the relaxation only removed the integrality restriction (without adding any constraints), {n*} is also a valid solution to the relaxed LP. Therefore, the optimal LP solution n^L achieves an objective value no greater than that of {n*}:

Σ_c n^L_c / p(c) ≤ Σ_c n*_c / p(c)

Since {n̂} is the rounded version of n^L, together with the special form of n^L implied by Theorem 7, we get:

Σ_c n̂_c / p(c) ≤ Σ_c n^L_c / p(c) + (1/p(c_1) − 1/p(c_2))

Combining the two results completes the proof.
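The two approaches of this section can be sketched and cross-checked in a few lines of Python. Both the acceptance curve p(c) = c/(c + 10) and the instance sizes below are illustrative assumptions (the paper estimates p(c) from marketplace data); the sketch also assumes integer prices and B/N strictly below the largest price.

```python
import math

def optimal_static_dp(N, B, prices, p):
    """Theorem 6 style pseudo-polynomial DP:
    f[i][b] = min expected worker arrivals to finish i tasks with budget b,
    via f[i][b] = min over price c of f[i-1][b-c] + 1/p(c)."""
    INF = math.inf
    f = [[INF] * (B + 1) for _ in range(N + 1)]
    f[0] = [0.0] * (B + 1)
    for i in range(1, N + 1):
        for b in range(B + 1):
            for c in prices:
                if c <= b and f[i - 1][b - c] + 1.0 / p(c) < f[i][b]:
                    f[i][b] = f[i - 1][b - c] + 1.0 / p(c)
    return f[N][B]

def lower_hull(points):
    """Lower convex hull (monotone chain) of points sorted by x."""
    hull = []
    for x, y in sorted(points):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # drop hull[-1] if it lies on or above the chord hull[-2] -> (x, y)
            if (x2 - x1) * (y - y1) <= (y2 - y1) * (x - x1):
                hull.pop()
            else:
                break
        hull.append((x, y))
    return hull

def find_optimal_static_strategy(N, B, prices, p):
    """Algorithm 3: pick the two hull prices around B/N and split the tasks."""
    hull_prices = [c for c, _ in lower_hull([(c, 1.0 / p(c)) for c in prices])]
    c1 = max(c for c in hull_prices if c <= B / N)
    c2 = min(c for c in hull_prices if c > B / N)
    n1 = math.ceil((c2 * N - B) / (c2 - c1))
    return c1, c2, n1, N - n1

p = lambda c: c / (c + 10.0)  # assumed acceptance curve
c1, c2, n1, n2 = find_optimal_static_strategy(10, 35, range(1, 11), p)
ew_two_price = n1 / p(c1) + n2 / p(c2)        # E[W] of the two-price strategy
ew_exact = optimal_static_dp(10, 35, range(1, 11), p)
```

On this instance the rounded two-price solution (five tasks at price 3 and five at price 4, spending the full budget of 35) coincides with the exact DP optimum, consistent with the bound of Theorem 8.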
5. EXPERIMENTS
The goals of our experimental evaluation are twofold: (a) to validate the pricing model assumptions made in the previous sections, and (b) to compare our techniques against others, both in simulations based on real crowdsourcing marketplace data and in real experiments deployed on a crowdsourcing marketplace. In Section 5.1, we examine the validity of the task acceptance probability equation (Equation (2)) and estimate typical task acceptance probability values for real tasks. In Section 5.2, we examine the effectiveness (in terms of total monetary cost) of our techniques for the fixed deadline problem from Section 3, compared to other schemes, under simulations based on real workloads from Amazon's Mechanical Turk marketplace via the mturk-tracker website [4]. We also study the sensitivity of our techniques with respect to (a) the algorithm parameters, (b) the estimation error of the arrival rate, and (c) the task acceptance probability mapping function, since many of these parameters may only be estimated approximately. In Section 5.4, we deploy our pricing technique for the fixed deadline problem from Section 3 on Mechanical Turk and report its effectiveness in practice. (In Section 5.4.3, we present some analysis of the data collected as a result.) In Section 5.3, we examine the completion times of our techniques for the fixed budget problem from Section 4 under simulations based on real workloads.
In Section 2.2, we used Equation (2) to map task rewards to task acceptance probabilities. In this section, we experimentally validate Equation (2) using utility theory (Section 5.1.1) and estimate the parameters of Equation (3) for real tasks (Section 5.1.2).
As described in Section 2.2, workers choose tasks to work on by maximizing their gain in utility. In this section, we simulate a specific worker's choice based on utility theory to justify the form of Equation (2). The experimental settings are as follows:
• The total number of tasks on the marketplace is set to a fixed value.
• The worker's utility estimate U_i for each competing task T_i follows a normal distribution N(µ_i, σ_i), where the µ_i are sampled independently from a zero-mean normal distribution, and the σ_i are sampled independently from a uniform distribution.
• The worker's utility estimate U_0 for our target task T_0 follows a normal distribution N(µ_0, σ_0), where the mean µ_0 equals the task reward c of our task minus a constant, and σ_0 is sampled from the same uniform distribution.

For a given c (i.e., the reward for our task), we repeatedly sample the utility estimates for all tasks as described above, and assume that the worker chooses our task if and only if it has the highest utility among all the tasks in the marketplace. This sampling process gives an estimate of the task acceptance probability p for a fixed reward c. We then repeat this process for different values of c, and plot the simulated acceptance probability p against c in Figure 5. In the figure, we also depict the corresponding regression curve based on Equation (2) for comparison (the value of β is learned by fitting the simulated task acceptance probabilities). As can be seen in Figure 5, the simulated acceptance probability p is well predicted by Equation (2). This justifies the model assumption that p is proportional to the exponential of the task utility.

Figure 5: Simulated task acceptance probability p over a range of rewards c. Blue dots are simulation results, and the red curve is the regression function based on Equation (2) with z_i = µ_i and fitted β.

In Section 2.2, we use Equation (3) as the parametric form of the task acceptance probability function.
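The utility-based choice simulation above can be sketched as follows. Since the concrete constants did not survive in the text, all numeric settings here (20 competing tasks, unit-variance means, the σ range, the unit offset in the target's mean) are illustrative assumptions.

```python
import random

def acceptance_prob(c, mus, sigmas, trials=10000, seed=0):
    """Fraction of simulated worker arrivals in which the target task
    (utility ~ N(c - 1, 0.25)) beats every competing task's utility."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        competitors = [rng.gauss(m, s) for m, s in zip(mus, sigmas)]
        target = rng.gauss(c - 1.0, 0.25)
        wins += target > max(competitors)
    return wins / trials

rng = random.Random(1)
mus = [rng.gauss(0, 1) for _ in range(20)]        # competitor mean utilities
sigmas = [rng.uniform(0, 0.5) for _ in range(20)]
p_low, p_high = acceptance_prob(0.0, mus, sigmas), acceptance_prob(5.0, mus, sigmas)
# Sweeping c and plotting acceptance_prob(c, ...) traces out a curve of
# the kind fit by Equation (2) in Figure 5.
```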
In this section, we aim to estimate typical values of the parameters s, b, M in Equation (3) for tasks on a real marketplace. We retrieved snapshots of Amazon Mechanical Turk [3] from mturk-tracker [4]. The snapshots of the marketplace are taken every 20 minutes; we estimate the number of tasks completed in each 20-minute window by subtracting the numbers of remaining tasks in each task group (note that in Mechanical Turk a task is called a HIT, and a group or batch of tasks is called a HIT group). If the number of remaining tasks increased during a 20-minute window (i.e., the task requester added new tasks to the task group), we simply assume no tasks were completed during that window.

We sampled task groups that had at least a threshold number of tasks completed (enforced to filter out spam tasks) from 1/1/2014 to 1/28/2014, and for each task group we manually estimated the approximate average time needed to complete one task. Figure 6 shows, for the two most popular task types, the wage per second versus the average completed workload per hour, defined as:

workload per hour = (average number of tasks completed per hour) × (average time usage of each task)

We use these two quantities as axes because we want our figure to be invariant under task bundling. (In Mechanical Turk, requesters often group several tasks into one larger task.)

Figure 6: Tasks in Amazon Mechanical Turk, (a) Categorization and (b) Data Collection. The x axis represents the wage per second ($/sec); the y axis represents the average completed workload per hour (sec/h).
In order to estimate the values of the parameters s, b, M in Equation (3), we assume that the utility of each task equals the logarithm of its workload per hour, as implied by Equation (2) if the sum of the exponentials of the utilities of all tasks is a constant. We further assume that the utility of each task is linearly correlated with the wage per second attribute:

log(workload/hour) = utility = α × wage/sec + b + ε

where b is a task-type bias term and ε accounts for all other factors affecting utility. We then apply Least Squares Regression to estimate the linear coefficient α and the bias term b for each task type.

Table 2 shows the results of the Least Squares Regression. The two linear coefficients are approximately the same, suggesting that the linear coefficient of the wage per second attribute is common across task types. The bias term of Data Collection tasks is significantly higher than that of Categorization tasks, implying that workers on Mechanical Turk prefer Data Collection tasks to Categorization tasks.

                 Linear coefficient   Bias
Categorization                  748   3.66
Data Collection                 809   6.28
Table 2: Linear coefficients and bias terms generated using Least Squares Regression
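The per-type fit can be sketched with ordinary least squares. The (wage/sec, log workload/hour) points below are synthetic, generated from an assumed line near the Data Collection row of Table 2, not the mturk-tracker data.

```python
def least_squares(xs, ys):
    """Ordinary least squares fit of ys ~= alpha * xs + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    alpha = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return alpha, my - alpha * mx

# Synthetic points lying exactly on log(workload/hour) = 800 * wage/sec + 6.3.
xs = [0.001, 0.002, 0.004, 0.006, 0.008]
ys = [800 * x + 6.3 for x in xs]
alpha, b = least_squares(xs, ys)
```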
Using the results in Table 2, we can then estimate the parameters s, b, M in Equation (3). Say our task is a Data Collection task with a given average completion time; then, based on Table 2 (with the task reward c in cents), we have:

workload per hr. = exp{α × wage/sec + b} = total × p(c) × (time usage of each task)

where total denotes the total number of tasks completed per hour in the crowdsourcing marketplace (including all other tasks). In Mechanical Turk, the mturk-tracker data gives total ≈ 2000. Using this fact, we can derive an expression for p(c) of the following form:

p(c) ≈ exp{α'c + b'} / (exp{α'c + b'} + 2000)   (13)

where the constants α' and b' follow from Table 2 and the task's average completion time.

In this section, we examine the effectiveness of the dynamic pricing strategy from Section 3. We compare our dynamic pricing strategy against the binary-search-based fixed pricing strategy from Faridani's work [17]. We first compare the two pricing strategies under a realistic crowdsourcing workload in Section 5.2.1. We then study the trend of the relative cost reduction of our dynamic pricing strategy over Faridani's fixed pricing strategy under different problem settings in Section 5.2.2. We further examine the sensitivity of the dynamic pricing strategy to the granularity of time discretization in Section 5.2.3. The sensitivity of both pricing strategies to the estimates of the task acceptance probability function and of future arrival rates is examined in Sections 5.2.4 and 5.2.5, respectively.

In the following experiments, we assume these default settings unless explicitly stated otherwise:
• The total number of tasks N = 200.
• The total time before the deadline T = 24 hours.
• We retrieved the number of tasks completed during every 20-minute interval for the period from 1/1/2014 to 1/28/2014 from mturk-tracker, as described in Section 5.1.2. The worker arrival rate λ(t) is set to be piecewise constant on each such 20-minute interval, i.e., for each interval, λ(t) is set to match the retrieved arrival data.
• Our target task is assumed to be a Data Collection task with a fixed average completion time. The mapping between the task reward c and the task acceptance probability p is derived as in Equation (13) of Section 5.1.2.
• The dynamic pricing model is trained using a fixed time-interval length.

In this section, we examine the effectiveness of our dynamic pricing strategy under a realistic workload derived from mturk-tracker. We compare our dynamic pricing strategy with fixed pricing strategies, which assign a fixed reward to all tasks in advance (determined using binary search) and do not change the reward afterwards. Figure 7(a) shows the results of our experiment: for various values of the average reward (y axis), we plot the expected number of tasks that remain unsolved at the deadline (x axis); the total reward can be estimated by multiplying the average reward by the number of tasks. The
Penalty parameter (see Section 3.1) is set in our dynamic pricing strategy such that the expected number of remaining tasks matches that of the fixed pricing strategy.
Figure 7: (a) Simulated average task reward c of our dynamic pricing strategy for different thresholds on the expected number of remaining tasks after the deadline. (b) Percentage cost reduction under various settings of N and T.

From Figure 7(a), we see that with a low expected number of remaining tasks after the deadline, the dynamic pricing strategy achieves a low average task reward. In fact, we can show that this average reward is very close to the theoretical lower bound c_0 on the average task reward of any pricing strategy, which satisfies the following equation:

p(c_0) = N / ∫_0^T λ(t) dt

The task reward c_0 has the following intuitive meaning. Suppose that we had an infinite number of tasks that could be picked up by workers, and let X denote the number of tasks completed before the deadline. Then c_0 is the minimum task reward such that E[X] ≥ N. In practice, however, we want to complete all N tasks before the deadline, which corresponds to the stronger requirement Pr(X ≥ N) ≈ 1. Note that any pricing strategy satisfying the second constraint automatically satisfies the first. Therefore, in order to achieve a high-probability guarantee, the average task reward will necessarily be higher than c_0 (since c_0 is the minimum possible task reward satisfying E[X] ≥ N).

Figure 7(a) shows that our dynamic pricing strategy can finish all tasks by the deadline with very high probability (99.9%) and only 3% overhead (as compared to c_0). On the other hand, for the fixed-reward pricing strategy from [17], the task reward needs to be set substantially higher to achieve the same guarantee, resulting in a 33% increase over our dynamic pricing strategy, a significant difference in cost.

In this section, we examine the relative gain of the dynamic pricing strategy over a fixed pricing strategy under various settings.
We compute the cost reduction achieved by using the dynamic pricing strategy instead of the fixed pricing strategy (as a percentage), and study how this reduction changes as the parameters are varied. The experimental settings are as follows:
• We study the relative effectiveness of the dynamic pricing strategy with respect to five parameters: N, T, and the three parameters s, b, M of Equation (3).
• Each time, we vary only one parameter (N, T, s, b, M) while keeping the others fixed. The default values are: N = 200, T = 24 hours, s = 15, M = 2000, with b at its default value (same as before).
• We compute the total cost of all tasks under both pricing strategies. Let c_d and c_f be the total costs of the dynamic and fixed pricing strategies, respectively; the percentage cost reduction is defined as r = (c_f − c_d)/c_f. The percentage cost reduction r serves as a measure of the effectiveness of the dynamic pricing strategy relative to the fixed one.
• For both the dynamic and the fixed pricing strategy, the task reward is chosen such that all tasks are finished by the deadline with 99.9% confidence. This is the default setting for the following experiments.

Figure 7(b) shows the percentage cost reduction under various settings of N and T. The experiment shows that the percentage cost reduction decreases as N increases, and increases as T increases. Therefore, if we have fewer tasks and a longer time before the deadline, the gain of the dynamic pricing strategy is higher. On the other hand, the gain of the dynamic pricing strategy is lower if we want to complete more tasks in a shorter period of time.
The intuitive explanation for this behavior is that with a longer period of time, we have the ability to plan ahead and vary the price over time to obtain additional monetary cost savings.

Figure 8(a)–(c) shows the trend of the percentage cost reduction as the parameters s, b, M are varied. The implications can be summarized as follows:
• The gain of the dynamic pricing strategy is stable regardless of how sensitive the task acceptance probability p is to the task reward c (Figure 8(a));
• The gain is lower if the task content is intrinsically more attractive than that of other tasks (Figure 8(b));
• The gain is higher if there are fewer tasks in the crowdsourcing marketplace than average (Figure 8(c)).

Figure 8: (a)–(c) Percentage cost reduction on varying s, b, M. (d) Average task price for different time granularities.

In this section, we examine the effects of different time-interval granularities. We train our dynamic pricing strategy using different time-interval lengths, ranging from minutes to hours, and examine the corresponding trade-off between the effectiveness of the pricing strategy and the training time.

Intuitively, the average task price should increase with the length of the time interval, since the strategy space is reduced; our experimental results in Figure 8(d) depict the expected behavior: the average task price increases steadily (but not by much) as the length of the time interval increases. On the other hand, the algorithm running time is rather stable and is not affected by the length of the time interval (it stays within a narrow range of seconds in all experiments, executing Python code on a laptop with an Intel i7 processor). The stability of the running time is probably due to the Poisson truncation technique of Section 3.2: the expected number of workers arriving at the marketplace during each time interval decreases as the interval length decreases, and the corresponding Poisson truncation threshold decreases accordingly. These results argue for using the smallest time interval for which we can reliably obtain λ(t) data.

Figure 9: Simulated average number of remaining tasks under inaccurate parameter estimation of p(c), for the dynamic and the fixed pricing strategy (left: (a) w.r.t. s, (c) w.r.t. b, (e) w.r.t. M), and the average task reward of the dynamic pricing strategy (right: (b) w.r.t. s, (d) w.r.t. b, (f) w.r.t. M).

Our dynamic pricing strategy (as well as Faridani's fixed pricing strategy [17]) requires an estimate of the task acceptance probability mapping function p(c) as input. However, these estimates may sometimes not be perfectly accurate. In this section, we examine the sensitivity of our pricing strategy to estimation accuracy.

We train our dynamic pricing strategy under an inaccurate estimate of p(c), and test it using the real p(c). The task acceptance probability function is as in Equation (13). In each experiment, we vary one parameter of the real p(c) to examine the robustness of our dynamic pricing strategy; the estimates of the other parameters are assumed to be accurate.

Figure 9 shows the average number of remaining tasks (left panels) and the average task reward (right panels) for our dynamic pricing strategy with respect to different values of the parameters s, b, M of the real p(c). We focus first on the left panels, indicating the average number of remaining tasks. The data for the fixed pricing strategy (for several values of the fixed price) is also included for comparison. Here, unlike the fixed-price strategies, which all leave non-zero remaining tasks, the dynamic pricing strategy curve is barely visible because its number of remaining tasks is very close to zero.
Thus, we see that our dynamic pricing strategy is much more robust to inaccurate parameter estimation than the fixed pricing strategy: it leaves 0 remaining tasks with very high probability, while the fixed pricing strategy completely fails to finish all the tasks on time.
The right figures (depicting only the dynamic a) Average remaining pricing strategy) show how the dynamic pricing stays robust: as theparameters are increased, even though the dynamic pricing strategyhas been learned on incorrect parameters, it automatically increasesthe task reward as necessary.
Our dynamic pricing strategy and the fixed pricing strategy [17] both require a prediction of the future worker arrival rate. There will inevitably be some discrepancy between the predicted and actual arrival rates because of intrinsic variation in the arrival rate. In this section, we examine the stability of our dynamic pricing strategy against such discrepancies.

We divide the historical arrival-rate data retrieved from mturk-tracker into two separate parts: one for training and the other for testing. We train our pricing strategy using the training arrival-rate data, but apply it on the test arrival-rate data. This way, we allow the algorithms to predict the general trend of the arrival rate; however, the algorithms will not be aware of the actual arrival rate. We test our pricing strategies on different days in 2014: 1/1, 1/8, 1/15, and 1/22. The training arrival rate is the average arrival rate of the other days.

Figures 10(a) and 10(b) show the average number of remaining tasks and the average task reward, respectively. As can be seen in the figures, both pricing strategies are relatively stable except on 1/1.

The surprising results for 1/1 can be explained by comparing Figures 10(c) and 10(d). Figure 10(d) shows the training and testing arrival rates for 1/22: the training data mostly agrees with the testing data, except for several random spikes in the testing data. The experiment results show that both pricing strategies are relatively stable to this kind of prediction error. On the other hand, Figure 10(c) depicts a consistent deviation between the training and testing data on 1/1, in which case both pricing strategies perform poorly. Such a consistent deviation is probably due to 1/1 being a special date (New Year's Day), so deviations of this kind should not occur very frequently in practice. Naturally, predicting the arrival rate on special days is hard, because it does not follow the normal weekday pattern.
As a result, adaptive prediction techniques, such as predicting the arrival rate over the next few hours based on the arrival rate over the last few hours, could be useful in such cases. We leave exploration of such adaptive schemes for future work.

In this section, we simulate the static pricing strategy of Section 4 and study the distribution of the completion time. The experiment settings are as follows:
• The total number of tasks is N = 200, and the total budget is B = 2500 cents.
• The mapping between task reward c and task acceptance probability p is the same as in Equation 13.
• Arrival rates are retrieved from mturk-tracker, as before.
Figure 11 shows the simulation result. The average completion time is … hours; however, any completion time between … and … hours is possible. Thus, the static pricing strategy does not try to guarantee any upper bound on the completion time, but rather aims to minimize the completion time in expectation.

Figure 11: The simulated distribution of completion time
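A simulation of this kind can be sketched as a simple Monte Carlo. The logistic acceptance model p(c), the arrival rate lam, the price, and all numeric parameters below are illustrative stand-ins (the paper uses the Equation 13 model and mturk-tracker arrival rates), so this is a sketch of the procedure, not the exact experiment:

```python
import math
import random

def p(c):
    """Assumed logistic acceptance model, standing in for Equation 13."""
    return 1.0 / (1.0 + math.exp(-(c - 10.0) / 3.0))

def poisson(lam):
    """Knuth's method: sample the number of worker arrivals in one interval."""
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while prod > threshold:
        k += 1
        prod *= random.random()
    return k - 1

def completion_time(n_tasks, price, lam=40.0, max_hours=10_000):
    """Hours until all tasks are picked up under a single static price."""
    remaining = n_tasks
    for hour in range(1, max_hours + 1):
        arrivals = poisson(lam)
        accepted = sum(1 for _ in range(arrivals) if random.random() < p(price))
        remaining -= min(accepted, remaining)
        if remaining == 0:
            return hour
    return max_hours

random.seed(0)
times = [completion_time(200, price=12.0) for _ in range(200)]
```

Plotting a histogram of `times` gives a completion-time distribution analogous to Figure 11; the spread around the mean is exactly what the static strategy does not bound.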
In this section, we conduct experiments on Mechanical Turk to examine the effectiveness of our dynamic pricing strategy from Section 3 in practice. In Section 5.4.1, we first deploy the fixed pricing strategy on Mechanical Turk to collect data about the worker arrival rate λ(t) and the task acceptance probability function p(c). In Section 5.4.2, we deploy our dynamic pricing strategy based on the collected data, and compare the experiment results to the fixed pricing strategies. In Section 5.4.3, we analyze the data collected from both experiments to provide some other interesting insights into workers' behavior. Common experiment settings are listed below:
• We use an entity resolution task dataset from Joglekar et al. [23]. Each task in the dataset consists of two photos (each with one athlete), and the worker is asked whether they contain the same person. In all experiments, we have 5,000 pairs of photos that we want workers to label.
• In all experiments, we post tasks on weekdays at 8 a.m. PST, with the deadline 14 hours after the start time (i.e., 10 p.m. PST).
• The worker qualifications are: workers must have at least a 90% approval rate in their history and live in the United States.
Note that in Mechanical Turk, HITs (i.e., the unit of work in Mechanical Turk) with different prices are grouped differently, even if they are issued by the same requester. Thus, workers looking for our specific HITs may not be able to tell how many there are in total. So, in our experiments, we considered two options to vary price: (a) per HIT, keep the base price and number of tasks fixed, and vary the amount of bonus provided to the worker, or (b) per HIT, keep the base price fixed, and vary the number of tasks. We decided to go with the second option. In our experiments, the price of each task group (i.e., each HIT in Mechanical Turk) is fixed at $0.02, and price differences are expressed via the number of tasks (i.e., the number of photo pairs to be labeled) in each HIT.
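Concretely, with the HIT price fixed at $0.02 and 5,000 tasks in total, the grouping size fully determines both the number of HITs posted and the implicit per-task price; a minimal sketch of this arithmetic (all values taken from the experiment setup above):

```python
import math

# Pricing arithmetic for the fixed-pricing trials:
# 5,000 tasks in total, each HIT priced at $0.02, with g tasks per HIT.
TOTAL_TASKS = 5000
HIT_PRICE = 0.02  # dollars per HIT

trials = {}
for g in (10, 20, 30, 40, 50):
    num_hits = math.ceil(TOTAL_TASKS / g)  # HITs posted to the marketplace
    per_task = HIT_PRICE / g               # implicit per-task price
    trials[g] = (num_hits, per_task)

for g, (num_hits, per_task) in trials.items():
    print(g, num_hits, per_task)
```

For example, a grouping size of 10 yields 500 HITs at an implicit $0.002 per task, matching the fixed-pricing trials described next.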
The fixed pricing experiment consists of five trials, where each HIT contains 10/20/30/40/50 tasks, respectively. Since the total number of tasks in each trial is fixed at 5,000, the actual number of HITs posted to the marketplace is 500/250/167/125/100, respectively. In other words, in the five trials, the price for each task is implicitly $0.002/0.001/0.00066/0.0005/0.0004, respectively. We stopped at 50 tasks per HIT to limit worker fatigue. Figure 12(a) shows the number of HITs completed over the whole time period. As can be seen from the figure, the HIT completion rate is positively correlated with the price of each task in general: for instance, at an elapsed time of 6 hours, the trial with 10 tasks per HIT has more than double the number of HITs completed compared to the trial with 20 tasks, and more than four times the number compared to the trials with 30, 40, or 50 tasks. When the number of tasks in each HIT is at most 20, the task completion rate is high enough to have all tasks completed before the deadline (i.e., 14 hours). The task completion rates are very close for the trials with grouping sizes 30/40/50, which can be explained by the small difference between their unit task prices ($0.00066/$0.0005/$0.0004).

However, the actual work completion rates (in terms of the percentage of total work completed) are quite different once we take the different number of tasks in each HIT into account, as shown in Figure 12(b). Perhaps somewhat surprisingly, we see that after multiplying the quantities in Figure 12(a) by the number of tasks in each HIT, the curve of the trial with grouping size … becomes significantly higher than the curves of the trials with grouping sizes … and …. This phenomenon suggests that the grouping size per HIT has a considerable effect on the work completion rate: while workers choose HITs based on the per-unit-time wage, grouping more tasks into a single HIT tends to force workers to stay on the same HIT for a longer time.
(Note that in Mechanical Turk, workers do not earn any reward until they have completed all the tasks in a HIT.)

To make the experiment results comparable, the basic settings of the dynamic pricing experiment are the same as in Section 5.4.1 (i.e., start time, deadline, total number of tasks), except that the grouping size is changed every hour based on our dynamic pricing strategy. The grouping size is chosen from 10/20/30/40/50, and the corresponding HIT acceptance rates are estimated from the fixed pricing experiment in the previous section. The worker arrival rates are estimated by averaging the normalized worker arrival data from the five fixed pricing trials.

Figure 12(c) depicts the work completion rate of the five trials (one on each day) in our experiment. As can be seen in the figure, the dynamic pricing strategy ends up completing all the tasks well before the deadline (6 hours instead of 14 hours). Furthermore, we find that the average total cost for the five trials is $3.2, which is much less (≈36% less) than the total cost of $5 for the fixed pricing strategy with grouping size 20; in fact, note that the fixed pricing strategy with grouping size 20 had an elapsed time of 8 hours, two hours more than the elapsed time of any of the trials for our strategy.

In this section, we further analyze the data collected from the previous experiments to study the behavior of workers.
Group Size       | 10 | 20 | 30 | 40 | 50
Average Accuracy |  … |  … |  … |  … |  …
Table 3: The average accuracy of answers in the fixed pricing experiment
Figure 13 and Figure 14 depict the cumulative distributions of the accuracy of workers' answers under different task price settings in the fixed pricing and dynamic pricing experiments, respectively. The two curves in the dynamic pricing experiment are the cumulative accuracy distributions of answers when the dynamic pricing strategy picks a grouping size of 20 or 50. We only plot these two curves for the dynamic pricing case because the other grouping sizes are rarely used by the dynamic pricing strategy in our experiments. Overall, we find that the curves (the five in Figure 13 and the two in Figure 14) are all very similar, indicating that pricing does not affect quality much. Note that the group-50 curve (and the group-40 curve too, to some extent) in the fixed pricing plot appears "jagged", while the other group sizes have smoother plots. This is probably because there are fewer HITs in that trial (recall that we fix the total number of questions, so as the grouping size increases, the total number of HITs decreases), and the number of possible accuracy values is larger (which means the curve will be less smooth when connecting points to draw the cumulative distribution). The average accuracies of answers in the two experiments are shown in Table 3 and Table 4, respectively. The experimental results show that the average accuracies are all reasonably good (higher than or close to 90%), and their differences are not statistically significant. This result suggests that, under our experimental conditions, pricing mainly affects whether workers choose to work on the HIT at all; once they decide to work on one of our HITs, the answers they provide are reasonably good. Studying the general correlation between task price and answer quality requires additional in-depth experiments, which are beyond the scope of this paper.

Figure 13: Answer quality under different prices in the fixed pricing experiment
Figure 14: Answer quality under different prices in the dynamic pricing experiment
Figure 15: Average number of tasks completed by each worker
Figure 12: Experiments on Mechanical Turk: (a) the HIT completion rate for the fixed pricing strategy; (b) the percentage work completion rate for the fixed pricing strategy; (c) the percentage work completion rate for the dynamic pricing strategy
Trial                          |    1 | 2 | 3 | 4 | 5
Ave. Accuracy w/ group size 20 | 92.… | … | … | … | …
Ave. Accuracy w/ group size 50 | 89.… | … | … | … | …
Overall Ave. Accuracy          |    … | … | … | … | …
Table 4: The average accuracy of answers for tasks with group sizes 20 and 50 in the dynamic pricing experiment

Figure 15 shows the average number of HITs completed by each worker under the different pricing settings of the fixed pricing experiments in Section 5.4.1. As shown in the figure, at lower task prices, workers tend to leave after they have completed one or two HITs; at higher task prices, some workers tend to keep working on the same kind of task. Note that the NHPP model in Faridani et al.'s work [17] does not explicitly model this phenomenon. Incorporating this behavior into the NHPP model could make worker arrival rate predictions more accurate, which could in turn improve the effectiveness of our dynamic pricing strategy.
6. DISCUSSION
In this section, we discuss some possible generalizations of our pricing schemes, as well as the possible impact of our dynamic pricing strategy on worker behavior.
Multiple Task Types
In some cases, we may have multiple types of tasks that need to be completed by a certain deadline. For instance, we may have 100 categorization tasks and 500 labeling tasks that all need to be completed by the same time. Multiple task types are easy to incorporate into our model: we simply represent the state as a vector (n_1, n_2, ..., n_k, t), where n_i represents the number of remaining tasks of type i. The resulting objectives, relationships, and dynamic programming-based optimization algorithms are similar.

Incorporating Quality Control for Filtering Tasks
In [37], we describe an MDP-based technique to optimize for cost and accuracy, specifically for filtering or rating tasks. At a high level, given an overall accuracy threshold, our algorithm from [37] generates a per-item quality-control strategy guaranteeing the specified accuracy with the minimum total number of questions in expectation (ignoring pricing per question). A quality-control strategy is represented using a collection of points (x, y), where x is the number of No answers for that item and y is the number of Yes answers; each point has an associated decision: either continue asking questions for that task, or stop and return PASS/FAIL, representing the fact that the task either satisfied the filtering predicate or did not. Now, we can generalize this approach, as well as the approach described in this paper, to give a solution optimized for cost, latency, and accuracy.

Consider the problem where we once again have N filtering tasks that we need to complete, and our goal is to minimize expected cost, while ensuring that accuracy is within the threshold and that tasks are completed by a certain deadline. As a first step, we generate the per-task quality-control strategy using the algorithms from [37], ensuring the minimum number of questions is used while guaranteeing that the accuracy is within the threshold. Let this quality-control strategy have k points (x, y), corresponding to k different combinations of the numbers of Yes and No answers. Our state is now (n_1, ..., n_k, remaining time) instead of (n, remaining time). Let P denote the vector (n_1, ..., n_k), representing the number of tasks at each of the k points of the quality-control strategy, and let P′ denote the corresponding vector of the number of tasks at each of the k points that the tasks transitioned to by the next time interval. We let s represent the number of additional answers between the two states (that is, the number of additional Yes/No answers).
Then, the transition probabilities are as follows:

Pr{(P, t) → (P′, t + 1) | c_{P,t}} = Pr(P → P′ | s, P) · Pois(s | λ = λ_t · p(c_{P,t}))

The latter term is as in Equation 5, while the first term can be computed using the probability machinery from [37]. Overall, the complexity is O(N^k · T · C), which can be large if k is large. (Note that when k = 1, we default to the setup from Section 3.) Thus, our problem is fundamentally challenging if k is large. Typically, though, k is often quite small, say when a small majority-vote quality-control strategy is used.

Recognizing that the problem may be intractable when k is large, we next present some approximate techniques for this problem. It remains to be seen which of these techniques is better suited to the problem and leads to better approximations. While the first technique has guarantees (but only asymptotically), the second does not have any guarantees, but is more tractable and easier to understand:
• Representing Using Posterior Probabilities:
For the cases where k is large, we can approximate the quality-control strategy generation process, as described in [36], by mapping points (x, y) to segments of the real line: [0, a), [a, 2a), ..., [1 − a, 1], where the interval [ia, (i + 1)a) represents the fact that the posterior probability of the item satisfying the filter is between ia and (i + 1)a. Thus, a given point (x, y) maps to the segment [ia, (i + 1)a) along this real line for which

ia ≤ Pr[item satisfies the filter | (x, y)] < (i + 1)a

We then regard every point (x, y) that maps to such an interval [ia, (i + 1)a) as having posterior probability ia + a/2. The algorithm stays unchanged, except that the k points are now represented approximately by these 1/a intervals, and the complexity is now O(N^{1/a} · T · C). As shown in [36], as a → 0, the optimal strategy in this representation (with intervals) tends asymptotically to the optimal strategy in the old representation (with points). This argument follows from standard arguments for discretizing continuous-state Markov decision processes, e.g., [10].
• Keeping Quality Control Separate from Pricing Optimization:
The second approximation technique treats quality control as an orthogonal problem. Once we compute the quality-control strategy, for each point (x, y) in the quality-control strategy from [37], we can compute the worst-case additional number of questions. For instance, if the strategy extends all the way to x + y = 5, then the worst-case additional number of questions at (0, 0) is 5. On the other hand, the worst-case additional number of questions at (2, 2) may be just 1, if both (3, 2) and (2, 3) (the two points reachable from (2, 2) on getting an additional answer) are end states where a decision of PASS/FAIL is made. Then, we can apply our technique from Section 3 to the problem with N′ = N × α, where α is the worst-case additional number of questions from the origin. Now, we have a strategy designed from (N′ = Nα, 0) to (0, T), where N′ represents the total worst-case number of questions across all tasks. We then run the strategy as before, while implicitly moving each task along the quality-control strategy as well, and having that influence the N′ (i.e., the total worst-case number of questions across all tasks) corresponding to where the strategy currently is. That is:

N′ = Σ_{all tasks i} (worst-case additional questions at P(i))

where P(i) denotes the point on the quality-control strategy that task i is at. We explain this using an example. Let there be 10 tasks, and let the quality-control strategy we wish to use be a majority-vote strategy with 3 questions (i.e., ask 3 questions and take the majority). Then, the worst-case number of questions at point (0, 0) in the quality-control strategy is 3, so we begin the strategy at (10 × 3, 0). After some time, let 5 tasks be at (1, 1), while 2 reach (2, 0) and 3 reach (0, 2). In the first case, the worst-case additional number of questions is 1, while the tasks in the other two groups have a worst-case additional number of questions of 0.
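The worst-case bookkeeping in this example can be sketched as a short recursion. The terminal test below encodes the majority-of-3 strategy from the example (stop as soon as one answer reaches 2); for the general strategies of [37], it would instead consult the strategy's stop/continue decision at (x, y):

```python
from functools import lru_cache

# Worst-case additional questions for the majority-of-3 example: terminal
# points are those where one answer has reached 2 (a majority is decided).
def is_terminal(x_no, y_yes):
    return x_no >= 2 or y_yes >= 2

@lru_cache(maxsize=None)
def worst_case(x_no, y_yes):
    if is_terminal(x_no, y_yes):
        return 0
    # One more answer moves us to (x+1, y) or (x, y+1); take the worse branch.
    return 1 + max(worst_case(x_no + 1, y_yes), worst_case(x_no, y_yes + 1))

# The snapshot from the example: 5 tasks at (1,1), 2 at (2,0), 3 at (0,2).
tasks = [(1, 1)] * 5 + [(2, 0)] * 2 + [(0, 2)] * 3
n_prime = sum(worst_case(x, y) for x, y in tasks)
print(worst_case(0, 0), n_prime)  # 3 worst-case questions per fresh task; N' = 5
```

The initial budget for 10 fresh tasks is 10 × worst_case(0, 0) = 30, matching the starting state (10 × 3, 0) in the example.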
Thus, overall, we are now at (5 × 1 + 2 × 0 + 3 × 0, T_i) = (5, T_i), where T_i is the current time. The reason we use the worst-case additional number of questions is to be conservative, in order to meet the deadline on time, at potentially additional cost. We could instead use the expected additional number of questions, but we might then fail to meet the deadline. The complexity of the DP algorithm for this technique is very reasonable: O(Nk · T · C). (Note that the worst-case number of questions from the origin can be at most k, but could be much smaller.)

If we had a prior distribution on difficulty, we could easily take that into account in the quality-control strategy, as described in [36].

Optimizing Tradeoff between Deadline and Budget
In some scenarios, we may have neither a fixed deadline nor a fixed budget, and we may instead want to achieve an optimal tradeoff between the two. We now focus on optimizing a linear combination of expected cost and time. Thus, our objective is now:

Q = E(cost) + α · E(latency)

We consider two scenarios: first one that makes stronger assumptions, and then a more general one. The first scenario acts as a "building block" for the second.
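Both scenarios reduce to a one-dimensional dynamic program over the number of outstanding tasks n, and rearranging each recurrence (derived in the two subsections below) gives an increment of the form Opt(n) = Opt(n − 1) + min_c [c + charge / q(c)], where q(c) is the per-step completion probability and "charge" is the latency cost per step. The sketch below uses an assumed logistic p(c) and illustrative λ and α values, not the paper's estimated parameters:

```python
import math

# Sketch, not the paper's implementation: p(c) is an assumed logistic
# acceptance model (price c in cents); lam and alpha are illustrative.
def p(c):
    return 1.0 / (1.0 + math.exp(-(c - 5.0)))

def per_task_cost(q, charge, prices):
    """Increment Opt(n) - Opt(n-1) = min over prices of [c + charge / q(c)]."""
    return min(c + charge / q(c) for c in prices)

prices = [x / 2 for x in range(1, 41)]  # candidate prices: 0.5, 1.0, ..., 20.0
lam, alpha = 0.5, 1.0                   # assumed arrival rate and latency weight

# Fixed rate: one time interval per step; exactly-one-pickup probability
# q(c) = exp(-lam p(c)) * lam p(c), latency charge alpha per interval.
fixed = per_task_cost(lambda c: math.exp(-lam * p(c)) * lam * p(c), alpha, prices)

# Per worker arrival: one arrival per step; completion probability p(c),
# latency charge alpha / lam_bar per arrival (here lam_bar = lam).
arrival = per_task_cost(p, alpha / lam, prices)

def opt(n, increment):
    # Opt(n) is linear in n, since the minimizing price is state-independent.
    return n * increment
```

Because the minimizing price does not depend on n, the optimal price is the same in every state, consistent with the O(NC) complexity of the DP over states and price choices.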
Fixed Rate:
We first focus on optimizing the objective under the assumption that the rate at which workers appear in the marketplace is fixed at λ, i.e., λ(t) = λ for all t. (This assumption is a bit more drastic than the assumption in Section 4, where we assumed that the rate is not fixed but is constant over a long period.) Given that we discretize time into unit intervals, the rate at which workers appear in the marketplace is the same as the expected number of workers who appear in the marketplace in a unit time interval.

Figure 16: State transition diagram
It turns out that, under such a scenario, we do not need to record cost or time: we do not have a deadline, and the amount of time elapsed or cost consumed until one gets to a given state is "sunk" cost/time. Thus, states are simply recorded as (n), where n is the number of outstanding tasks. We let c_n denote our price for all tasks at state (n). Our time interval discretization is set to be small enough that the likelihood of two tasks being performed within one time interval is nearly zero. We depict the set of states and transitions in Figure 16. From any state, we either stay in that state (if no task is picked up), or move to the neighboring state on the right (if one task is picked up; recall that our time granularity is small enough that we never end up having more than one task picked up). The costs for transitions are labeled on the edges, and are described in the equations below:

Pr{(n) → (n − 1) | c_n} = e^{−λp(c_n)} · λp(c_n)
cost{(n) → (n − 1) | c_n} = c_n + α
cost{(n) → (n) | c_n} = α

These equations are simply counterparts of Equations 5 and 7 in Section 3, with the additional restriction that the probability of transitioning from n to n − s, where s > 1, is 0, and with the cost of transitioning back to the same state being α (i.e., α × latency, where the latency is one time unit), while the cost of transitioning to a neighboring state is c_n + α × latency = c_n + α. Now, following the dynamic programming recipe from Section 3,

Opt(n) = min_c { [Opt(n − 1) + c + α] × Pr{(n) → (n − 1) | c} + [Opt(n) + α] × Pr{(n) → (n) | c} }

Here, given the price c that is set at state (n), either one task is completed by the next time interval, or no task is completed; these account for the two quantities within the minimization. In the latter case, we stay at the same state. Note that we need to solve for Opt(n) for each c, and then pick the smallest value; that gives the best price for state (n). The complexity of this procedure depends, once again, on the number of price choices C: the complexity is simply O(NC).

Relaxing the Linearity Assumption:
The above technique can be generalized to the more realistic assumption that we made in Section 4: that the expected total latency T is linearly correlated with the worker arrival quantity W (see Section 4.2.2 for the justification):

E[T | W] = W / λ̄

where λ̄ is the average arrival rate of workers. It then follows that:

E[T] = (1/λ̄) · E[W]

Substituting this into our objective function, we get:

Q = E(cost) + α · E(latency) = E(cost) + (α/λ̄) · E(worker arrivals)

Optimizing this new objective function is very similar to optimizing the original one: the state space is still the same, but the transitions between states are slightly different. Here we have:

Pr{(n) → (n − 1) | c_n} = p(c_n)
cost{(n) → (n − 1) | c_n} = c_n + α/λ̄
cost{(n) → (n) | c_n} = α/λ̄

where each transition represents a single worker arrival event, unlike the previous scenario, where transitions happened at each time interval. Here, once a worker arrives, whether or not they choose to work on our task determines whether we transition to the neighboring state or stay at the same state. The state transition diagram is the same as before, except that a few coefficients are different. Therefore, the same dynamic programming technique can be applied to find the optimal solution, and the complexity stays the same, i.e., O(NC).

Long-term Impact on Worker Behavior
As in any marketplace or game-theoretic scenario, some workers may learn to "game" the system, and our dynamic pricing algorithm with it. This is inevitable. In practice, however, we expect that as long as the majority of workers are part-time workers (which is certainly true in current marketplaces), they are unlikely to witness our algorithm in action and evolve their decisions to take advantage of it. Furthermore, even if many workers know about our dynamic pricing algorithm and wish to take advantage of the system, as the price increases, some workers may decide to work on all the remaining tasks at that price, leaving other workers with no work (and hence causing them to rethink their strategy). As long as the workers in the marketplace are not cooperating with each other, we expect the system to achieve a certain equilibrium in the end. Lastly, we expect our pricing techniques to be updated once in a while to reflect the current state of the marketplace; for instance, if workers no longer pick up $0.1 tasks, we may want to set our minimum price to $0.2.
7. RELATED WORK
The prior work related to ours can be placed in a few categories;we describe each of them in turn:
Pricing Schemes:
Faridani et al. [17] develop models for marketplace dynamics that we leverage in this paper. They also develop static pricing strategies that we compare against. To the best of our knowledge, there has been no other work on optimizing price apart from [17].
Control Theory:
Recent work has leveraged decision theory forimproving cost and quality in simple crowdsourcing workflows,typically using POMDPs (Partially Observable MDPs): Dan Weld’sgroup has designed strategies to dynamically choose the best deci-sion to make at any step in the workflow (refine, improve, vote, orstop), and also to dynamically switch between workflows to im-prove the overall “utility” [13, 14, 27, 28]. Kamar et al. [24] usePOMDPs to study how to best utilize participation in voluntarycrowdsourcing systems, specifically, Galaxy Zoo, an astronomicaldata set verified by human workers. The papers mentioned abovedo not provide theoretical guarantees. Our prior work also usesdecision theory for getting guarantees on cost and accuracy for fil-tering [36, 37]. None of these prior papers study the problem ofdetermining optimal pricing for tasks over time: all of them as-sume that each task has a fixed price or reward, and optimize theset of tasks to meet accuracy guarantees.
Crowd Algorithms:
There has been a lot of recent activity centered around designing data processing algorithms where the unit operations are performed by human workers, such as filtering [37], sorting and joins [21, 31], top-k [16], deduplication and clustering [20, 47], and categorization [40]. None of these papers explore the problem of pricing tasks to complete on time. Of these papers, only categorization [40] and filtering [36, 37] consider the aspect of latency, and even there, the number of round-trips is used as a proxy for latency rather than the true elapsed time.

Error Estimation:
There has been significant work on simultane-ous estimation of answers to tasks and errors of workers using theEM algorithm or other local optimization techniques. There havebeen a number of papers studying increasingly expressive modelsfor this problem, including difficulty of tasks and worker exper-tise [29, 45, 49], adversarial behavior [43], and online evaluationof workers [30, 42, 48]. There has also been work on choosingworkers for evaluating different items so as to reduce overall errorrate [35, 46]. Recent work has also tried to obtain theoretical guar-antees for both worker error estimates as well as correct labels foritems [15, 19, 22, 25]. Our work on pricing tasks is orthogonal tothis line of work, and can be combined with any of these schemesto better price a batch of tasks to complete by a given deadline.
Applications:
There are a number of useful applications of crowd-sourcing, such as sentiment analysis [41], identifying spam [34],determining search relevance [9], and translation [50].
8. CONCLUSIONS
In this paper, we developed algorithms to optimally set and vary the price of human computation tasks in a crowdsourcing marketplace in order to meet latency and cost constraints. For the monetary budget scenario, we demonstrated that static pricing strategies lead to optimal completion times, and developed efficient algorithms to find near-optimal pricing strategies. For the fixed deadline scenario, we demonstrated that our MDP-based techniques outperform fixed pricing strategies by up to 30% on simulations based on real-world crowdsourcing marketplace data and on live experiments, and are more robust to errors in estimates of marketplace parameters and predictions of future trends. Our techniques can be profitably employed in scenarios demanding repeated, large-scale use of crowdsourcing, where the cost savings can be substantial.
9. REFERENCES
[1] Crowdsourced data analysis with Clockwork Raven. https://blog.twitter.com/2012/crowdsourced-data-analysis-with-clockwork-raven
[2] Crowdsourcing Insights from eBay (Retrieved 10 January 2014). http://crowdopolis.info/james-rubinstein.pdf
[3] Mechanical Turk (Retrieved 22 July 2013).
[4] Mechanical Turk Tracker (Retrieved 20 February 2014). http://mturk-tracker.com
[5] Microsoft Wants to Turn Crowdsourcing from an Art to a Science (Retrieved 10 January 2014).
[6] Samasource Annual Report (Retrieved 10 January 2014).
[7] Translators Wanted at LinkedIn. The Pay? $0 an Hour. The New York Times (Retrieved 10 January 2014).
[8] P. Wais, S. Lingamneni, D. Cook, J. Fennell, B. Goldenberg, D. Lubarov, D. Marin, and H. Simons. Towards building a high-quality workforce with Mechanical Turk. In Computational Social Science and the Wisdom of Crowds, NIPS Workshop, 2010.
[9] O. Alonso, D. E. Rose, and B. Stewart. Crowdsourcing for relevance evaluation. SIGIR Forum, 42(2):9–15, 2008.
[10] D. P. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific, Belmont, MA, 1995.
[11] P. Bohannon, S. Merugu, C. Yu, V. Agarwal, P. DeRose, A. S. Iyer, A. Jain, V. Kakade, M. Muralidharan, R. Ramakrishnan, and W. Shen. Purple SOX extraction management system. SIGMOD Record, 37(4):21–27, 2008.
[12] S. P. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[13] N. Bruno. Minimizing database repros using language grammars. In EDBT, pages 382–393, 2010.
[14] P. Dai, Mausam, and D. S. Weld. Decision-theoretic control of crowd-sourced workflows. In AAAI, 2010.
[15] N. Dalvi, A. Dasgupta, R. Kumar, and V. Rastogi. Aggregating crowdsourced binary ratings. In WWW, pages 285–294, 2013.
[16] S. B. Davidson, S. Khanna, T. Milo, and S. Roy. Using the crowd for top-k and group-by queries. In ICDT, pages 225–236, 2013.
[17] S. Faridani, B. Hartmann, and P. G. Ipeirotis. What's the right price? Pricing tasks for finishing on time. In Proceedings of HCOMP11: The 3rd Workshop on Human Computation, 2011.
[18] M. J. Franklin, D. Kossmann, T. Kraska, S. Ramesh, and R. Xin. CrowdDB: Answering queries with crowdsourcing. In SIGMOD, 2011.
[19] A. Ghosh, S. Kale, and P. McAfee. Who moderates the moderators? Crowdsourcing abuse detection in user-generated content. In EC, pages 167–176, 2011.
[20] R. Gomes, P. Welinder, A. Krause, and P. Perona. Crowdclustering. In NIPS, pages 558–566, 2011.
[21] S. Guo, A. Parameswaran, and H. Garcia-Molina. So who won? Dynamic max discovery with the crowd. In SIGMOD Conference, pages 385–396, 2012.
[22] M. Joglekar, H. Garcia-Molina, and A. Parameswaran. Evaluating the crowd with confidence. In SIGKDD, 2013.
[23] M. Joglekar, H. Garcia-Molina, and A. Parameswaran. Evaluating the crowd with confidence. In KDD, 2013.
[24] E. Kamar, S. Hacker, and E. Horvitz. Combining human and machine intelligence in large-scale crowdsourcing. In AAMAS, pages 467–474, 2012.
[25] D. Karger, S. Oh, and D. Shah. Efficient crowdsourcing for multi-class labeling. In SIGMETRICS, pages 81–92, 2013.
[26] D. R. Karger, S. Oh, and D. Shah. Budget-optimal task allocation for reliable crowdsourcing systems. CoRR, abs/1110.3564, 2011.
[27] C. H. Lin, Mausam, and D. S. Weld. Crowdsourcing control: Moving beyond multiple choice. In UAI, pages 491–500, 2012.
[28] C. H. Lin, Mausam, and D. S. Weld. Dynamically switching between synergistic workflows for crowdsourcing. In AAAI, 2012.
[29] Q. Liu, J. Peng, and A. Ihler. Variational inference for crowdsourcing. In NIPS, pages 701–709, 2012.
[30] X. Liu, M. Lu, B. C. Ooi, Y. Shen, S. Wu, and M. Zhang. CDAS: A crowdsourcing data analytics system. Proc. VLDB Endow., 5(10):1040–1051, June 2012.
[31] A. Marcus, E. Wu, D. Karger, S. Madden, and R. Miller. Human-powered sorts and joins. In VLDB, 2012.
[32] W. A. Massey, G. A. Parker, and W. Whitt. Estimating the parameters of a nonhomogeneous Poisson process with linear rate. Telecommunication Systems, 5(2):361–388, 1996.
[33] D. McFadden. Conditional logit analysis of qualitative choice behavior. 1973.
[34] M. Motoyama, K. Levchenko, C. Kanich, D. McCoy, G. M. Voelker, and S. Savage. Re: CAPTCHAs: Understanding CAPTCHA-solving services in an economic context. In USENIX Security Symposium, pages 435–462, 2010.
[35] P. Donmez et al. Efficiently learning the accuracy of labeling sources for selective sampling. In KDD, pages 259–268, 2009.
[36] A. Parameswaran, S. Boyd, H. Garcia-Molina, A. Gupta, N. Polyzotis, and J. Widom. Optimal crowd-powered rating and filtering algorithms. Technical report, Stanford University, 2013.
[37] A. Parameswaran, H. Garcia-Molina, H. Park, N. Polyzotis, A. Ramesh, and J. Widom. CrowdScreen: Algorithms for filtering data with humans. In SIGMOD Conference, pages 361–372, 2012.
[38] A. Parameswaran, H. Park, H. Garcia-Molina, N. Polyzotis, and J. Widom. Deco: Declarative crowdsourcing. In CIKM, pages 1203–1212, 2012.
[39] A. Parameswaran and N. Polyzotis. Answering queries using humans, algorithms and databases. In CIDR, pages 160–166, 2011.
[40] A. Parameswaran, A. D. Sarma, H. Garcia-Molina, N. Polyzotis, and J. Widom. Human-assisted graph search: It's okay to ask questions.
PVLDB ,4(5):267–278, 2011.[41] R. Snow et al. Cheap and fast - but is it good? evaluating non-expertannotations for natural language tasks. In
EMNLP , pages 254–263, 2008.[42] A. Ramesh, A. Parameswaran, H. Garcia-Molina, and N. Polyzotis. Identifyingreliable workers swiftly. Technical report, Stanford University, September 2012.[43] V. C. Raykar and S. Yu. Eliminating spammers and ranking annotators forcrowdsourced labeling tasks.
Journal of Machine Learning Research ,13:491–518, 2012.[44] S. M. Ross.
Stochastic processes , volume 2. John Wiley New York, 1996.[45] V. C. Raykar et al. Supervised learning from multiple experts: whom to trustwhen everyone lies a bit. In
ICML , page 112, 2009.[46] V. S. Sheng et al. Get another label? improving data quality and data miningusing multiple, noisy labelers. In
KDD , pages 614–622, 2008.[47] J. Wang, T. Kraska, M. J. Franklin, and J. Feng. Crowder: Crowdsourcing entityresolution.
PVLDB , 5(11):1483–1494, 2012.[48] P. Welinder and P. Perona. Online crowdsourcing: rating annotators andobtaining cost-effective labels. In
CVPR , 2010.[49] J. Whitehill, P. Ruvolo, T. Wu, J. Bergsma, and J. R. Movellan. Whose voteshould count more: Optimal integration of labels from labelers of unknownexpertise. In
NIPS , pages 2035–2043. 2009.[50] O. Zaidan and C. Callison-Burch. Feasibility of human-in-the-loop minimumerror rate training. In