Collective Schedules: Scheduling Meets Computational Social Choice
Fanny Pascual, Sorbonne Université, CNRS, LIP6, Paris, France [email protected]
Krzysztof Rzadca, University of Warsaw, Warsaw, Poland [email protected]
Piotr Skowron, University of Warsaw, Warsaw, Poland [email protected]
March 21, 2018
Abstract
When scheduling public works or events in a shared facility, one needs to accommodate the preferences of a population. We formalize this problem by introducing the notion of a collective schedule. We show how to extend fundamental tools from social choice theory (positional scoring rules, the Kemeny rule and the Condorcet principle) to collective scheduling. We study the computational complexity of finding collective schedules. We also experimentally demonstrate that optimal collective schedules can be found for instances with realistic sizes.
Major public infrastructure projects, such as extending the city subway system, are often phased. As workforce, machines and yearly budgets are limited, phases have to be developed one by one. Some phases are inherently longer-lasting than others. Moreover, individual citizens have different preferred orders of phases. Should the construction start with a long phase with strong support, or rather with a less popular phase that, however, will be finished faster? If the long phase starts first, the citizens supporting the short phase would have to wait significantly longer. Consider another example: planning events in a single lecture theater for a large, varied audience. The theater needs to be shared among different groups. Some events last just a few hours, while others last multiple days. What is the optimal schedule?

We formalize these and similar questions by introducing the notion of a collective schedule, a plan that takes into account both jobs' durations and their societal support. The central idea stems from the observation that the problem of finding a socially optimal collective schedule is closely related to the problem of aggregating agents' preferences, one of the central problems studied in social choice theory [3]. However, differences in jobs' lengths have to be explicitly considered. Let us illustrate these similarities through the following example.

Consider a collection of jobs all having the same duration. The jobs have to be processed sequentially (one by one). Different agents might have different preferred schedules of processing these jobs. Since each agent would like all the jobs to be executed as soon as possible, the preferred schedule of each agent does not contain "gaps" (idle times), and so such a preferred schedule can be viewed as an order over the set of jobs, and can be interpreted as a preference relation. Similarly, the resulting collective schedule can be viewed as an aggregated preference relation.
From this perspective, it is natural to apply tools from social choice theory to find a socially desired collective schedule.

Yet, the tools of social choice cannot always be applied directly. The scheduling model is typically much richer, and contains additional elements. In particular, when jobs' durations vastly differ, these differences must be taken into account when constructing a collective schedule. For instance, imagine that we are dealing with two jobs: one very short, $J_s$, and one very long, $J_\ell$. Further, imagine that 55% of the population prefers the long job to be executed first and that the remaining 45% has exactly opposite preferences. If we disregard the jobs' durations, then perhaps every decision maker would schedule $J_\ell$ before $J_s$. However, starting with $J_s$ affects 55% of the population just slightly (as $J_\ell$ is just slightly delayed compared to their preferred schedules). In contrast, starting with $J_\ell$ affects 45% of the population significantly (as $J_s$ is severely delayed).

We explore the following question: How can we meaningfully apply the classic tools from social choice theory to find a collective schedule? The key idea behind this work is to use fundamental concepts from both fields to highlight the new perspectives.

Scheduling offers an impressive collection of models, tools and algorithms which can be applied to a broad class of problems. It is impossible to cover all of them in a single work. We use perhaps the most fundamental (although still non-trivial) scheduling model: a single processor executing a set of independent jobs. This model is already rich enough to describe significant real-world problems (such as the public works or the lecture theater introduced earlier).
At the same time, such a model, fundamental, well-studied and stripped of orthogonal issues, enables us to highlight the new elements brought by social choice.

Similarly, we focus on three well-known and extensively studied tools from social choice theory: positional scoring rules, the Kemeny rule and the Condorcet principle. Under a positional scoring rule, the score that an object receives from an agent is derived only on the basis of the position of this object in the agent's preference ranking; the objects are then ranked in the descending order of their total scores received from all the agents. The Kemeny rule uses the concept of distances between rankings. It selects a ranking which minimizes the sum of the swap distances to the preference rankings of all the agents. The Condorcet principle states that if there exists an object that is preferred to any other object by the majority of agents, then this object should be put on the top of the aggregated ranking. The Condorcet principle can be generalized to the remaining ranking positions. Assume that the graph of the preferences of the majority of agents is acyclic, i.e., there exists no sequence of objects $o_1, \ldots, o_\ell$ such that $o_1$ is preferred by the majority of agents to $o_2$, $o_2$ to $o_3$, ..., $o_{\ell-1}$ to $o_\ell$, and $o_\ell$ to $o_1$. Whenever an object $o$ is preferred by the majority of agents to another object $q$, $o$ should be put before $q$ in the aggregated ranking.

Naturally, these three notions can be directly applied to find a collective schedule. Yet, as we argued in our example with a long and a short job, this can lead to intuitively suboptimal schedules, because they do not consider significantly different processing times. We propose extensions of these tools to take into account the lengths of the jobs. We also analyze their computational complexity.

Scheduling:
The two most related scheduling models apply concepts from game theory and multiagent optimization. The selfish job model [18, 27] assumes that each job has a single owner trying to minimize its completion time and that the jobs compete for processors. The multi-organizational model [11] assumes that a single organization owns and cares about multiple jobs. Our work complements these with a third perspective: not only does each job have multiple "owners", but they also care about all jobs (albeit to a different degree).

In multiagent scheduling [2], agents have different optimization goals (e.g., different functions or weights). The system's objective is to find all Pareto-optimal schedules, or a single Pareto-optimal schedule (optimizing one agent's goal with constraints on admissible values for other goals). In contrast, our aim is to propose rules that construct a single, compromise schedule. This compromise stems from social choice methods and tools. Moreover, our setting is motivated by problems in which the number of agents is large. To the best of our knowledge, the existing literature on multiagent scheduling focuses on cases with a few (e.g., two) agents.
Computational social choice:
For an overview of tools and methods for aggregating agents' preferences see the book of Arrow et al. [3]. Fischer et al. [15] overview the computational complexity of finding Kemeny rankings. Caragiannis et al. [7] discuss the computational complexity of finding winners according to a number of Condorcet-consistent methods.

Typically in social choice, an aggregated ranking is created to establish the collective preference relation, and to eventually select a single best alternative (sometimes with a few runners-up). Thus, the agents usually do not care about the order of the candidates in the further part of the collective ranking. In our model the agents are interested in the whole output ranking. We can thus implement fairness: the agents who are dissatisfied with the order at the beginning of a collective schedule might be compensated in the further part of the schedule. Thus, our approach is closer to the recent works of Skowron et al. [26] and Celis et al. [8] analyzing fairness of collective rankings.

In participatory budgeting [6, 16, 24, 13, 4] agents express preferences over projects which have different costs. The goal is to choose a socially optimal set of items with a total cost not exceeding the budget. Thus, in a way, participatory budgeting extends the knapsack problem similarly to how we extend scheduling.
We use standard scheduling notation and definitions from the book of Brucker [5], unless otherwise stated. For each integer $t$, by $[t]$ we denote the set $\{1, \ldots, t\}$. Let $N = [n]$ be the set of $n$ agents (voters) and let $\mathcal{J} = \{J_1, \ldots, J_m\}$ be the set of $m$ jobs (note that in scheduling $m$ is typically used to denote the number of machines; we deliberately abuse this notation as our results are for a single machine). For a job $J_i$, by $p_i \in \mathbb{N}$ we denote its processing time (also called duration or size), i.e., the number of time units $J_i$ requires to be completed. We consider an off-line problem, i.e., the jobs $\mathcal{J}$ are known in advance. Jobs are ready to be processed (there are no release dates). For each job $J_i$ its processing time $p_i$ is known in advance (clairvoyance, a standard assumption in scheduling theory). Once started, a job cannot be interrupted until it completes (we do not allow for preemption of the jobs).

There is a single machine that executes all the jobs. A schedule $\sigma : \mathcal{J} \to \mathbb{N}$ is a function that assigns to each job $J_i$ its start time $\sigma(J_i)$, such that no two jobs $J_k$, $J_\ell$ execute simultaneously. Thus, either $\sigma(J_k) \geq \sigma(J_\ell) + p_\ell$ or $\sigma(J_\ell) \geq \sigma(J_k) + p_k$. By $C_i(\sigma)$ we denote the completion time of job $J_i$: $C_i(\sigma) = \sigma(J_i) + p_i$. We assume that a schedule has no gaps: for each job $J_i$, except the job that completes last, there exists a job $J_j$ such that $C_i(\sigma) = \sigma(J_j)$. Let $\mathcal{S}$ denote the set of all possible schedules for the set of jobs $\mathcal{J}$.

Each agent wants all jobs to be completed as soon as possible, yet agents differ in their views on the relative importance of the jobs. We assume that each agent $a$ has a certain preferred schedule $\sigma_a \in \mathcal{S}$, and that, when building $\sigma_a$, the agent is aware of the processing times of the jobs. In particular, $\sigma_a$ does not have to directly correspond to the relative importance of jobs. For instance, if in $\sigma_a$ a short job $J_s$ precedes a long job $J_\ell$, then this does not necessarily mean that $a$ considers $J_s$ more important than $J_\ell$. Agent $a$ might consider $J_\ell$ more important, but she might prefer a marginally less important job $J_s$ to be completed sooner, as it would delay $J_\ell$ only a bit.

A schedule can be encoded as a (transitive, asymmetric) binary relation: $J_i \succ_{\sigma_a} J_k \Leftrightarrow \sigma_a(J_i) < \sigma_a(J_k)$. E.g., $J_1 \succ_{\sigma_a} J_2 \succ_{\sigma_a} \ldots \succ_{\sigma_a} J_m$ means that agent $a$ wants $J_1$ to be processed first, $J_2$ second, and so on. We will denote such a schedule as $(J_1, J_2, \ldots, J_m)$.

We call a vector of preferred schedules, one for each agent, a preference profile. By $\mathcal{P}$ we denote the set of all preference profiles of the agents. A scheduling rule $R : \mathcal{P} \to \mathcal{S}$ is a function which takes a preference profile as an input and returns a collective schedule.

In the remaining part of this section we propose different methods in which the preference profile is used to evaluate a proposed collective schedule $\sigma$ (and thus, to construct a scheduling rule $R$). All the proposed methods extrapolate information from $\sigma_a$ (a preferred schedule) to evaluate $\sigma$. Such an extrapolation is common in social choice: in participatory budgeting it is typical to ask each agent to provide a single set of items [6, 16, 24, 4] (instead of preferences over sets of items); similarly, in multiwinner elections, each agent provides separable preferences over candidates [25, 14]. Alternatively, we could ask an agent to express her preferences over all possible schedules. This approach is also common in other areas of social choice (e.g., in the model of voting in combinatorial domains [19]), yet it requires eliciting an exponential amount of information from the agents. There also exist middle-ground approaches, using specifically designed languages, such as CP-nets, for expressing preferences.
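The model above can be made concrete with a small sketch. Since a schedule has no gaps, it is fully described by an order of the jobs; start and completion times then follow from the processing times. The function names (`completion_times`, `start_times`, `precedes`) and the three-job instance are illustrative assumptions, not part of the paper.

```python
from typing import Dict, Sequence


def completion_times(order: Sequence[str], p: Dict[str, int]) -> Dict[str, int]:
    """Completion time C_i of each job when jobs run back to back in `order`."""
    t = 0
    completion = {}
    for job in order:
        t += p[job]          # no gaps and no preemption: the job ends p_i later
        completion[job] = t
    return completion


def start_times(order: Sequence[str], p: Dict[str, int]) -> Dict[str, int]:
    """Start time sigma(J_i) = C_i(sigma) - p_i of each job."""
    c = completion_times(order, p)
    return {job: c[job] - p[job] for job in order}


def precedes(order: Sequence[str], a: str, b: str) -> bool:
    """True iff job `a` is scheduled before job `b` (the relation a > b in sigma)."""
    return order.index(a) < order.index(b)


# Hypothetical instance: three jobs with processing times 3, 1 and 2.
p = {"J1": 3, "J2": 1, "J3": 2}
sigma_a = ("J2", "J1", "J3")          # preferred schedule of some agent a
print(start_times(sigma_a, p))        # start time of every job
print(completion_times(sigma_a, p))   # completion time of every job
```

Representing a schedule as a tuple of job names keeps the "no gaps" assumption implicit: idle time simply cannot be expressed.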
In classic social choice, positional scoring rules are perhaps the most straightforward tools for aggregating agents' preferences, and the most commonly used in practice. Informally, under a positional scoring rule each agent $a$ assigns a score to each candidate $c$ (a job, in our case), which depends only on the position of $c$ in $a$'s preference ranking. For each candidate, the scores that she receives from all the agents are summed up, and the candidates are ranked in the descending order of their total scores.

There is a natural way to adapt this concept. For an increasing function $h : \mathbb{N} \to \mathbb{R}$ and a job $J$, we define the $h$-score of $J$ by applying $h$, in each preferred schedule, to the total duration of the jobs scheduled after $J$, and summing over all agents:

$$h\text{-score}(J) = \sum_{a \in N} h\Big( \sum_{J_i \,:\, J \succ_{\sigma_a} J_i} p_i \Big).$$

The $h$-psf-rule (psf for positional scoring function) schedules the jobs by their descending $h$-scores. If jobs are unit-size ($p_i = 1$), then $h$-score($J$) is simply the score that $J$ would get from the classic positional scoring rule induced by $h$. For the identity function $h_{id}(x) = x$, the $h_{id}$-psf-rule corresponds to the Borda voting method adapted to collective scheduling.

The so-defined scheduling methods differ from traditional positional scoring rules by taking into account the processing times of the jobs:

1. The score that a job $J$ receives from an agent $a$ depends on the total processing time, rather than on the number, of the jobs that $J$ precedes in schedule $\sigma_a$.
2. When scoring a job $J$ we sum the durations of the jobs scheduled after $J$, rather than before it. This implicitly favors jobs with lower processing times. Indeed, consider two preferred schedules, $\sigma$ and $\tau$, identical until time $t$, at which a long job $J_\ell$ is scheduled in $\sigma$, and a short job $J_s$ is scheduled in $\tau$. Since $J_s$ is shorter, the total size of the jobs succeeding $J_s$ in $\tau$ is larger than the total size of the jobs succeeding $J_\ell$ in $\sigma$. Consequently, $J_s$ gets a higher score from $\tau$ than $J_\ell$ gets from $\sigma$.

However, this implicit preference for short jobs seems insufficient, as illustrated by the following example.

Example 1. Consider three jobs, $J_{\ell,1}$, $J_{\ell,2}$, $J_s$, with the processing times $\ell$, $\ell$, and 1, respectively. Assume that $\ell \gg 1$, and consider the following preferred schedules of agents:

$3n/8 + \epsilon$ of agents: $J_{\ell,1} \succ_\sigma J_{\ell,2} \succ_\sigma J_s$
$3n/8 + \epsilon$ of agents: $J_{\ell,2} \succ_\sigma J_{\ell,1} \succ_\sigma J_s$
$n/8 - \epsilon$ of agents: $J_s \succ_\sigma J_{\ell,1} \succ_\sigma J_{\ell,2}$
$n/8 - \epsilon$ of agents: $J_s \succ_\sigma J_{\ell,2} \succ_\sigma J_{\ell,1}$

By the $h_{id}$-psf-rule, $J_{\ell,1}$ and $J_{\ell,2}$ are scheduled before $J_s$. However, starting with $J_s$ would delay $J_{\ell,1}$ and $J_{\ell,2}$ by only one time unit, while starting with $J_{\ell,1}$ and $J_{\ell,2}$ delays $J_s$ by $2\ell$, an arbitrarily large value. Moreover, $J_s$ is put first by roughly 1/4 of agents, a significant fraction.

Example 1 demonstrates that pure social choice theory does not offer tools appropriate for collective scheduling (we will provide more arguments to support this statement throughout the text). To address such issues we propose an approach that builds upon both social choice and scheduling theory.
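The positional scoring approach and Example 1 can be checked numerically. The following is a minimal sketch under stated assumptions: the profile is stored as (schedule, number of agents) pairs, the concrete values $n = 80$, $\epsilon = 1$ and $\ell = 100$ are chosen only for illustration, and ties between equal scores are broken by job name, a choice the text does not prescribe.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Profile = List[Tuple[Sequence[str], int]]   # (preferred schedule, number of agents)


def h_score(job: str, profile: Profile, p: Dict[str, int],
            h: Callable[[int], float] = lambda x: x) -> float:
    """Sum over agents of h(total duration of the jobs scheduled after `job`)."""
    total = 0
    for order, count in profile:
        after = order[order.index(job) + 1:]
        total += count * h(sum(p[j] for j in after))
    return total


def h_psf_rule(profile: Profile, p: Dict[str, int],
               h: Callable[[int], float] = lambda x: x) -> List[str]:
    """Schedule jobs by descending h-score (ties broken by job name: an assumption)."""
    return sorted(sorted(p), key=lambda j: -h_score(j, profile, p, h))


# Example 1 with illustrative values n = 80, epsilon = 1, l = 100.
p = {"Jl1": 100, "Jl2": 100, "Js": 1}
profile = [
    (("Jl1", "Jl2", "Js"), 31),   # 3n/8 + epsilon agents
    (("Jl2", "Jl1", "Js"), 31),   # 3n/8 + epsilon agents
    (("Js", "Jl1", "Jl2"), 9),    # n/8 - epsilon agents
    (("Js", "Jl2", "Jl1"), 9),    # n/8 - epsilon agents
]
for job in sorted(p):
    print(job, h_score(job, profile, p))
print(h_psf_rule(profile, p))     # the short job ends up last
```

With the default identity function this is the Borda-like $h_{id}$-psf-rule; passing another increasing `h` gives other members of the family.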
A cost function quantifies how a given schedule $\tau$ differs from an agent's preferred schedule $\sigma$. In this section, we adapt to our model classic costs used in scheduling and in social choice. We then show how to aggregate these costs among agents in order to produce a single measure of the quality of a schedule. This approach allows us to construct a family of scheduling methods that, in some sense, extend the classic Kemeny rule.

Formally, a cost function $f$ maps a pair of schedules, $\tau$ and $\sigma$, to a non-negative real value. We analyze the following cost functions. Below, $\tau$ denotes a collective schedule the quality of which we want to assess, while $\sigma$ denotes the preferred schedule of a single agent.

The first group of functions, swap costs, takes into account only the orders of jobs in the two schedules (ignoring the processing times), and thus directly corresponds to costs from social choice.

1. The Kendall [17] tau (or swap) distance ($K$) measures the number of swaps of adjacent jobs needed to turn one schedule into another. We use an equivalent definition that counts all pairs of jobs executed in a non-preferred order: $K(\tau, \sigma) = \big| \{ (k, \ell) : J_k \succ_\tau J_\ell \text{ and } J_\ell \succ_\sigma J_k \} \big|$.
2. Spearman distance ($S$). Let $\mathrm{pos}(J, \pi)$ denote the position of job $J$ in a schedule $\pi$, i.e., the number of jobs scheduled before $J$ in $\pi$. The Spearman distance is defined as: $S(\tau, \sigma) = \sum_{J \in \mathcal{J}} \big| \mathrm{pos}(J, \sigma) - \mathrm{pos}(J, \tau) \big|$.

The second group of functions, delay costs, uses the completion times $\{C_i(\sigma) : J_i \in \mathcal{J}\}$ of jobs in the preferred schedule $\sigma$ (and thus, indirectly, the jobs' lengths). The completion times form the jobs' due dates, $d_i = C_i(\sigma)$. A delay cost then quantifies how far the proposed completion times $\{c_i = C_i(\tau) : J_i \in \mathcal{J}\}$ are from their due dates $\{d_i\}$ by one of the six classic criteria defined in Brucker [5]:

Tardiness (T): $T(c_i, d_i) = \max(0, c_i - d_i)$.
Unit penalties (U), counting how many jobs are late: $U(c_i, d_i) = 1$ if $c_i > d_i$, and $0$ otherwise.
Lateness (L), similar to tardiness, but including a bonus for being early: $L(c_i, d_i) = c_i - d_i$.
Earliness (E): $E(c_i, d_i) = \max(0, d_i - c_i)$.
Absolute deviation (D): $D(c_i, d_i) = |c_i - d_i|$.
Squared deviation (SD): $SD(c_i, d_i) = (c_i - d_i)^2$.

Each such criterion $f \in \{T, U, L, E, D, SD\}$ naturally induces the corresponding delay cost of an agent, $f(\tau, \sigma)$:

$$f(\tau, \sigma) = \sum_{J_i \in \mathcal{J}} f\big( C_i(\tau), C_i(\sigma) \big).$$

In this work, we mostly focus on the tardiness $T$, which is both easy to interpret for our motivating examples and the most extensively studied in scheduling. However, the remaining functions are of interest as well. $U$ and $L$ are similar to $T$: the sooner a task is completed, the better. The remaining three measures ($E$, $D$, and $SD$) penalize the jobs which are executed before their "preferred times". However, each job executed earlier causes other jobs to be executed later (e.g., after their due times). Thus, these penalties quantify the unnecessary (wasted) promotion of jobs executed too early (causing other jobs to be executed too late).

By restricting the instances to unit-size jobs, we can relate delay and swap costs. The Spearman distance $S$ has the same value as the absolute deviation $D$ (by definition), and twice that of $T$:

Proposition 1.
For unit-size jobs it holds that $S(\sigma, \tau) = 2\,T(\sigma, \tau)$, for all schedules $\sigma, \tau$.

Proof. Observe that for unit-size jobs the tardiness measure can be expressed as:

$$T(\tau, \sigma) = \sum_{J :\, \mathrm{pos}(J,\tau) > \mathrm{pos}(J,\sigma)} \big( \mathrm{pos}(J,\tau) - \mathrm{pos}(J,\sigma) \big).$$

Since $\sum_J \mathrm{pos}(J,\tau) = \sum_J \mathrm{pos}(J,\sigma)$, we get that:

$$0 = \sum_J \big( \mathrm{pos}(J,\tau) - \mathrm{pos}(J,\sigma) \big) = \sum_{J :\, \mathrm{pos}(J,\tau) > \mathrm{pos}(J,\sigma)} \big( \mathrm{pos}(J,\tau) - \mathrm{pos}(J,\sigma) \big) + \sum_{J :\, \mathrm{pos}(J,\tau) < \mathrm{pos}(J,\sigma)} \big( \mathrm{pos}(J,\tau) - \mathrm{pos}(J,\sigma) \big),$$

because the jobs with $\mathrm{pos}(J,\tau) = \mathrm{pos}(J,\sigma)$ contribute zero. Thus:

$$\sum_{J :\, \mathrm{pos}(J,\tau) > \mathrm{pos}(J,\sigma)} \big| \mathrm{pos}(J,\tau) - \mathrm{pos}(J,\sigma) \big| = \sum_{J :\, \mathrm{pos}(J,\tau) < \mathrm{pos}(J,\sigma)} \big| \mathrm{pos}(J,\tau) - \mathrm{pos}(J,\sigma) \big|.$$

And, consequently:

$$S(\tau, \sigma) = \sum_J \big| \mathrm{pos}(J,\tau) - \mathrm{pos}(J,\sigma) \big| = \sum_{J :\, \mathrm{pos}(J,\tau) > \mathrm{pos}(J,\sigma)} \big| \mathrm{pos}(J,\tau) - \mathrm{pos}(J,\sigma) \big| + \sum_{J :\, \mathrm{pos}(J,\tau) < \mathrm{pos}(J,\sigma)} \big| \mathrm{pos}(J,\tau) - \mathrm{pos}(J,\sigma) \big| = 2 \sum_{J :\, \mathrm{pos}(J,\tau) > \mathrm{pos}(J,\sigma)} \big( \mathrm{pos}(J,\tau) - \mathrm{pos}(J,\sigma) \big) = 2\,T(\tau, \sigma).$$

This completes the proof.

Footnote: The considered metrics also have natural interpretations in other, more specific settings. E.g., the earliness $E$ is useful if each task represents a (collective) work to be done by the agents (workers) and when agents do not want to work before their preferred start times. Similarly, $D$ and $SD$ can be used when an agent wants each task to be executed exactly at the preferred time.

Since different agents can have different preferred schedules, in order to score a proposed schedule $\tau$ we need to aggregate the costs across all agents. We will consider three classic aggregations:

The sum ($\Sigma$): $\sum_{a \in N} f(\tau, \sigma_a)$, a utilitarian aggregation.
The max: $\max_{a \in N} f(\tau, \sigma_a)$, an egalitarian aggregation.
The $L_p$ norm ($L_p$): $\sqrt[p]{\sum_{a \in N} \big( f(\tau, \sigma_a) \big)^p}$, with a parameter $p \geq 1$. The $L_p$ norms form a spectrum of aggregations between the sum ($L_1$) and the max ($L_\infty$).

For a cost function $f \in \{K, S, T, U, L, E, D, SD\}$ and an aggregation $\alpha \in \{\Sigma, \max, L_p\}$, by $\alpha$-$f$ we denote a scheduling rule returning a schedule that minimizes the $\alpha$-aggregation of the $f$-costs of the agents. In particular, for unit-size jobs the $\Sigma$-$T$ rule is equivalent to $\Sigma$-$S$ and to $\Sigma$-$D$, and $\Sigma$-$K$ is simply the Kemeny rule.

Scheduling based on cost functions avoids the problems exposed by Example 1 (indeed, for that instance, e.g., the $\Sigma$-$T$ rule starts with the short job $J_s$). Additionally, these methods satisfy some naturally appealing axiomatic properties, such as reinforcement, which is a particularly natural requirement in our case.

Definition 1 (Reinforcement). A scheduling rule $R$ satisfies reinforcement iff for any two groups of agents $N_1$ and $N_2$, if a schedule $\sigma$ is selected by $R$ both for $N_1$ and for $N_2$, then it should also be selected for the joint instance $N_1 \cup N_2$.

Proposition 2.
All $\Sigma$-$f$ scheduling rules satisfy reinforcement.

In the previous section we introduced several scheduling rules, all based on the notion of a distance between schedules. Thus, these scheduling rules are closely related to the Kemeny voting system. We now take a different approach. We start from desired properties of a collective schedule and design scheduling rules satisfying them.

Pareto efficiency is one of the most accepted axioms in social choice theory. Below we use a formulation analogous to the one used in voting theory (based on swaps in preferred schedules).
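Before the axiomatic rules are developed, the cost-based rules of the previous section can be illustrated with a brute-force sketch. It implements the tardiness cost $T$, the Kendall tau distance $K$, and a $\Sigma$-$f$ rule that enumerates all $m!$ schedules, so it is only usable for very small instances. The helper names, the (schedule, count) profile encoding, and the concrete values $n = 8$ and $\ell = 10$ for the instance of Example 1 are assumptions made for illustration; ties are broken by enumeration order.

```python
from itertools import permutations
from typing import Callable, Dict, List, Sequence, Tuple

Profile = List[Tuple[Sequence[str], int]]   # (preferred schedule, number of agents)


def completion_times(order: Sequence[str], p: Dict[str, int]) -> Dict[str, int]:
    """Completion times when jobs run back to back in `order`."""
    t, completion = 0, {}
    for job in order:
        t += p[job]
        completion[job] = t
    return completion


def tardiness(tau: Sequence[str], sigma: Sequence[str], p: Dict[str, int]) -> int:
    """T(tau, sigma): total delay of jobs in tau past their due dates C_i(sigma)."""
    c, d = completion_times(tau, p), completion_times(sigma, p)
    return sum(max(0, c[j] - d[j]) for j in p)


def kendall_tau(tau: Sequence[str], sigma: Sequence[str], p: Dict[str, int]) -> int:
    """K(tau, sigma): pairs ordered one way in tau and the other way in sigma."""
    pos = {j: i for i, j in enumerate(sigma)}
    jobs = list(tau)
    return sum(1 for i, a in enumerate(jobs) for b in jobs[i + 1:] if pos[a] > pos[b])


def sum_rule(profile: Profile, p: Dict[str, int],
             cost: Callable[[Sequence[str], Sequence[str], Dict[str, int]], int]):
    """Sigma-f rule by exhaustive search: minimise the summed cost over all agents."""
    def total(tau):
        return sum(count * cost(tau, sigma, p) for sigma, count in profile)
    best = min(permutations(sorted(p)), key=total)
    return best, total(best)


# Example 1 with illustrative values n = 8 (epsilon = 0) and l = 10.
p = {"Jl1": 10, "Jl2": 10, "Js": 1}
profile = [
    (("Jl1", "Jl2", "Js"), 3),
    (("Jl2", "Jl1", "Js"), 3),
    (("Js", "Jl1", "Jl2"), 1),
    (("Js", "Jl2", "Jl1"), 1),
]
print(sum_rule(profile, p, tardiness))     # Sigma-T: the short job goes first
print(sum_rule(profile, p, kendall_tau))   # Sigma-K (Kemeny): the short job goes last
```

On this instance the duration-aware $\Sigma$-$T$ rule and the duration-blind Kemeny rule disagree about where the short job belongs, which is the contrast the text draws.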
Definition 2 (Pareto efficiency). A scheduling rule $R$ satisfies Pareto efficiency iff for each pair of jobs, $J_k$ and $J_\ell$, and for each preference profile $\sigma = (\sigma_1, \ldots, \sigma_n) \in \mathcal{P}$ such that for each $a \in N$ we have $J_k \succ_{\sigma_a} J_\ell$, it holds that $J_k \succ_{R(\sigma)} J_\ell$.

In other words, if all agents prefer $J_k$ to be scheduled before $J_\ell$, then in the collective schedule $J_k$ should be before $J_\ell$. Curiously, the total tardiness $\Sigma$-$T$ rule does not satisfy Pareto efficiency:

Example 2. Consider an instance with 3 jobs $J_1$, $J_2$, $J_3$ with lengths 20, 5, and 1, respectively, and with two agents having preferred schedules $\sigma_a = (J_1, J_3, J_2)$ and $\sigma_b = (J_2, J_1, J_3)$. Both agents prefer $J_1$ to be scheduled before $J_3$. If our scheduling rule satisfied Pareto efficiency, then it would pick one of the following three schedules: $(J_1, J_3, J_2)$, $(J_1, J_2, J_3)$, or $(J_2, J_1, J_3)$. The total tardinesses of these schedules are equal to 21, 25, and 10, respectively. Yet, the total tardiness of the schedule $(J_2, J_3, J_1)$ is equal to 7.

This example can be generalized to inapproximability:
Proposition 3. For any $\alpha > 1$, there is no scheduling rule that satisfies Pareto efficiency and is $\alpha$-approximate for max-$T$ or $\Sigma$-$T$.

Proof. Let us assume, towards a contradiction, that there exists a scheduling rule $R$ that satisfies Pareto efficiency and is $\alpha$-approximate for minimizing $\Sigma$-$T$ (the proof for max-$T$ is analogous). Let $x = \lceil 3\alpha \rceil$. Consider an instance with $x + 2$ jobs: one job $J_1$ of length $x$, one job $J_2$ of length $x^2$, and $x$ jobs $J_3, \ldots, J_{x+2}$ of length 1. Let us consider two agents with preferred schedules $\sigma_1 = (J_2, J_3, \ldots, J_{x+2}, J_1)$ and $\sigma_2 = (J_1, J_2, J_3, \ldots, J_{x+2})$. For each $i \in \{3, \ldots, x+2\}$, both agents prefer job $J_2$ to be scheduled before job $J_i$. Let $\tau$ be the schedule returned by $R$. Since $R$ satisfies Pareto efficiency, for each $i \in \{3, \ldots, x+2\}$, $J_2$ is scheduled before job $J_i$ in $\tau$. Thus $\tau$ is either $\sigma_2$, or a schedule where $J_2$ is scheduled first, followed by $i$ jobs of length 1 ($i \in \{0, \ldots, x\}$), followed by $J_1$, followed by the $x - i$ remaining jobs of length 1. Let $S_i$ be such a schedule. In $S_i$, the tardiness of job $J_1$ is $x^2 + i$ (this job is in first position in $\sigma_2$), and the tardiness of the jobs of length 1 is $(x - i)x$ (the $x - i$ last jobs in $S_i$ are scheduled before $J_1$ in $\sigma_1$). Thus the total tardiness of $S_i$ is $(x^2 + i) + (x - i)x \geq x^2 + x$. The total tardiness of schedule $\sigma_2$ is $x^2 + x$ (each of the $x + 1$ jobs $J_2, J_3, \ldots, J_{x+2}$ in $\sigma_2$ finishes $x$ time units later than in $\sigma_1$). Thus, the total tardiness of $\tau$ is at least $x^2 + x$.

Let us now consider schedule $\tau'$, which does not satisfy Pareto efficiency, and which is as follows: job $J_1$ is scheduled first, followed by the jobs of length 1, followed by job $J_2$. The total tardiness of this schedule is $3x$ (the only job which is delayed compared to $\sigma_1$ and $\sigma_2$ is job $J_2$). This schedule is optimal for $\Sigma$-$T$. Thus the approximation ratio of $R$ is at least $\frac{x^2 + x}{3x} = \frac{x + 1}{3} > \alpha$. Therefore, $R$ is not $\alpha$-approximate for $\Sigma$-$T$, a contradiction.

Proposition 4.
If all jobs are unit-size, the scheduling rule $\Sigma$-$T$ is Pareto efficient.

Proof. Let us assume that there exist two jobs which are not in a Pareto order in the schedule $\sigma$ optimizing $\Sigma$-$T$. We can swap these jobs in $\sigma$, and it is apparent that such a swap does not increase the total tardiness of the schedule. We can perform such swaps until we reach a schedule which does not violate Pareto efficiency.

Pareto efficiency is one of the most fundamental properties in social choice. However, sometimes (especially in our setting) there exist reasons for violating it. For instance, even if all the agents agree that $J_x$ should be scheduled before $J_y$, the preferences of the agents with respect to other jobs might differ. Breaking Pareto efficiency can help to achieve a compromise with respect to these other jobs.

Nevertheless, Proposition 3 motivated us to formulate alternative scheduling rules based on axiomatic properties. We choose the Condorcet principle, a classic social choice property that is stronger than Pareto efficiency. We adapt it to consider the durations of jobs.

Definition 3 (Processing Time Aware (PTA) Condorcet principle). A schedule $\tau \in \mathcal{S}$ is PTA Condorcet consistent with a preference profile $\sigma = (\sigma_1, \ldots, \sigma_n) \in \mathcal{P}$ if for each two jobs, $J_k$ and $J_\ell$, it holds that $J_k \succ_\tau J_\ell$ whenever at least $\frac{p_k}{p_k + p_\ell} \cdot n$ agents put $J_k$ before $J_\ell$ in their preferred schedule. A scheduling rule $R$ satisfies the PTA Condorcet principle if for each preference profile it returns a PTA Condorcet consistent schedule, whenever such a schedule exists.

Let us explain our motivation for the ratio $\frac{p_k}{p_k + p_\ell}$. Consider a schedule $\tau$ and two jobs, $J_k$ and $J_\ell$, scheduled consecutively in $\tau$. By $N_k$ we denote the set of agents who rank $J_k$ before $J_\ell$ in their preferred schedules, and let us assume that $|N_k| > \frac{p_k}{p_k + p_\ell} n$; we set $N_\ell = N \setminus N_k$. Observe that if we swapped $J_k$ and $J_\ell$ in $\tau$, then each agent from $N_k$ would be disappointed. Since such a swap makes $J_k$ scheduled $p_\ell$ time units later than in $\tau$, the level of dissatisfaction of each agent from $N_k$ could be quantified by $p_\ell$. Thus, their total (utilitarian) dissatisfaction $\mathrm{dis}(N_k)$ could be quantified by $|N_k| \cdot p_\ell$. By an analogous argument, if we started with a schedule where $J_\ell$ is put right before $J_k$, and swapped these jobs, then the total dissatisfaction of agents from $N_\ell$ could be quantified by:

$$\mathrm{dis}(N_\ell) = |N_\ell| \, p_k < \Big( n - \frac{p_k}{p_k + p_\ell} n \Big) p_k = n \cdot \frac{p_k p_\ell}{p_k + p_\ell} < |N_k| \cdot p_\ell = \mathrm{dis}(N_k).$$

Thus, the total dissatisfaction of all agents from scheduling $J_k$ before $J_\ell$ is smaller than that from scheduling $J_\ell$ before $J_k$. Definition 3 requires that in such a case $J_k$ should indeed be scheduled before $J_\ell$.

Proposition 5 below highlights the difference between scheduling based on the tardiness and on the PTA Condorcet principle.

Proposition 5. Even if all jobs are unit-size, the $\Sigma$-$T$ rule does not satisfy the PTA Condorcet principle.

Proof. Consider an instance with three jobs and three agents with the following preferred schedules: σ = ( J , J , J ); σ = ( J , J , J ); σ = ( J , J , J ); σ = ( J , J , J ); σ = ( J , J , J ) . The only PTA Condorcet consistent schedule is ( J , J , J ) with the total tardiness of 6. At the same time, the schedule ( J , J , J ) has the total tardiness equal to 5.

To construct a PTA Condorcet consistent schedule, we propose to extend Condorcet consistent [9, 20] election rules to jobs with varying lengths. For example, we obtain:

PTA Copeland's method.
For each job $J_k$ we define the score of $J_k$ as the number of jobs $J_\ell$ such that at least $\frac{p_k}{p_k + p_\ell} \cdot n$ agents put $J_k$ before $J_\ell$ in their preferred schedule. The jobs are scheduled in the descending order of their scores.

Iterative PTA Minimax. For each pair of jobs, $J_k$ and $J_\ell$, we define the defeat score of $J_k$ against $J_\ell$ as $\max\big(0, \frac{p_k}{p_k + p_\ell} n - n_k\big)$, where $n_k$ is the number of agents who put $J_k$ before $J_\ell$ in their preferred schedule. We define the defeat score of $J_k$ as the highest defeat score of $J_k$ against any other job. The job with the lowest defeat score is scheduled first. Next, we remove this job from the preferences of the agents, and repeat (until there are no jobs left).

Other Condorcet consistent election rules, such as Dodgson's rule or Tideman's ranked pairs method, can be adapted similarly. It is apparent that they satisfy the PTA Condorcet principle.

PTA Condorcet consistency comes at a cost: e.g., the two scheduling rules violate reinforcement, even if the jobs are unit-size. Indeed, by the classic result of Young and Levenglick [28] one can infer that any rule that satisfies the PTA Condorcet principle, neutrality, and reinforcement must be a generalization of the Kemeny rule (i.e., must be equivalent to the Kemeny rule if the processing times of the jobs are equal). We conjecture that rules satisfying neutrality and reinforcement fail the PTA Condorcet principle; it is an interesting open question whether such an impossibility theorem holds.

In this section we study the computational complexity of finding collective schedules according to the previously defined rules. We start with a simple observation about the two PTA Condorcet consistent rules that we defined in the previous section.
Proposition 6.
The PTA Copeland's method and the iterative PTA minimax rule are computable in polynomial time.
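To see why both rules run in polynomial time, note that each needs only the pairwise counts of agents who put one job before another. The sketch below is one possible direct implementation, not the authors' code. The (schedule, count) profile encoding, the tie-breaking by job name, and the two-job instance (durations 9 and 1, 11 of 20 agents preferring the long job first, loosely modelled on the 55%/45% example from the introduction) are assumptions. Exact rational arithmetic avoids rounding issues in the threshold $\frac{p_k}{p_k + p_\ell} n$.

```python
from fractions import Fraction
from typing import Dict, List, Sequence, Tuple

Profile = List[Tuple[Sequence[str], int]]   # (preferred schedule, number of agents)


def count_before(profile: Profile, k: str, l: str) -> int:
    """Number of agents whose preferred schedule puts job k before job l."""
    return sum(c for order, c in profile if order.index(k) < order.index(l))


def pta_copeland(profile: Profile, p: Dict[str, int]) -> List[str]:
    """Schedule jobs by descending number of PTA pairwise wins."""
    n = sum(c for _, c in profile)
    jobs = sorted(p)

    def score(k: str) -> int:
        # k beats l iff n_k >= p_k / (p_k + p_l) * n, checked without division
        return sum(1 for l in jobs if l != k
                   and count_before(profile, k, l) * (p[k] + p[l]) >= p[k] * n)

    return sorted(jobs, key=lambda k: -score(k))   # stable: ties keep name order


def iterative_pta_minimax(profile: Profile, p: Dict[str, int]) -> List[str]:
    """Repeatedly schedule the remaining job with the lowest defeat score."""
    n = sum(c for _, c in profile)
    remaining, schedule = sorted(p), []
    while remaining:
        def defeat(k: str) -> Fraction:
            # Removing scheduled jobs does not change the relative order of the
            # remaining ones, so it suffices to compare k with remaining jobs.
            return max((max(Fraction(0),
                            Fraction(p[k] * n, p[k] + p[l]) - count_before(profile, k, l))
                        for l in remaining if l != k), default=Fraction(0))
        best = min(remaining, key=defeat)
        schedule.append(best)
        remaining.remove(best)
    return schedule


# Hypothetical two-job instance: a long job (9 units) and a short job (1 unit).
p = {"Jl": 9, "Js": 1}
profile = [(("Jl", "Js"), 11), (("Js", "Jl"), 9)]
print(pta_copeland(profile, p))            # short job first despite the majority
print(iterative_pta_minimax(profile, p))
```

Both functions perform a number of pairwise comparisons polynomial in the number of jobs, each costing time linear in the size of the profile.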
We further observe that the computational complexity of the rules which ignore the lengths of the jobs (rules based on swap costs) can be directly inferred from known results in computational social choice. For instance, the $\Sigma$-$K$ rule is simply the well-known and extensively studied Kemeny rule. Thus, in the further part of this section we focus on the rules based on delay costs.

Sum of Delay Costs

First, observe that the problem of finding a collective schedule is computationally easy for the total lateness ($\Sigma$-$L$). In fact, $\Sigma$-$L$ ignores the preferred schedules of the agents and arranges the jobs from the shortest to the longest one.

Proposition 7.
The rule Σ-L schedules the jobs in the ascending order of their lengths.

Proof. Consider the total cost of the agents:
$$\sum_{a \in N} L(\tau, \sigma_a) = \sum_{a \in N} \sum_{J_i \in \mathcal{J}} \bigl(C_i(\tau) - C_i(\sigma_a)\bigr) = |N| \sum_{J_i \in \mathcal{J}} C_i(\tau) - \sum_{a \in N} \sum_{J_i \in \mathcal{J}} C_i(\sigma_a).$$
Thus, the total cost of the agents is minimized when $\sum_{J_i \in \mathcal{J}} C_i(\tau)$ is minimal. This value is minimal when the jobs are scheduled from the shortest to the longest one.

On the other hand, minimizing the total tardiness Σ-T is NP-hard even with the unary representation of the durations of jobs. Du and Leung [10] show that minimizing total tardiness with arbitrary due dates on a single processor ($1 || \sum T_i$) is weakly NP-hard. We cannot use this result directly as the due dates in our problem Σ-T are structured and depend, among others, on jobs' durations.

Theorem 8.
The problem of finding a collective schedule minimizing the total tardiness (Σ-T) is strongly NP-hard.

Proof. We reduce from the strongly NP-hard 3-Partition problem. Let $I$ be an instance of 3-Partition. In $I$ we are given a multiset of integers $S = \{s_1, \ldots, s_{3\mu}\}$. We denote $s_\Sigma = \sum_{s \in S} s$. We ask if $S$ can be partitioned into $\mu$ triples that all have the same sum, $s_T = s_\Sigma/\mu$. Without loss of generality, we can assume that $\mu \geq$ [...] and that $\mu < s < s_T$ for each $s \in S$ (otherwise, we can add a large constant $s_\Sigma$ to each integer from $S$, which does not change the optimal solution of the instance, but which ensures that $\mu < s < s_T$ in the new instance). We also assume that the integers from $S$ are represented in unary encoding.

From $I$ we construct an instance $I'$ of the problem of finding a collective schedule that minimizes the total tardiness in the following way. For each number $s \in S$ we introduce $1 + s\mu$ jobs: $J_s$ and $\{P_{s,i,j} : i \in [s], j \in [\mu]\}$. We set the processing time of $J_s$ to $s$. Further, for each $i \in [s]$ we set the processing time of $P_{s,i,1}$ to $(s_T - s)$, and of the remaining jobs $P_{s,i,j}$, $j \geq 2$, to $s_T$. We denote the sets of all such jobs as $\mathcal{J}_S = \{J_s : s \in S\}$ and $\mathcal{P} = \{P_{s,i,j} : s \in S, i \in [s], j \in [\mu]\}$. Additionally, we introduce $\mu$ jobs, $\mathcal{X} = \{X_1, \ldots, X_\mu\}$, each having a unit processing time.

There are $s_\Sigma$ agents. For each integer $s \in S$ we introduce $s$ agents. The $i$-th agent corresponding to number $s$, denoted by $a_{s,i}$, has the following preferred schedule (in the notation below a set, e.g., $\{J_{s'}\}$, denotes that its elements are scheduled in a fixed arbitrary order):
$$\bigl(J_s, P_{s,i,1}, X_1, P_{s,i,2}, X_2, \ldots, P_{s,i,\mu}, X_\mu, \{J_{s'} : s' \neq s\}, \{P_{s',j,\ell} : (s' \neq s \text{ or } j \neq i) \text{ and } \ell \in [\mu]\}\bigr).$$

We claim that the answer to the initial instance $I$ is "yes" if and only if the schedule $\sigma^*$ optimizing the total tardiness is the following one: $(\mathcal{J}_1, X_1, \mathcal{J}_2, X_2, \ldots, \mathcal{J}_\mu, X_\mu, \mathcal{P})$, where for each $i \in [\mu]$, $\mathcal{J}_i$ is a set consisting of jobs from $\mathcal{J}_S$ with lengths summing up to $s_T$ (see Figure 1). If such a schedule exists, then the answer to $I$ is "yes". Below we will prove the other implication.

Figure 1: The preferred schedule $\sigma^{(s,i)}$ of agent $a_{s,i}$ (top) and the optimal schedule (bottom).

Observe that any job from $\mathcal{J}_S$ should be scheduled before each job from $\mathcal{P}$. Indeed, for each pair $P_{s,i,j}$ and $J_{s'}$ only a single agent $a = a_{s,i}$ ranks $P_{s,i,j}$ before $J_{s'}$; at the same time there exists another agent $a' = a_{s',k}$ who ranks $J_{s'}$ first. As $J_{s'}$ is shorter than $P_{s,i,j}$, $a'$ gains more from $J_{s'}$ scheduled before $P_{s,i,j}$ than $a$ gains from $P_{s,i,j}$ scheduled before $J_{s'}$. Thus, if $P_{s,i,j}$ were scheduled before $J_{s'}$, we could swap these two jobs and improve the schedule (such a swap could only improve the completion times of other jobs since $J_{s'}$ is shorter than $P_{s,i,j}$).

By a similar argument, any job from $\mathcal{X}$ should be scheduled before each job from $\mathcal{P}$. Indeed, if this were not the case, then there would exist jobs $P = P_{s,i,j}$ and $X = X_{i'}$ such that $P$ is scheduled right before $X$ (this follows from the reasoning given in the previous paragraph: a job from $\mathcal{J}_S$ cannot be scheduled after a job from $\mathcal{P}$). Also, since all the jobs from $\mathcal{J}_S$ are scheduled before $P$, the completion time of $X$ would be at least $s_\Sigma + s_T + 1 \geq s_\Sigma + \mu + 2$. For each agent, the completion time of $X$ in their preferred schedule is at most equal to $\mu(s_T + 1) = s_\Sigma + \mu$.
Thus, if we swap $X$ and $P$, the improvement of the tardiness due to scheduling $X$ earlier would be at least equal to $2 s_\Sigma$. Such a swap increases the completion time of $P$ only by one, so the increase of the tardiness due to scheduling $P$ later would be at most equal to $s_\Sigma$. Consequently, a swap would decrease the total tardiness, and so $X$ could not have been scheduled after $P$ in $\sigma^*$.

We further investigate the structure of an optimal schedule $\sigma^*$. We know that in $\sigma^*$ every job from $\mathcal{J}_S$ and every job from $\mathcal{X}$ precedes every job from $\mathcal{P}$, but we do not yet know the optimal order of jobs from $\mathcal{J}_S \cup \mathcal{X}$. Before proceeding further, we introduce one useful class of schedules, $\mathcal{T}$, that execute jobs in the order $(\mathcal{J}_S, \mathcal{X}, \mathcal{P})$. Observe that $\sigma^*$ can be constructed starting from some schedule $\tau \in \mathcal{T}$ and performing a sequence of swaps, each swap involving a job $J \in \mathcal{J}_S$ and a job $X \in \mathcal{X}$. The tardiness of $\sigma^*$ is equal to the tardiness of the initial $\tau$ adjusted by the changes due to the swaps. Below, we further analyze $\mathcal{T}$.

First, any ordering of $\mathcal{J}_S$ in $\tau$ results in the same tardiness. Indeed, consider two jobs $J_s$ and $J_{s'}$ such that $J_{s'}$ is scheduled right after $J_s$. If we swap $J_s$ and $J_{s'}$, then the total tardiness of $s$ agents increases by $s'$ and the total tardiness of $s'$ agents decreases by $s$. In effect, the total tardiness of all agents remains unchanged. Second, there exists an optimal schedule in which the jobs from $\mathcal{X}$ appear in the relative order $X_1, X_2, \ldots, X_\mu$. Thus, w.l.o.g., we constrain $\mathcal{T}$ to schedules in which the jobs from $\mathcal{X}$ are put in exactly this order.

Since we have shown that all schedules in $\mathcal{T}$ have the same tardiness, no matter how we arrange the jobs from $\mathcal{J}_S$, the tardiness of $\sigma^*$ depends only on the change of the tardiness due to the swaps. Consider the job $X_1$, and consider what happens if we swap $X_1$ with a number of jobs from $\mathcal{J}_S$ so that eventually $X_1$ is scheduled at time $s_T$ (its start time in all preferred schedules).
In such a case, moving $X_1$ forward decreases the tardiness of each of the $s_\Sigma$ agents by $(s_\Sigma - s_T)$. Moving $X_1$ forward to $s_T$ requires, however, delaying some jobs from $\mathcal{J}_S$. Assume that the jobs from $\mathcal{J}_S$ with the processing times $s_{i_1}, \ldots, s_{i_\ell}$ are delayed. Each such job needs to be scheduled one time unit later. Thus, the total tardiness of $s_{i_1}$ agents increases by 1 (the agents who had this job as the first in their preferred schedule), of other $s_{i_2}$ agents increases by 1, and so on. Since $s_{i_1} + \ldots + s_{i_\ell} = s_\Sigma - s_T$, the total tardiness of all agents increases by $s_\Sigma - s_T$. Thus, in total, executing $X_1$ at $s_T$ decreases the total tardiness by $s_\Sigma (s_\Sigma - s_T) - (s_\Sigma - s_T)$, a positive number. Also, observe that this value does not depend on how the jobs from $\mathcal{J}_S$ were initially arranged, provided that $X_1$ can be put so that it starts at $s_T$.

Starting $X_1$ earlier than $s_T$ does not improve the tardiness of $X_1$, yet it increases the tardiness of some other jobs, so it is suboptimal. By repeating the same reasoning for $X_2, \ldots, X_\mu$ we infer that we obtain the optimal decrease of the tardiness when $X_1$ is scheduled at time $s_T$, $X_2$ at time $2 s_T + 1$, etc., and if there are no gaps between the jobs. However, such a schedule can be obtained if and only if the answer to the initial instance of 3-Partition is "yes".

A similar strategy (yet, with a more complex construction) can be used to prove the NP-hardness of Σ-U.

Theorem 9.
The problem of finding a collective schedule minimizing the total number of late jobs (Σ-U) is strongly NP-hard.

Proof. We give a reduction from the strongly NP-hard 3-Partition problem. Let $I$ be an instance of 3-Partition. In $I$ we are given a multiset of $3\mu$ integers $S = \{s_1, \ldots, s_{3\mu}\}$. As in the proof of Theorem 8, we set $s_\Sigma = \sum_{s \in S} s$. In $I$ we ask if $S$ can be partitioned into $\mu$ triples that all have the same sum, $s_T = s_\Sigma/\mu$. We assume that $s < s_T$ for each $s \in S$, that $\mu > 4$, and that the integers from $S$ are represented in unary encoding.

From $I$ we construct an instance $I'$ of the problem of finding a collective schedule that minimizes the total number of late jobs in the following way. For each number $s \in S$ we introduce the following jobs:
• a job $F_s$ of length $s$;
• $s\mu$ jobs of length $s_T - s$; we denote this set as $\mathcal{R}_s = \{R_{s,i,j} : i \in [s], j \in [\mu]\}$;
• $\mu(\mu - 1)s$ jobs of length $s_T$; we denote this set as $\mathcal{P}_s = \{P_{s,i,j,k} : i \in [s], j \in [\mu], k \in [\mu - 1]\}$.

Let $\mathcal{J}$ be the set of all the jobs. Further, we set $\mathcal{F} = \{F_s : s \in S\}$, $\mathcal{R} = \bigcup_{s \in S} \mathcal{R}_s$, and $\mathcal{P} = \bigcup_{s \in S} \mathcal{P}_s$. Additionally, we introduce $\mu$ jobs, $\mathcal{X} = \{X_1, \ldots, X_\mu\}$, each having a unit length, and a job $L$ of length $3 s \mu s_T$ (thus, the length of $L$ is larger than the length of all the jobs of $\mathcal{J} \setminus \{L\}$).

There are $\mu s_\Sigma$ agents in total. For each number $s \in S$ we introduce $s\mu$ agents. Let $A_s$ be the set of these agents. We partition $A_s$ into $\mu$ sets of $s$ agents: $A_{s,1}, \ldots, A_{s,\mu}$. Figure 2 represents the preferred schedule of the $j$-th agent from $A_{s,i}$ ($i \in [\mu]$, $j \in [s]$). For all the agents, job $X_i$ ($i \in [\mu]$) starts at time $i s_T + i - 1$, and job $L$ starts at time $D = s_\Sigma + \mu = \mu(s_T + 1)$. Further, for all the agents of $A_{s,i}$, job $F_s$ starts at time $i s_T + (i - 1) - s$ (i.e., for these agents, $F_s$ is scheduled just before job $X_i$). Further, in this schedule job $R_{s,i,j}$ is put just before job $F_s$, at time $(i - 1)(s_T + 1)$, and job $P_{s,i,j,k}$ ($k \in [\mu - 1]$) is scheduled at time $(k - 1)(s_T + 1)$ if $k < i$, and at time $k(s_T + 1)$ if $k \geq i$. All the other jobs are scheduled after job $L$, i.e., at soonest at time $D + p_L$. Let us arbitrarily label the agents from 0 to $\mu s_\Sigma - 1$. The jobs of Agent $i$ which are not already scheduled before $D + p_L$ are scheduled in an arbitrary order after $D + p_L$, except that the $2\mu -$ [...] $\mathcal{P}$ which are scheduled before $D$ in the preferred schedule of Agent $(i + 1 \bmod \mu s_\Sigma)$, followed by the jobs of $\mathcal{P}$ which are scheduled before $D$ in the preferred schedule of Agent $(i + 2 \bmod \mu s_\Sigma)$. This will ensure that each job of $\mathcal{P}$ appears only twice in the ($2\mu -$ [...] $\mathcal{P}$, only one agent schedules it before $D$).

Figure 2: Preferred schedule of the $j$-th agent of $A_{s,i}$.

We will now show that the answer to the 3-Partition problem on instance $I$ is "yes" if and only if the optimal schedule for Σ-U on $I'$ starts as follows: $(\mathcal{F}_1, X_1, \mathcal{F}_2, X_2, \ldots, \mathcal{F}_\mu, X_\mu)$, where each set $\mathcal{F}_i$ consists of jobs from $\mathcal{F}$ with lengths summing up to $s_T$.

If the schedule for Σ-U on $I'$ starts as follows: $(\mathcal{F}_1, X_1, \mathcal{F}_2, X_2, \ldots, \mathcal{F}_\mu, X_\mu)$, where each set $\mathcal{F}_i$ ($i \in [\mu]$) consists of jobs from $\mathcal{F}$ with lengths summing up to $s_T$, then the solution of 3-Partition is "yes", since the length of each job of $\mathcal{F}$ is a number from $S$. Let us now assume that the solution of 3-Partition on instance $I$ is "yes". We will show that the optimal solution of Σ-U on instance $I'$ indeed starts with $(\mathcal{F}_1, X_1, \mathcal{F}_2, X_2, \ldots, \mathcal{F}_\mu, X_\mu)$.

Let us consider an optimal schedule $\sigma^*$ for $I'$. First, we will show that in $\sigma^*$, each job $X_i$ ($i \in [\mu]$) is scheduled at latest at time $i s_T + (i - 1)$ [...] $X_i$ is completed at time $i s_T + (i - 1)$ [...] $\sigma^*$, $X_i$ would not be completed at latest at time $i s_T + (i - 1)$ [...] $T \in (\mathcal{J} \setminus \mathcal{X})$ is scheduled before $X_i$. In this case, swapping $T$ and $X_i$ would not increase the number of late jobs. This is the case because scheduling $X_i$ before $i s_T + (i - 1)$ decreases the number of late jobs by $\mu s_\Sigma$ (this is the number of agents), and the length of $X_i$ is smaller than that of $T$, so the swap of $X_i$ and $T$ will not delay jobs other than $T$.

Second, we will show that in $\sigma^*$, job $L$ is scheduled in the last position. For this we consider two cases:

1. Let us first consider what happens if this job is not late, i.e., if it is scheduled in $\sigma^*$ at latest at time $D$. Let us now look at the $2\mu -$ [...] $\sigma^*$. Each of these jobs is late for at least $\mu s_\Sigma -$ [...] $\mathcal{P}$, only two agents have it in one of their $2\mu -$ [...] $s_T$). Thus the total number of late jobs is at least $(\mu s_\Sigma -$ [...] $\mu -$ [...]

[...] $L$ is late in $\sigma^*$: it is scheduled after time $D$. In this case, it will be late for all the agents, so we can assume that it is scheduled in the last position of $\sigma^*$. Thus, all the jobs of $\mathcal{J} \setminus \{L\}$ are scheduled before $D + p_L$ in $\sigma^*$ (this is true since the length of $L$ is larger than the total length of the jobs of $\mathcal{J} \setminus \{L\}$). Since each job of $\mathcal{P} \cup \mathcal{R}$ appears only once before $D + p_L$ in the preferred schedules of the agents, each job of $\mathcal{P} \cup \mathcal{R}$ will be late for at most one agent: the number of jobs of $\mathcal{P} \cup \mathcal{R}$ which will be late is thus at most the number of jobs of $\mathcal{P} \cup \mathcal{R}$, that is, $\mu^2 s_\Sigma$. The number of jobs of $\mathcal{F}$ which are scheduled before $D + p_L$ in the preferred schedules of the agents is $\mu s_\Sigma$ (indeed, for each $s \in S$, job $F_s$ appears for $s$ agents just before job $X_i$, with $i \in [\mu]$). Thus, the number of jobs from $\mathcal{F}$ which will be late in $\sigma^*$ is at most equal to $\mu s_\Sigma$. Job $L$ is late for all the $\mu s_\Sigma$ agents, and we have already seen that the jobs of $\mathcal{X}$ are not late in $\sigma^*$. Therefore, the total number of jobs which will be late in $\sigma^*$ is at most $(\mu + 2)\mu s_\Sigma$. This is smaller than the lower bound on the number of late jobs if $L$ is not late in $\sigma^*$ (this lower bound was $(\mu s_\Sigma -$ [...] $\mu -$ [...] $\mu >$ [...] $\sigma^*$, $L$ is scheduled in the last position.

Third, we infer that the jobs of $\mathcal{P} \cup \mathcal{R}$ are scheduled in $\sigma^*$ at soonest at time $D$. If this was not the case, a job of $\mathcal{F}$ would be scheduled after time $D$. We argue that by swapping this job with a job of $\mathcal{P} \cup \mathcal{R}$ scheduled before $D$ we would not increase the number of late jobs. Indeed, if a job of $\mathcal{P} \cup \mathcal{R}$ is completed at latest at time $D$ then it will be, in the best case, scheduled on time for all the agents, whereas if it is completed after time $D$ (but before $D + p_L$) it will be late for (only) one agent. Moving forward a job $F_s \in \mathcal{F}$ which is completed after time $D$ so that it is now completed at latest at time $D -$ [...] $F_s$ is shorter than any job of $\mathcal{P} \cup \mathcal{R}$) will decrease the number of late jobs by at least $s \geq$ [...] $s$ agents have this job completed at time $D -$ [...] $F_s$ is shorter than any job of $\mathcal{P} \cup \mathcal{R}$, then doing such a swap does not make any other job late.

We have seen that in $\sigma^*$, the jobs scheduled before $D$ are the jobs of $\mathcal{X} \cup \mathcal{F}$, and that the jobs of $\mathcal{X}$ are not late (i.e., $X_i$ is scheduled at latest at time $i s_T + i - 1$ [...] $s \in S$, job $F_s$ is completed for $s$ agents at time $s_T$, for $s$ agents at time $2 s_T + 1$, and so forth (for each $i \in [\mu]$ it is completed $s$ times in $i s_T + (i - 1)$ [...] $\sigma^*$, job $F_s$ is completed at latest at time $s_T$, it will not be late for any agent; if it is completed after $s_T$ but at latest at time $2 s_T + 1$, it will be late for $s$ agents, and so forth. For each $i \in [\mu]$ we define Slot $i$ as the time interval $[(i - 1) s_T + i - 1,\ i s_T + i - 1]$ [...] $F_s$ is completed in Slot $i$ ($i \in [\mu]$), it will be late for $(i - 1)s$ agents. Since the length of $F_s$ is $s$, it means that each unit of a job $F \in \mathcal{F}$ such that $F$ is completed in Slot $i$ adds $(i - 1)$ to the number of late jobs. Let $U_i$ be the sum of the lengths of jobs of $\mathcal{F}$ completed in Slot $i$. The number of late jobs of $\mathcal{F}$ is then $\sum_{i=1}^{\mu} U_i (i - 1)$ [...] $U_1$, the total length of jobs completed in Slot 1, should be as large as possible, and then, for the remaining jobs, $U_2$ should be as large as possible, and so forth. Since $X_i$ has to be scheduled at latest at time $i s_T + (i - 1)$ [...] $i + 1$, the number of late jobs is minimized if $X_i$ is scheduled exactly at time $i s_T + (i - 1)$ [...] $\mathcal{F}$ in the slots between the jobs of $\mathcal{X}$. Since we have assumed that there is a "yes" solution to the 3-Partition problem on instance $I$, it is possible to partition the jobs of $\mathcal{F}$ into triples with lengths summing up to $s_T$. Thus, each such triple will correspond to a triple of jobs in the same slot. Hence the optimal solution of Σ-U starts as follows: $(\mathcal{F}_1, X_1, \mathcal{F}_2, X_2, \ldots, \mathcal{F}_\mu, X_\mu)$, where each set $\mathcal{F}_i$ consists of jobs from $\mathcal{F}$ with lengths summing up to $s_T$. This completes the proof.

Nonetheless, if the jobs have the same size, the problem can be solved in polynomial time (highlighting the additional complexity brought by the main element of the collective scheduling). Our proof uses the idea of Dwork et al. [12] who proved an analogous result for the Spearman distance.

Proposition 10.
If all jobs have the same size, for each delay cost $f \in \{T, U, L, E, D, SD\}$ the rule Σ-f can be computed in polynomial time.

Proof. Let us fix $f \in \{T, U, L, E, D, SD\}$. We reduce the problem of finding a collective schedule to the assignment problem. Observe that when all the jobs have the same size, say $p$, then in the optimal schedule each job should be started at time $\ell p$ for some $\ell \in \{0, \ldots, m - 1\}$. Thus, we construct a bipartite graph where the vertices on one side correspond to the $m$ jobs and the vertices on the other side to the $m$ possible starting times of these jobs. The edge between a job $J_i$ and a starting time $\ell p$ has a cost equal to the total cost caused by job $J_i$ being scheduled to start at time $\ell p$. This cost can be computed independently of how the other jobs are scheduled: a job started at time $\ell p$ completes at time $\ell p + p$, so the cost is equal to $\sum_{a \in N} f\bigl(\ell p + p, C_i(\sigma_a)\bigr)$. Thus, a schedule that minimizes the total cost corresponds to an optimal assignment of the $m$ jobs to their $m$ slots. Such an assignment can be found in polynomial time, e.g., by the Hungarian algorithm.

We conclude this section by observing that the hardness of computing the Σ-K and Σ-S rules can be deduced from the hardness of computing Kemeny rankings [12].

Proposition 11.
Computing Σ-K and Σ-S is NP-hard even for $n = 4$ agents and when all jobs have the same unit size.

$L_p$-norm of Delay Costs, $p > 1$

We start by observing that the general case is hard even for two agents. The proof of the theorem below also works for $p = \infty$, i.e., for max-{T, E, D}.

Theorem 12.
For each $p > 1$, finding a schedule returned by $L_p$-{T, E, D} is NP-hard, even for two agents.

Proof. Let us fix $p > 1$. We show a reduction from Partition. In Partition we are given a set of integers $S = \{s_1, \ldots, s_n\}$, $s_i < s_{i+1}$, and we ask whether $S$ can be partitioned into two sets $S_a$ and $S_b$ that have the same sum, $s = \frac{1}{2} \sum_i s_i$.

We construct an instance of the problem of finding an optimal collective schedule according to $L_p$-T as follows (our construction is inspired by Agnetis et al. [1]). For each $s_i \in S$ we introduce two jobs $J_i^{(a)}$ and $J_i^{(b)}$, both with length $p_i^{(a)} = p_i^{(b)} = s_i$. We have two agents, $a$ and $b$. Both agents prefer a schedule executing jobs in order of their increasing lengths (an SPT schedule). For each pair of jobs $(J_i^{(a)}, J_i^{(b)})$ with equal lengths, agent $a$ prefers $J_i^{(a)}$ to $J_i^{(b)}$, while agent $b$ prefers $J_i^{(b)}$ to $J_i^{(a)}$. Thus, the preferred schedule $\sigma_a$ of agent $a$ is $(J_1^{(a)}, J_1^{(b)}, J_2^{(a)}, J_2^{(b)}, \ldots, J_n^{(a)}, J_n^{(b)})$, while $\sigma_b$ is $(J_1^{(b)}, J_1^{(a)}, J_2^{(b)}, J_2^{(a)}, \ldots, J_n^{(b)}, J_n^{(a)})$. We ask whether there exists a schedule with a cost of $\sqrt[p]{2 s^p}$.

Assume there exists a partition of $S$ into two disjoint sets $S_a$, $S_b$ where $\sum_{v \in S_a} v = \sum_{v \in S_b} v = s$. We construct an SPT schedule $\sigma = (J_1^{(\cdot)}, J_1^{(\cdot)}, J_2^{(\cdot)}, J_2^{(\cdot)}, \ldots, J_n^{(\cdot)}, J_n^{(\cdot)})$. For each pair of jobs $\{J_i^{(a)}, J_i^{(b)}\}$ of equal length, if $s_i \in S_a$, the jobs are scheduled in order $(J_i^{(a)}, J_i^{(b)})$; otherwise (i.e., $s_i \in S_b$) in order $(J_i^{(b)}, J_i^{(a)})$. For each pair of jobs, the order $(J_i^{(a)}, J_i^{(b)})$ increases agent $b$'s tardiness by $p_i^{(a)} = s_i$, while agent $a$'s tardiness is not increased. Similarly, the order $(J_i^{(b)}, J_i^{(a)})$ increases agent $a$'s tardiness by $p_i^{(b)} = s_i$. Consequently, $T(\sigma, \sigma_a) = \sum_{v \in S_b} v = s$ and $T(\sigma, \sigma_b) = \sum_{v \in S_a} v = s$, and thus the total cost is $\sqrt[p]{2 s^p}$.

Assume there is a schedule $\sigma$ with the total cost of $\sqrt[p]{2 s^p}$. We first show it has to be an SPT schedule.
For the sake of contradiction, assume that a non-SPT schedule $\sigma'$ has the minimal cost. Pick two jobs $J_i$, $J_j$ scheduled in a non-SPT order in $\sigma'$: $J_i$ is scheduled before $J_j$, but $p_i > p_j$. If we switch the order of these jobs, the jobs $J_k$ executed between $J_i$ and $J_j$ complete earlier. Moreover, as both agents prefer $J_j$ to $J_i$, the tardiness measure $T$ drops (the jobs $J_k$ are less late in the switched schedule). Thus, the switched schedule has a lower cost, which contradicts the assumption that $\sigma'$ is optimal.

Next, observe that the sum of agents' tardiness measures is at least $2s$. Consider an SPT schedule. For each pair of jobs $(J_i^{(a)}, J_i^{(b)})$, if $J_i^{(a)}$ is scheduled before $J_i^{(b)}$, $T(\sigma, \sigma_b)$ is increased by $p_i$; otherwise, $T(\sigma, \sigma_a)$ is increased by $p_i$. As $T(\sigma, \sigma_a) + T(\sigma, \sigma_b) = 2s$, by the convexity of the $L_p$-norm (with $p > 1$), a schedule with a cost of $\sqrt[p]{2 s^p}$ has to be a schedule with the minimal cost. Further, this cost is equal to $\sqrt[p]{2 s^p}$ if and only if the total tardiness of agents $a$ and $b$ are equal.

Now, observe that the order of each pair of jobs with equal length $\{J_i^{(a)}, J_i^{(b)}\}$ defines the partition: the order $(J_i^{(b)}, J_i^{(a)})$ corresponds to $s_i$ in $S_a$; the order $(J_i^{(a)}, J_i^{(b)})$ corresponds to $s_i$ in $S_b$. As $T(\sigma, \sigma_a) = T(\sigma, \sigma_b) = s$, we get $\sum_{s_i \in S_a} s_i = \sum_{s_i \in S_b} s_i = s$.

Moreover, as shown below, max-{T, E, D, SD} is NP-hard even for unit-size jobs.

Theorem 13.
For each delay cost $f \in \{T, E, D, SD\}$, finding a schedule returned by max-f is NP-hard, even for unit-size jobs.

Proof. We reduce from ClosestString, which is NP-hard even for the binary alphabet. Let $I$ be an instance of ClosestString with the binary alphabet. In $I$ we are given a set of $n$ binary strings, each of length $m$, and an integer $d$; we ask if there exists a "central string" with the maximum Hamming distance to the input strings no greater than $d$.

From $I$ we construct an instance $I'$ of max-f collective schedule in the following way. We have $2m$ jobs: for each $i \in [m]$ we introduce two jobs, $J_i^{(a)}$ and $J_i^{(b)}$. For each input string $s$ we introduce one agent: the agent puts a job $J_i^{(\cdot)}$ before $J_j^{(\cdot)}$ in her preferred schedule whenever $i < j$. Further, she puts $J_i^{(a)}$ before $J_i^{(b)}$ if $s$ has "one" in the $i$-th position, and $J_i^{(b)}$ before $J_i^{(a)}$ otherwise.

Let us call a schedule where $J_i^{(\cdot)}$ is put before $J_j^{(\cdot)}$ whenever $i < j$ a regular schedule. We consider the schedule $\sigma^*$ returned by max-f, and we show that this schedule is regular (or that it can be transformed into a regular schedule of the same cost). Assume that in $\sigma^*$ there are two jobs $J_i^{(\cdot)}$ and $J_j^{(\cdot)}$ such that $J_j^{(\cdot)}$ is scheduled before $J_i^{(\cdot)}$ whereas $i < j$. Swapping $J_j^{(\cdot)}$ with $J_i^{(\cdot)}$ changes only the completion times of $J_j^{(\cdot)}$ and $J_i^{(\cdot)}$ (as jobs are unit-size). By case analysis on both jobs' positions relative to $2i$ and $2j$ (6 cases, as $j$ is before $i$), for any $f \in \{T, E, D, SD\}$, swapping these jobs does not increase $f$. Thus, if $\sigma^*$ is not regular, we can transform it into a regular schedule as follows: by swapping $J_1^{(\cdot)}$ with another job $J_k^{(\cdot)}$ (if $J_1^{(\cdot)}$ is not at position 1 or 2, whereas $J_k^{(\cdot)}$, with $k > 1$, is at one of these positions), we do not increase the cost $f$ of the schedule, and thus we obtain a schedule where the jobs $J_1^{(\cdot)}$ are at their regular positions. We continue with at most $2m$ such swaps for the remaining positions $i \in [m]$, ending up with a regular schedule.

Let us now consider that $f = T$ (resp. $f = E$). Observe that if we put $J_i^{(a)}$ before $J_i^{(b)}$ in a regular schedule, then we increase the tardiness (resp. earliness) of each agent having "zero" in the $i$-th position by one. Conversely, if we schedule $J_i^{(b)}$ before $J_i^{(a)}$, then we increase the tardiness (resp. earliness) of agents having "one" in the $i$-th position by one. Thus, a (regular) collective schedule corresponds to a "central string": $J_i^{(a)}$ scheduled before $J_i^{(b)}$ in a collective schedule corresponds to a central string having "one" in the $i$-th position, and $J_i^{(b)}$ scheduled before $J_i^{(a)}$ corresponds to "zero". With such an interpretation, the max-T (resp. max-E) of a regular schedule is simply the maximum Hamming distance to the input strings. Consequently, we get that the answer to the initial instance $I$ is "yes" iff the optimal solution for $I'$ is a schedule with max-T (resp. max-E) not larger than $d$.

When $f = D$ (resp. $f = SD$), the principle of the proof is the same: $J_i^{(a)}$ before $J_i^{(b)}$ in a regular schedule increases the deviation (resp. squared deviation) of each agent having "zero" in the $i$-th position by two, since one job of the pair completes one time unit late and the other one time unit early. Conversely, if we schedule $J_i^{(b)}$ before $J_i^{(a)}$, then we increase the deviation (resp. squared deviation) of agents having "one" in the $i$-th position by two. Consequently, we get that the answer to the initial instance $I$ is "yes" iff the optimal solution for $I'$ is a schedule with max-D (resp. max-SD) not larger than $2d$.
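The correspondence between regular schedules and central strings used in this proof can be checked on small instances. The Python sketch below is only an illustration: the helper names are hypothetical, and it assumes that tardiness is the sum over jobs of $\max(0, C_j(\tau) - C_j(\sigma))$ and that deviation is the sum over jobs of $|C_j(\tau) - C_j(\sigma)|$. Under these assumptions, each position where the central string and an input string differ contributes one unit of tardiness and two units of deviation.

```python
from itertools import product


def schedule_from_string(bits):
    """Regular schedule encoding a binary string: job ("a", i) precedes
    job ("b", i) iff bits[i] == "1"; pairs appear in index order."""
    order = []
    for i, bit in enumerate(bits):
        pair = [("a", i), ("b", i)] if bit == "1" else [("b", i), ("a", i)]
        order.extend(pair)
    return order


def completion_times(order):
    # Unit-size jobs, no idle time: the job at 0-based position q ends at q + 1.
    return {job: q + 1 for q, job in enumerate(order)}


def tardiness(tau, sigma):
    # Assumed definition: sum over jobs of max(0, C_j(tau) - C_j(sigma)).
    c_tau, c_sigma = completion_times(tau), completion_times(sigma)
    return sum(max(0, c_tau[j] - c_sigma[j]) for j in c_tau)


def deviation(tau, sigma):
    # Assumed definition: sum over jobs of |C_j(tau) - C_j(sigma)|.
    c_tau, c_sigma = completion_times(tau), completion_times(sigma)
    return sum(abs(c_tau[j] - c_sigma[j]) for j in c_tau)


def hamming(u, v):
    return sum(x != y for x, y in zip(u, v))


def correspondence_holds(strings):
    """Brute-force check over every candidate central string."""
    m = len(strings[0])
    prefs = [schedule_from_string(s) for s in strings]
    for cand in map("".join, product("01", repeat=m)):
        tau = schedule_from_string(cand)
        worst = max(hamming(cand, s) for s in strings)
        if max(tardiness(tau, p) for p in prefs) != worst:
            return False
        if max(deviation(tau, p) for p in prefs) != 2 * worst:
            return False
    return True
```

Calling `correspondence_holds(["0011", "0101", "1110"])` enumerates all 16 candidate central strings of length 4 and compares the max-T and max-D costs of the corresponding regular schedules with the maximum Hamming distance. The enumeration is exponential in $m$ and is meant only as a sanity check of the reduction, not as a solution method.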
The goal of our experimental evaluation is threefold: first, to demonstrate that, while most of the problems are NP-hard, an Integer Linear Programming (ILP) solver finds optimal solutions for instances of reasonable size; second, to quantitatively characterize the impact of collective scheduling compared to the base social choice methods; third, to compare schedules built with different approaches (cost functions and axioms). We use tardiness $T$ as a representative cost function: it is NP-hard in both the Σ and the max aggregations, and it is easy to interpret.

Settings.
A single experimental scenario is described by a profile with preferred schedules of the agents and by a maximum length of a job, $p_{\max}$. We instantiate the preferred schedules of agents using PrefLib [22]. We treat PrefLib's candidates as jobs. We use datasets where the agents have strict preferences over all candidates. We restrict ourselves to datasets with both a large number of candidates and a large number of agents: we take two datasets on AGH course selection (agh1 with 9 candidates and 146 agents; and agh2 with 7 candidates and 153 agents) and the sushi dataset with 10 candidates and 5000 agents. Additionally, we generate preferences using the Mallows [21] model (mallows) and Impartial Culture (impartial), both with 10 candidates and 500 agents. We use three different values for $p_{\max}$: 10, 20 and 50. For each experimental scenario we generate 100 instances; in each instance we pick the lengths of the jobs uniformly at random between 1 and $p_{\max}$ (in separate series of experiments we used exponential and normal distributions; we found trends similar to the ones discussed below). For each scenario, we present averages and standard deviations over these 100 instances.

Computing Optimal Solutions.
We use a standard ILP encoding: for each pair of jobs $(i, j)$, we introduce two binary variables $\mathit{prec}_{i,j}$ and $\mathit{prec}_{j,i}$ denoting precedence: $\mathit{prec}_{i,j} = 1$ iff $i$ precedes $j$ in the schedule. We require $\mathit{prec}_{i,j} + \mathit{prec}_{j,i} = 1$ and, to guarantee transitivity of $\mathit{prec}$, for each triple $i, j, k$, we have $\mathit{prec}_{i,j} + \mathit{prec}_{j,k} - \mathit{prec}_{i,k} \leq 1$. [...] agh instance takes, on average, less than a second to solve, while a sushi instance takes roughly 20 seconds. In a separate series of experiments, we analyze the runtime on impartial instances as a function of the number of jobs and the number of voters. An instance with 20 jobs and 500 voters takes 8 seconds with the Σ-T goal and two minutes with the max-T goal. An instance with 10 jobs and 5000 voters takes 8 seconds with the Σ-T goal and 28 seconds with the max-T goal. Finally, an instance with 20 jobs and 5000 voters takes 23 seconds with Σ-T and 20 minutes with max-T. For 30 jobs, the solver does not finish in 60 minutes. Running times thus depend primarily on the number of jobs and on the goal. We conclude that, while the problem is strongly NP-hard, it can be solved in practice for thousands of voters and up to 20 jobs. We consider these running times to be satisfactory: first, for a population it might be difficult to meaningfully express preferences for dozens of jobs [23] (therefore, the decision maker would probably combine jobs before eliciting preferences); second, gathering preferences takes non-negligible time; and, finally, in our motivating examples (public works, lecture hall) individual jobs last hours to weeks.

Analysis of the Results.
First, we analyze a job's rank as a function of its length. We compute a reference collective schedule for an instance with the same agents' preferences, but unit-size jobs (it thus corresponds to the classic preference aggregation problem with the Σ-T or max-T goal). We then compute and analyze the collective schedules. Over 100 instances, as jobs' durations are assigned randomly, jobs of every duration should appear in, roughly, all positions of the preferred schedules. Thus, on average, short jobs should be executed earlier, and long jobs later than in the reference schedule (in contrast, in any single experiment, if a large majority puts a short job at the end of their preferred schedules, the job is not automatically advanced). To confirm this hypothesis, for each instance and each job we compare its position to the position in the reference schedule. Figure 3 shows the average position change as a function of the job lengths. In collective schedules, short jobs (e.g., of size 1) are advanced, on average, 2-4 positions in the schedule, compared to schedules corresponding to the standard preference aggregation problem. The experiments thus confirm that the lengths of the jobs have a profound impact on the schedule.

[Figure 3 panels: Σ-T and max-T; vertical axis: Δ position; series: AGH, IMPARTIAL, MALLOWS, SUSHI.]
Figure 3: The average change in jobs' position. A point $(x, y)$ in the plot denotes that a job of length $x$ is on average scheduled $y$ positions later than when we ignore jobs' durations. $p_{\max} = 10$ ($p_{\max} = 20$ and $p_{\max} = 50$ show very similar trends).

Dataset      PTA C. Paradox      PTA Copeland ·/·      ΔGini
             Σ-T     max-T       Σ-T     max-T
agh1         6%      15%         1.03    1.23          0.07
agh2         5%      18%         1.03    1.28          0.12
sushi        7%      24%         1.02    1.22          0.06
impartial    3%      8%          1.00    1.01          0.00
mallows      10%     24%         1.03    1.21          0.08

Table 1: "PTA C. Paradox" gives the mean frequencies of violating the PTA Condorcet principle for optimal solutions for Σ-T and max-T. "PTA Copeland ·/·" denotes the ratio of sum/max T for PTA Copeland's schedule to their optimums. "ΔGini" shows the average of differences in the Gini indices: Gini(max-T) - Gini(Σ-T).

Second, we check how frequent PTA-Condorcet paradoxes are. For each instance, we counted how many out of $\binom{m}{2}$ job pairs are scheduled in a non-PTA-Condorcet consistent order. Table 1 shows that both Σ-T and max-T often violate the PTA Condorcet principle. Table 1 also shows the average ratio between the (Σ and max) tardiness of schedules returned by the PTA Copeland's rule and the tardiness of the corresponding optimal schedules. These ratios are small: roughly 3% degradation for Σ and 24% for max.
Thus, though PTA Copeland's rule does not explicitly optimize max-T and Σ-T, on average it returns schedules close to optimal for these criteria.

Third, we analyze how fair Σ-T and max-T are. We analyzed Gini indices of the vectors of agents' tardiness. Table 1 shows that, interestingly, Σ-T is fairer (smaller average Gini index), even though max-T seemingly cares more about less satisfied agents. Yet, the focus of max-T on the worst-off agent makes it effectively ignore all the remaining agents, increasing the societal inequality.

The principal contribution of this paper is conceptual: we introduce the notion of the collective schedule. We believe that collective scheduling addresses natural problems involving jobs or events having diverse impacts on the society. Such problems do not fit well into existing scheduling models. We demonstrated how to formalize the notion of the collective schedule by extending well-known methods from social choice. While collective scheduling is closely related to preference aggregation, these methods have to be extended to take into account lengths of jobs. Notably, we proposed to judge the quality of a collective schedule by comparing the jobs' completion times between the collective and the agents' preferred schedules. We also showed how to extend the Condorcet principle to take into account lengths of jobs.

We conclude that there is no clear winner among the proposed scheduling mechanisms. Similarly, in classic voting, there is no clear consensus regarding which voting mechanism is the best. For example, we showed that the comparison of the cost-based and PTA-Condorcet-based scheduling exposes a tradeoff between reinforcement and the PTA Condorcet principle. Thus, the
Thus, the question of which mechanism to choose is influenced, for example, by the mechanism designer’s subjective assessment of which of the two properties she considers more important.
Our main conclusion from the theoretical analysis of computational complexity and from the experimental analysis is that using cost-based scheduling methods is feasible only if the sizes of the input instances are moderate (though these instances may represent many realistic situations). In contrast, PTA Condorcet-based methods are feasible even for large instances. We drew a boundary between NP-hard and polynomial-time solvable problems. In several cases, problems become NP-hard with non-unit jobs, which shows the additional complexity stemming from scheduling, as opposed to standard voting. Moreover, our experiments suggest that there is a clearly visible difference between schedules returned by different methods of collective scheduling.
Both scheduling and social choice are well-developed fields with a plethora of models, methods and results. It is natural to consider more complex scheduling models in the context of collective scheduling, such as processing several jobs simultaneously (multiple processors with sequential or parallel jobs), jobs with different release dates, or dependencies between jobs. Each of these extensions raises new questions on computability/approximability of collective schedules. Another interesting direction is to derive desired properties of collective schedules (distinct from PTA-Condorcet), and then formulate scheduling algorithms satisfying them.
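As an illustration of the evaluation measures discussed in this section, the following minimal Python sketch computes each agent's tardiness, the Σ-T and max-T costs of a given schedule, and the Gini index of the tardiness vector. It assumes that an agent's tardiness is the sum, over jobs, of how much later a job completes in the collective schedule than in that agent's preferred schedule, and that Σ-T and max-T take the sum and the maximum of these values over agents. The function names, the three-job instance and the pairwise-difference formula for the Gini index are illustrative assumptions, not taken from the experiments.

```python
from typing import Dict, List, Sequence


def completion_times(schedule: Sequence[str], durations: Dict[str, int]) -> Dict[str, int]:
    """Completion time of each job when jobs run back to back, without idle time."""
    clock = 0
    done: Dict[str, int] = {}
    for job in schedule:
        clock += durations[job]
        done[job] = clock
    return done


def tardiness(collective: Sequence[str], preferred: Sequence[str],
              durations: Dict[str, int]) -> int:
    """Total lateness of the collective schedule relative to one agent.

    Assumption: the due date of a job is its completion time in the
    agent's preferred schedule; finishing early earns no credit.
    """
    actual = completion_times(collective, durations)
    due = completion_times(preferred, durations)
    return sum(max(0, actual[job] - due[job]) for job in collective)


def gini(values: Sequence[float]) -> float:
    """Gini index via the mean absolute difference (0 = perfectly equal)."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    pairwise = sum(abs(x - y) for x in values for y in values)
    return pairwise / (2 * n * total)


# Hypothetical toy instance: three jobs, three agents.
durations = {"a": 1, "b": 2, "c": 3}
preferred_schedules: List[List[str]] = [
    ["a", "b", "c"],
    ["c", "b", "a"],
    ["b", "a", "c"],
]
collective = ["a", "b", "c"]

per_agent = [tardiness(collective, p, durations) for p in preferred_schedules]
sum_t = sum(per_agent)   # cost under the sum-of-tardiness goal
max_t = max(per_agent)   # cost under the max-tardiness goal

print(per_agent, sum_t, max_t, gini(per_agent))  # [0, 3, 1] 4 3 0.5
```

On this toy instance the second agent, who wanted the long job first, bears most of the cost. Comparing `gini(per_agent)` across schedules chosen under different goals is one way to reproduce the kind of fairness comparison reported in Table 1, under the assumptions stated above.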
Acknowledgments
This research has been partly supported by the Polish National Science Center grant Sonata (UMO-2012/07/D/ST6/02440), a Polonium grant (joint programme of the French Ministry of Foreign Affairs, the Ministry of Science and Higher Education and the Polish Ministry of Science and Higher Education) and project TOTAL that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 677651). Piotr Skowron was also supported by a Humboldt fellowship for postdoctoral researchers.
References
[1] A. Agnetis, P. B. Mirchandani, D. Pacciarelli, and A. Pacifici. Scheduling problems with two competing agents. Operations research, 52(2):229–242, 2004.
[2] A. Agnetis, J. Billaut, S. Gawiejnowicz, D. Pacciarelli, and A. Soukhal. Multiagent Scheduling: Models and Algorithms. Springer, 2014.
[3] K. J. Arrow, A. Sen, and K. Suzumura, editors. Handbook of Social Choice & Welfare, volume 2. Elsevier, 2010.
[4] G. Benade, S. Nath, A. Procaccia, and N. Shah. Preference elicitation for participatory budgeting. In Proceedings of the 31st Conference on Artificial Intelligence (AAAI-2017), pages 376–382, 2017.
[5] P. Brucker. Scheduling Algorithms. Springer, 2006.
[6] Y. Cabannes. Participatory budgeting: a significant contribution to participatory democracy. Environment and Urbanization, 16(1):27–46, 2004.
[7] I. Caragiannis, E. Hemaspaandra, and L. A. Hemaspaandra. Dodgson’s rule and Young’s rule. In F. Brandt, V. Conitzer, U. Endriss, J. Lang, and A. D. Procaccia, editors, Handbook of Computational Social Choice, chapter 2. Cambridge University Press, 2015.
[8] E. Celis, D. Straszak, and N. Vishnoi. Ranking with fairness constraints. Technical Report arXiv:1704.06840, arXiv.org, 2017.
[9] J. Colomer. Ramon Llull: from ‘Ars electionis’ to social choice theory. Social Choice and Welfare, 40(2):317–328, 2013.
[10] J. Du and J. Leung. Minimizing total tardiness on one machine is NP-hard. Mathematics of operations research, 15(3):483–495, 1990.
[11] P. F. Dutot, F. Pascual, K. Rzadca, and D. Trystram. Approximation algorithms for the multiorganization scheduling problem. IEEE Transactions on Parallel and Distributed Systems, 22(11), 2011.
[12] C. Dwork, R. Kumar, M. Naor, and D. Sivakumar. Rank aggregation methods for the web. In Proceedings of the 10th international conference on World Wide Web (WWW-2001), pages 613–622. ACM, 2001.
[13] B. Fain, A. Goel, and K. Munagala. The core of the participatory budgeting problem. In Proceedings of the 12th Conference on Web and Internet Economics (WINE-2016), pages 384–399, 2016.
[14] P. Faliszewski, P. Skowron, A. Slinko, and N. Talmon. Multiwinner voting: A new challenge for social choice theory. In U. Endriss, editor, Trends in Computational Social Choice. AI Access, 2017.
[15] F. Fischer, O. Hudry, and R. Niedermeier. Weighted tournament solutions. In F. Brandt, V. Conitzer, U. Endriss, J. Lang, and A. D. Procaccia, editors, Handbook of Computational Social Choice, chapter 2. Cambridge University Press, 2015.
[16] A. Goel, A. Krishnaswamy, S. Sakshuwong, and T. Aitamurto. Knapsack voting: Voting mechanisms for participatory budgeting. Manuscript, 2016.
[17] M. G. Kendall. A new measure of rank correlation. Biometrika, 30(1/2):81–93, 1938.
[18] E. Koutsoupias and C. H. Papadimitriou. Worst-case equilibria. Computer Science Review, 3(2):65–69, 2009.
[19] J. Lang and L. Xia. Voting over multiattribute domains. In F. Brandt, V. Conitzer, U. Endriss, J. Lang, and A. Procaccia, editors, Handbook of Computational Social Choice, chapter 9. Cambridge University Press, 2015.
[20] J. Levin and B. Nalebuff. An introduction to vote-counting schemes. Journal of Economic Perspectives, 9(1):3–26, 1995.
[21] C. L. Mallows. Non-null ranking models. I. Biometrika
[22] Proceedings of the 3rd International Conference on Algorithmic Decision Theory (ADT-2013), pages 259–270, 2013.
[23] G. Miller. The magical number seven, plus or minus two: Some limits on our capacity for processing information. The Psychological Review, 63:81–97, 1956.
[24] PBP. PBP. Where has it worked? - the participatory budgeting project. , 2016.
[25] P. Skowron, P. Faliszewski, and J. Lang. Finding a collective set of items: From proportional multirepresentation to group recommendation. Artificial Intelligence, 241:191–216, 2016.
[26] P. Skowron, M. Lackner, M. Brill, D. Peters, and E. Elkind. Proportional rankings. In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI-2017), pages 409–415, 2017.
[27] B. Vöcking. Selfish load balancing. In Algorithmic Game Theory. Cambridge, 2007.
[28] H. Young and A. Levenglick. A consistent extension of Condorcet’s election principle.