Auditing Hamiltonian Elections

Abstract

Presidential primaries are a critical part of the United States Presidential electoral process, since they are used to select the candidates in the Presidential election. While methods differ by state and party, many primaries involve proportional delegate allocation using the so-called Hamilton method. In this paper we show how to conduct risk-limiting audits for delegate allocation elections using variants of the Hamilton method where the viability of candidates is determined either by a plurality vote or using instant runoff voting. Experiments on real-world elections show that we can audit primary elections to high confidence (small risk limits) usually at low cost.


arXiv [cs.CY]

Michelle Blom, Philip B. Stark, Peter J. Stuckey, Vanessa Teague, and Damjan Vukcevic

School of Computing and Information Systems, University of Melbourne, Parkville, Australia, [email protected]
Department of Statistics, University of California, Berkeley, USA
Department of Data Science and AI, Monash University, Clayton, Australia
Thinking Cybersecurity Pty. Ltd.
School of Mathematics and Statistics, University of Melbourne, Parkville, Australia
Melbourne Integrative Genomics, University of Melbourne, Parkville, Australia


Presidential primary elections are a critical part of the United States electoral process, since they are used to select the final candidates contesting the Presidential election for each of the major parties. For that reason it is important that the result of these primaries be trustworthy. While the method used for primary elections differs by party and state, the majority of such elections use delegate allocation by proportional representation, the so-called Hamilton method, named after its inventor, Alexander Hamilton.

Risk-limiting audits (RLAs) [9] require a durable, trustworthy record of the votes, typically paper ballots marked by hand, kept demonstrably secure. RLAs end in one of two ways: either they produce strong evidence that the reported winners really won, or they result in a full manual tabulation of the paper records. If an RLA leads to a full manual tabulation, the outcome of the tabulation replaces the original reported outcome if they differ, thus correcting the reported outcome (if the paper trail is trustworthy). The probability that an RLA fails to correct a reported outcome that is incorrect before that outcome becomes official is bounded by a "risk limit." An RLA with a risk limit of 1%, for example, has at most a 1% chance of failing to correct a reported election outcome that is wrong; equivalently, it has at least a 99% chance of correcting the reported outcome if it is wrong. RLAs are becoming the de facto standard for post-election audits. They are required by statute in Colorado, Nevada, Rhode Island, and Virginia, for some government elections (not primaries, which are party elections), and have been piloted in over a dozen US states and Denmark. They are recommended by the US National Academies of Science, Engineering, and Medicine and endorsed by the American Statistical Association. Risk-limiting audits of limited scope have begun to be applied to US primary elections; our methods here would allow RLAs of the full elections.

In this paper we describe the first method that we are aware of for conducting an RLA for delegate allocation by proportional representation elections, which we call Hamiltonian elections. In addition to primary elections in some states in the USA, this type of election is used in Russia, Ukraine, Tunisia, Taiwan, Namibia and Hong Kong. We do so by adapting auditing methods designed for plurality and instant runoff voting (IRV) elections for auditing the viability of candidates, and generating a new kind of audit for proportional allocation.

A delegate allocation election by proportional representation is a complex form of election. Rather than simply electing candidates, the result of the election is to assign some number of delegates to some of the candidates. In the first stage of the election, the process determines the subset of candidates that are eligible or viable (for Democratic primaries, candidates need to receive at least 15% of the vote). In the second stage, delegates are awarded to these viable candidates in approximate proportion to their vote.
An RLA must determine the correctness of both the set of viable candidates and the number of delegates assigned to each viable candidate.

The first stage of the election uses either simple plurality voting, where each ballot is a vote for at most one candidate, or IRV, where each ballot is a ranking of some or all candidates. In IRV, candidates with the fewest first-choice ranks are eliminated and each ballot that ranked them first is reassigned to the next most-preferred ranked candidate on that ballot.

There is considerable work on both comparison audits and ballot-polling audits for plurality elections [6,11], but little for more complex election types. Sarwate et al. [8] consider IRV and some other preferential elections. Kroll et al. [5] show how to audit the overall US electoral college outcome, but not the allocation of individual delegates. Stark and Teague [12] devise audits for the D'Hondt method for proportional representation, which is related to but distinct from Hamiltonian methods. Blom et al. [2,3] describe efficient audits for IRV. As far as we know, there is no other auditing method for Hamiltonian elections, nor any that combines a proportional representation method with IRV.

Footnote: Virginia's audit does not take place until after the outcome is certified, so it cannot limit the risk that an incorrect reported outcome will become final: technically, it is not an RLA.

(a) Votes

Candidate          Votes   Proportion
Ann               57,532   76.1%
Bob               15,630   20.7%
Cal                1,600    2.1%
Dee                  846    1.1%
Total votes       75,608

(b) Qualified votes

Candidate          Votes   Proportion
Ann               57,532   78.6%
Bob               15,630   21.4%
Qualified votes   73,162

Fig. 1. (a) Votes and (b) Qualified Votes in a Hamiltonian election with plurality-based exclusion.

We have a set of $n$ candidates $\mathcal{C}$, a set of cast ballots $\mathcal{B}$, and a number of delegates $D$ to be awarded to the candidates based on the votes. The Hamilton or largest remainder method, invented by Alexander Hamilton in 1792, allocates the delegates in approximate proportion to the votes the candidates received. In a pure Hamiltonian election, also known as the Hamilton method, delegates are directly allocated based on the proportion of the vote. But most delegate elections use some form of exclusion of some candidates before the delegates are apportioned.

A Hamiltonian election with exclusion first determines which candidates in $\mathcal{C}$ are viable, that is, eligible to be awarded one or more delegates. Typically, exclusion involves a plurality vote. Each ballot is a vote for at most one candidate. If a candidate receives a threshold proportion $T$ of the valid votes, the candidate is considered viable. The votes cast for viable candidates are referred to as qualified votes. The qualified votes are used to allocate delegates, as described later in this section.

Example 1. Consider an example Hamiltonian election with exclusion with 4 candidates, Ann, Bob, Cal, and Dee, and a viability threshold of $T = 15\%$. Figure 1(a) shows the tally of votes for each candidate, and the percentage of the overall vote that each candidate received. Ann and Bob received more than 15% of the vote and are viable candidates.

For elections with many candidates, a plurality exclusion might eliminate all of them. In an instant-runoff Hamiltonian election the viable candidates are determined by a form of IRV. Each ballot is now a partial ranking of the candidates, and the viable candidates are determined as follows:

1. Initialize the set of candidates. Each ballot is put in the pile for the candidate ranked highest on that ballot.
2. If every (remaining) candidate has $> T$ of the votes in their pile, we finish the candidate selection process. All of these remaining candidates are viable. (There are more complicated alternate rules for the case where no candidate reaches $T$; we do not consider this case here.)
3. Otherwise, the candidate with the lowest tally (fewest ballots in their pile) is eliminated, and each of their ballots is moved to the pile of the next ranked remaining candidate on the ballot. A ballot is exhausted if all further candidates mentioned on the ballot have already been eliminated.
4. We then return to step 2.

Example 2. Consider an instant-runoff Hamiltonian election with the same four candidates as Example 1, the same threshold, and 50,000 ballots with ranking [A,D,C,B] (that is, Ann followed by Dee, then Cal, then Bob), 9,630 of [B,C], 6,000 of [C,B], 1,600 of [C], 7,532 of [D,A,C], and 846 of [D,C]. The IRV election proceeds as follows. In the first round Cal has the lowest tally, 7,600 votes, which is 10.052% of the total vote, and hence less than 15%. Cal is eliminated: the 6,000 ballots [C,B] are transferred to Bob, and the 1,600 ballots [C] are exhausted (removed from consideration). In the next round Dee has the lowest tally, 8,378 votes, which is 11.080%, so Dee is eliminated. The 7,532 [D,A,C] ballots are transferred to Ann, and the remaining 846 [D,C] ballots are exhausted. In the final round, Bob has the lowest tally, 20.672% of the vote, and the process ends since this is greater than 15%. The election is summarized in Figure 2.

        Round 1                      Round 2                      Final result
Cand.   Ballot      Number  Prop.    Ballot      Number  Prop.    Ballot      Number  Prop.
Ann     [A,D,C,B]   50,000  66.1%    [A,D,C,B]   50,000  66.1%    [A,D,C,B]   50,000
                                                                  [D,A,C]      7,532  76.1%
Bob     [B,C]        9,630  12.7%    [B,C]        9,630           [B,C]        9,630
                                     [C,B]        6,000  20.7%    [C,B]        6,000  20.7%
Cal     [C,B]        6,000           -                            -
        [C]          1,600  10.1%    -                            -
Dee     [D,A,C]      7,532           [D,A,C]      7,532           -
        [D,C]          846  11.1%    [D,C]          846  11.1%    -
Total

Fig. 2. IRV election for four candidates showing the elimination of first Cal, and then Dee, and the final round results.
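The elimination loop in steps 1 to 4 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `irv_viable`, the ballot representation (a ranking tuple paired with a count), and the choice to measure proportions against all cast ballots are assumptions made here so that the numbers match Example 2. Ties for the lowest tally are broken alphabetically, which the procedure above does not specify.

```python
def irv_viable(ballots, candidates, threshold):
    """Sketch of the first stage of an instant-runoff Hamiltonian election.

    ballots: list of (ranking, count) pairs, ranking being a tuple of names.
    Returns (viable candidates, elimination order, final tallies).
    """
    total = sum(count for _, count in ballots)  # assumption: all cast ballots
    remaining = set(candidates)
    eliminated = []
    while remaining:
        tallies = {c: 0 for c in remaining}
        for ranking, count in ballots:
            # Each ballot sits in the pile of its highest-ranked remaining candidate.
            top = next((c for c in ranking if c in remaining), None)
            if top is not None:  # None means the ballot is exhausted
                tallies[top] += count
        # Step 2: stop once every remaining candidate exceeds the threshold.
        if all(tally > threshold * total for tally in tallies.values()):
            return remaining, eliminated, tallies
        # Step 3: eliminate the lowest tally (ties broken alphabetically here).
        loser = min(sorted(remaining), key=lambda c: tallies[c])
        remaining.remove(loser)
        eliminated.append(loser)
    # No candidate reached the threshold; the text does not cover this case.
    return set(), eliminated, {}


# Ballots of Example 2.
example_ballots = [
    (("Ann", "Dee", "Cal", "Bob"), 50_000),
    (("Bob", "Cal"), 9_630),
    (("Cal", "Bob"), 6_000),
    (("Cal",), 1_600),
    (("Dee", "Ann", "Cal"), 7_532),
    (("Dee", "Cal"), 846),
]
viable, order, tallies = irv_viable(
    example_ballots, ["Ann", "Bob", "Cal", "Dee"], 0.15
)
print(sorted(viable), order, tallies)
```

On the Example 2 ballots this eliminates Cal and then Dee, leaving Ann and Bob with final tallies of 57,532 and 15,630.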

The second stage in the process is to assign delegates to candidates on the basis of their tallies. We first compute, for each viable candidate $c$, the proportion of the qualified votes in their tally, $p_c$. Recall that we refer to ballots belonging to viable candidates as qualified votes. We denote the number of qualified votes as $Q$. In the context of IRV, ballots are qualified if they end up in the tally of a viable candidate. Non-qualified ballots result from exhaustion: every candidate in the ballot ranking has been eliminated (is non-viable). Where a plurality contest determines viability, all votes for a viable candidate are qualified.

We denote the set of viable candidates as $V$. Delegates are awarded to viable candidates as follows:

1. We compute for each viable candidate $c$ their delegate quota, $q_c = D \times p_c$, where $p_c$ is the proportion of the qualified vote given to $c$ (their final tally divided by $Q$).
2. We assign $i_c = \lfloor q_c \rfloor$ delegates to each candidate $c \in V$.
3. At this stage, there are $r = D - \sum_{c \in V} i_c$ remaining delegates to assign. We assign these delegates to the $r$ candidates with the largest value of the remainder $q_c - i_c$. One delegate is given to each of these $r$ candidates.
4. At this stage, each viable candidate $c$ has received $a_c$ total delegates, where $a_c$ is $q_c$ rounded either up or down.

Example 3. The end result of Examples 1 and 2 is the same. The qualified vote is $Q = 73{,}162$, with $p_{Ann} = 0.786$ and $p_{Bob} = 0.214$. Given $D = 5$ delegates to allocate, we find $q_{Ann} = 3.932$ and $q_{Bob} = 1.068$, so $i_{Ann} = 3$ and $i_{Bob} = 1$, leaving $r = 1$ delegate to assign. The remainders are 0.932 and 0.068, so the remaining delegate goes to Ann, giving $a_{Ann} = 4$ and $a_{Bob} = 1$.
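The four allocation steps can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the function name `hamilton_allocate` is hypothetical, exact rational arithmetic via `fractions.Fraction` is a choice made here to avoid floating-point rounding near integer quotas, and ties between equal remainders (which the steps above do not address) fall back to input order.

```python
from fractions import Fraction
from math import floor


def hamilton_allocate(tallies, delegates):
    """Largest-remainder (Hamilton) allocation over the qualified votes.

    tallies: dict mapping each viable candidate to their final tally.
    Returns (allocation a_c, quotas q_c).
    """
    qualified = sum(tallies.values())  # Q
    # Step 1: delegate quota q_c = D * p_c, kept exact.
    quotas = {c: Fraction(delegates * v, qualified) for c, v in tallies.items()}
    # Step 2: each candidate first receives the integer part of their quota.
    alloc = {c: floor(q) for c, q in quotas.items()}
    # Step 3: the r leftover delegates go to the largest remainders.
    r = delegates - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - alloc[c], reverse=True)
    for c in by_remainder[:r]:
        alloc[c] += 1
    return alloc, quotas


# Qualified votes of Figure 1(b), with D = 5 as in Example 3.
alloc, quotas = hamilton_allocate({"Ann": 57_532, "Bob": 15_630}, 5)
print(alloc, {c: round(float(q), 3) for c, q in quotas.items()})
```

For the tallies of Example 3 the sketch assigns 4 delegates to Ann and 1 to Bob.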

A risk-limiting audit is a statistical test of the hypothesis that the reported outcome is incorrect. (In the current context, the reported outcome is the number of delegates finally awarded to each candidate.) If that hypothesis is not rejected, there is a full hand tabulation, which reveals the true outcome. If that differs from the reported outcome, it replaces the reported outcome. The significance level of the test is called the risk limit. A risk-limiting audit of a trustworthy paper trail of votes limits the risk that an incorrect electoral outcome will go uncorrected.

Two common building blocks for audits are to compare manual interpretation of randomly selected ballots or groups of ballots with how the voting system interpreted them (a comparison audit [10]), and to use only the manual interpretation of the randomly selected ballots (a ballot-polling audit [7]). Ballot polling requires less infrastructure (some voting systems do not export the data required for a comparison audit) but generally requires inspecting more ballots.

Recent work [11] shows that audits of most social choice functions can be reduced to checking a set of assertions. If all the assertions are true, the reported election outcome is correct. Each assertion is checked by conducting a hypothesis test of its logical negation. To reject the hypothesis that the negation is true is to conclude that the assertion is true. Each hypothesis is tested using a statistic calculated from the audit data. Larger values of the statistic are unlikely if the corresponding assertion is false. If the statistic takes a sufficiently large value, that is statistical evidence that the assertion is true, because such a large value would be very unlikely if the assertion were false. The statistic is generally calibrated to give sequentially valid tests of the assertions, meaning that the sample of ballots can be expanded at will and the statistic can be recomputed from the expanded sample, while controlling the probability of erroneously concluding that the assertion is true if the assertion is in fact false.

The initial sample size is generally chosen so that there is a reasonable chance that the audit will terminate without examining additional ballots if the reported results are approximately correct. If the initial sample does not give sufficiently strong evidence that all the assertions are correct, the sample is augmented and the condition is checked again. (For sequentially valid test statistics, the sample can be augmented at will; for other methods, there may be an escalation schedule prescribing a sequence of sample sizes before conducting a full manual tabulation.) The sample continues to expand until either all the assertions have been confirmed (in other words, the hypothesis that each assertion is false has been rejected at a sufficiently small significance level) or the sample contains every ballot, and the correct result is therefore known. At any point during the audit, the auditor can choose to conduct a full manual tabulation. If the audit leads to a full manual tabulation, the outcome of that tabulation replaces the reported outcome if they differ.

The basic assertions for Hamiltonian elections are:

- (Super/sub) majority: $p > t$, where $p$ is the proportion of ballots that satisfy some condition (usually the condition is that the ballot has a vote for a particular candidate) among ballots that meet some validity condition, and $t > 0$ is a threshold proportion.
- Pairwise majority: $p_A > p_B$, where $p_A$ and $p_B$ are the proportions of ballots that meet two mutually exclusive conditions $A$ and $B$, among ballots that meet some validity condition. (Typically, among the ballots that contain a valid vote, $A$ is a ballot with a vote for one candidate, and $B$ is a ballot with a vote for a different candidate.)
- Pairwise difference: $p_A > p_B + d$, where $p_A$ and $p_B$ are the proportions of ballots that meet two mutually exclusive conditions among ballots that meet some validity condition, and $d$ is a constant greater than $-1$.

Following [11], each assertion can be expressed as the statement that the mean over the ballots of an assorter (which assigns each ballot a nonnegative, bounded number) is greater than 1/2. The value the assorter assigns to a ballot is generally a function of the votes on that ballot and others and the voting system's interpretation of the votes on that ballot and others.

For majority assertions, a ballot that satisfies the condition is assigned the value $1/(2t)$; a valid ballot that does not satisfy the condition is assigned the value 0; and an invalid ballot is assigned the value 1/2. For pairwise majority assertions, a ballot for class $A$ counts as 1 and a ballot for class $B$ counts as 0. Ballots that fall outside both classes count as 1/2.

For pairwise difference assertions, we define the assorter $g$ which assigns ballot $b$ the value:

\[
g(b) \equiv
\begin{cases}
1/(1+d), & b \text{ is a vote of class } A \\
0, & b \text{ is a vote of class } B \\
1/(2(1+d)), & b \text{ is a valid vote not in } A \text{ or } B \\
1/2, & b \text{ is an invalid vote.}
\end{cases}
\]

Let $\bar{g}$ be the mean of $g$ over the ballots. We have that $0 \le g(b) \le 1/(1+d)$, and $\bar{g} > 1/2$ iff $p_A > p_B + d$. When $d = 0$ this reduces to the pairwise majority assorter if the "valid" category is the same.

The margin $m$ of an assertion $a$ is equal to 2 times the mean of its assorter (when applied to all ballots $\mathcal{B}$) minus 1. An assertion with a smaller margin will be harder to audit than an assertion with a larger margin.

The sample size required to confirm an assertion depends on the sampling design and the auditing strategy (e.g., sampling individual ballots or batches of ballots, using ballot polling or comparison); the "risk-measuring function" (see [11]); and the accuracy of the tally, among other things. Because it depends on what the sample reveals, it is random.

There is some flexibility in selecting a set of assertions to confirm IRV contests [1], so the set can be chosen to minimize a measure of the anticipated workload. We will estimate the workload on the assumption that the assertion is true but the reported tallies are not exactly correct. We will use the expected sample size as a measure of workload.

Our auditing approach is applicable to any style of auditing. The workload, given a set of assertions, varies depending on the style of audit (e.g., ballot-level comparison, batch-level comparison, ballot-polling, or a combination of those) and the sampling design (e.g., with or without replacement, Bernoulli, stratified or not, weighted or not). For the purpose of illustration, in the examples and experiments in this paper, we assume that the audit will be a ballot-level comparison audit using sampling with replacement.

Because the sample is drawn with replacement, the same ballot can be drawn more than once.
Given an assertion $a$, let $ASN(a, \alpha)$ denote the expected number of draws required to verify $a$ to risk limit $\alpha$, and if $\mathcal{A}$ is a set of assertions, let $ASN(\mathcal{A}, \alpha)$ denote the expected number of draws required to verify every assertion $a$ in $\mathcal{A}$ to risk limit $\alpha$. ASN depends on several factors: the risk limit $\alpha$; the expected rate of errors (discrepancies) between paper ballots and their electronic records of various signs and magnitudes (in the context of comparison auditing); and the margins of the assertions.

Footnote: Rather than the expected sample size, one might instead seek to minimize a quantile of the sample size or some other function of the distribution of sample size, for instance, to account for fixed costs for retrieving and opening a batch of ballots and per-ballot and per-contest costs.
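The assorters and the margin defined above can be made concrete with a small Python sketch. The function names are hypothetical, and the sketch works from aggregate tallies rather than individual ballots. The majority case assumes every ballot is valid, and the pairwise difference case uses an illustrative value $d = 0.4$ with the tallies of Figure 1(a), treating ballots for Cal or Dee as invalid; both are assumptions made for illustration only.

```python
def majority_margin(p, t):
    """Margin of the assertion p > t when every ballot is valid.

    The assorter gives 1/(2t) to ballots meeting the condition and 0 to the
    rest, so its mean is p/(2t) and the margin is 2 * mean - 1.
    """
    return 2 * (p / (2 * t)) - 1


def pairwise_difference_assorter(kind, d):
    """Value g(b) for a ballot of the given kind under p_A > p_B + d."""
    values = {
        "A": 1 / (1 + d),
        "B": 0.0,
        "other": 1 / (2 * (1 + d)),  # valid vote for neither A nor B
        "invalid": 0.5,
    }
    return values[kind]


def pairwise_difference_margin(counts, d):
    """Margin 2 * mean(g) - 1, given ballot counts keyed by kind."""
    total = sum(counts.values())
    mean = sum(
        pairwise_difference_assorter(kind, d) * n for kind, n in counts.items()
    ) / total
    return 2 * mean - 1


# Tallies of Figure 1(a), with threshold t = 0.15.
total_votes = 57_532 + 15_630 + 1_600 + 846
ann_margin = majority_margin(57_532 / total_votes, 0.15)
bob_margin = majority_margin(15_630 / total_votes, 0.15)

# Illustrative pairwise difference assertion with d = 0.4 (an assumed value):
# A = Ann, B = Bob, and ballots for Cal or Dee treated as invalid.
counts = {"A": 57_532, "B": 15_630, "invalid": 1_600 + 846}
diff_margin = pairwise_difference_margin(counts, 0.4)
print(round(ann_margin, 3), round(bob_margin, 3), round(diff_margin, 3))
```

A positive margin corresponds to an assorter mean above 1/2, that is, to a true assertion; with $d = 0$ the pairwise difference assorter takes the values 1, 0, 1/2 and 1/2, matching the pairwise majority assorter.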

We estimate $ASN(\mathcal{A}, \alpha)$ by simulation. We simulate the sampling of ballots, one at a time. An "overstatement" error is introduced with a pre-specified probability $e$. If the sample reveals one or more overstatements, the measured risk (i.e., the $P$-value of the hypothesis that the assertion is false) increases by an amount that depends on the margin $m$. Otherwise, the measured risk decreases by an amount that depends on $m$. We continue to sample ballots until the measured risk falls below $\alpha$ or until every ballot has been manually reviewed, in which case the outcome based on the manual interpretations replaces the original reported results. We take the median of the number of ballots sampled over $N$ simulations as an estimate of $ASN(\mathcal{A}, \alpha)$. Inaccuracy of this estimate affects whether the selected assertions result in the smallest expected workload, but does not affect the risk limit. For the examples and experiments in this paper, we use $e = 0.002$, $N = 20$, and a risk limit of 5%. The procedure used to calculate the ASN for an assertion with margin $m$ is available in the public repositories https://github.com/michelleblom/primaries and https://github.com/pbstark/SHANGRLA.

The first stage of the election identifies the viable candidates. We introduce notation for the assertions we will use to audit viability, as follows:

- $Viable(c, E, t)$: Candidate $c$ has at least proportion $t$ of the vote after the candidates in set $E$ have been eliminated. This amounts to a simple majority assertion $p_c > t$ after candidates in $E$ are eliminated.
- $NonViable(c, E, t)$: Candidate $c$ has less than proportion $t$ of the vote after candidates $E$ have been eliminated. This amounts to a simple majority assertion $\bar{p}_c > 1 - t$, where $\bar{p}_c$ is the proportion of valid votes for candidates other than $c$ after candidates $E$ are eliminated.
- $IRV(c, c', E)$: Candidate $c$ has more votes than candidate $c'$ after candidates $E$ have been eliminated. This amounts to a pairwise majority assertion.

If the first stage is a plurality vote, $E \equiv \emptyset$: the elimination in the first stage only occurs for IRV.

Consider an election $\mathcal{E} = \langle \mathcal{C}, \mathcal{B}, T \rangle$ with candidates $\mathcal{C}$, cast ballots $\mathcal{B}$, and viability threshold $T$ ($T = 0.15$ for the primary elections we will examine). The outcome of this election is a set of viable candidates, $V \subseteq \mathcal{C}$, together with, in the case of instant-runoff Hamiltonian elections, a sequence of eliminated candidates, $\pi$. To check that the set of candidates reported to be viable really are the viable candidates, we test assertions that rule out all other possibilities. Consider the subset $V' \subseteq \mathcal{C}$, where $V' \neq V$. We can demonstrate that $V'$ is not the true set of viable candidates by showing that some candidate $c \in V'$ does not belong there. We can also rule out $V'$ as an outcome by showing that there is a candidate $c \notin V'$ that does in fact belong in the viable set. We aim to find the 'least effort' set of assertions $\mathcal{A}$ that, if shown to hold in a risk-limiting audit, confirm that (i) each candidate in $V$ is viable, and (ii) no candidate $c' \notin V$ is viable.

For each viable candidate $v \in V$ we need to verify the assertion $Viable(v, \emptyset, T)$. For each non-viable candidate $n \in \mathcal{C} \setminus V$ we need to verify the assertion $NonViable(n, \emptyset, T)$. Let $\mathcal{A}$ be the union of these two sets of assertions. Note that $\mathcal{A}$ rules out any other set of viable candidates $V' \neq V$.

Example 4.

To audit the first stage of the election of Example 1, we verify the assertions $\mathcal{A} = \{Viable(Ann, \emptyset, 0.15), Viable(Bob, \emptyset, 0.15), NonViable(Cal, \emptyset, 0.15), NonViable(Dee, \emptyset, 0.15)\}$. The margins associated with these assertions are 4.073, 0.378, 0.152, and 0.163, respectively. For example, the margin for $Viable(Ann, \emptyset, 0.15)$ is $p_{Ann}/t - 1 = 0.7609/0.15 - 1 \approx 4.073$. The expected numbers of ballots we need to compare to the corresponding cast vote records to audit these assertions, assuming an overstatement error rate of 0.002 and a risk limit of $\alpha = 5\%$, are 1, 17, 46, and 42, respectively. The overall ASN for the audit is 46 ballots.

Efficient RLAs for IRV have been devised only recently [1]. To audit the first stage of an IRV Hamiltonian election we must eliminate the possibility that a different set of candidates is viable. This means that we need to look at every other set of candidates, and propose an assertion that will show that set is not viable.

In contrast to auditing a simple IRV election, where there are $|\mathcal{C}| - 1$ alternative winners to rule out, here there are many alternative sets of viable candidates. Let $M = \lfloor 1/T \rfloor$ be the maximum possible number of viable candidates. The number of possible winner sets $O$ is

\[
O \equiv \binom{|\mathcal{C}|}{M} + \binom{|\mathcal{C}|}{M-1} + \cdots + \binom{|\mathcal{C}|}{1}.
\]

We can show that a subset of candidates $V'$ is not the set of viable candidates in a number of ways:

- we could show that the tally of at least one $c \in V'$ does not reach the required threshold assuming all candidates not mentioned in $V'$ have been eliminated;
- we could show that there is a candidate $c' \notin V'$ that is viable on the basis of their first preferences, so any potential set of viable candidates must include $c'$;
- we could show that the unmentioned candidates could not have been eliminated in a sequence that would result in $V'$.

Reducing the set of subsets

While there are many possible alternate winner sets $V'$, we can rule out many of these easily. We examine the assertions $Viable(w, \emptyset, T)$ for any candidate $w \in \mathcal{C}$ who had more than $T$ proportion of the vote initially. This assertion will be easy to verify, as long as the proportion is not too close to $T$. This assertion rules out any subset $V'$ where $w \notin V'$. Let $W$ be the set of candidates where this assertion is expected to hold.

Next we examine the assertions $NonViable(l, \mathcal{C} - W - \{l\}, T)$ for those candidates $l$ who receive less than proportion $T$ of the ballots when all candidates other than the definite winners $W$ and $l$ are eliminated. In this case candidate $l$ can never reach $T$ proportion of the votes. Again this assertion is easy to verify as long as the proportion of such votes is not close to $T$. This assertion removes any subset $V'$ where $l \in V'$. Let $L$ be the set of candidates where this assertion is expected to hold.

We collect together $\mathcal{A} = \{Viable(w, \emptyset, T) \mid w \in W\} \cup \{NonViable(l, \mathcal{C} - W - \{l\}, T) \mid l \in L\}$. If these assertions hold, we only need to consider the subsets of viable candidates $\mathcal{V} = \{V' \subseteq \mathcal{C} \mid W \subseteq V', V' \cap L = \emptyset\} - \{V\}$. There are only $|\mathcal{V}|$ subsets to further examine, where

\[
|\mathcal{V}| = \binom{|\mathcal{C} - W - L|}{M - |W|} + \cdots + \binom{|\mathcal{C} - W - L|}{0} - 1.
\]

Selecting assertions for the remaining subsets

We now need to select a set of assertions that rule out any alternate set of viable candidates $V' \in \mathcal{V}$. To form these assertions, we visualise the space of alternate election outcomes as a tree. We use a branch-and-bound algorithm to find a set of assertions that, if true, will prune (invalidate) all branches of this tree. At the top level of this tree is a node for each possible $V' \in \mathcal{V}$. Each node defines an (initially empty) sequence of candidate eliminations, $\pi$, and a set of viable candidates, $V'$. These nodes form a frontier, $F$.

Our algorithm maintains a lower bound $LB$ on the estimated auditing effort (EAE) required to invalidate all alternate election outcomes, initially setting $LB = 0$. For each node $n = (\emptyset, V')$ in $F$, we consider the set of assertions that could invalidate the outcome that it represents. Two kinds of assertion are considered at this point:

- $Viable(c', L, t)$ for each candidate $c' \in \mathcal{C}$ that does not appear in $V'$, and whose first preference tally exceeds $t$ proportion of the vote when only candidates in $L$ are eliminated;
- $NonViable(c, \mathcal{C} \setminus V', t)$ for each candidate $c \in V'$ whose tally, if all candidates $c' \in \mathcal{C} \setminus V'$ have been eliminated, falls below $t$ proportion of the vote.

We assign to $n$ the assertion $a$ from this set with the smallest EAE, $\mathrm{EAE}[n] = ASN(\{a\}, \alpha)$, where we use the ASN estimation method previously described. If no such assertion can be formed for $n$, we give $n$ an EAE of $\infty$, $\mathrm{EAE}[n] = \infty$. We then select the node in $F$ with the highest EAE to expand.

To expand a node $n = (\pi, V')$, we consider the set of candidates in $\mathcal{C}$ that do not currently appear in $\pi$ or $V'$. We denote this set of 'unmentioned' candidates $U$. For each candidate $c' \in U$, we form a child of $n$ in which $c'$ is added to the front of $\pi$. For instance, the node $([c'], V')$ represents an outcome in which $c'$ is the last candidate to be eliminated, after which all remaining candidates, $c \in V'$, have at least $t = T$ proportion of the cast votes.
All unmentioned candidates are assumed to have been eliminated, in some order, before $c'$. For each newly created node, we look for an assertion that could invalidate the corresponding outcome. Two kinds of assertion are considered to rule out an outcome $n' = ([c' \mid \pi'], V')$:

- $Viable(c', U \setminus \{c'\}, t)$ for each candidate $c' \in U$ that has at least $t$ proportion of the vote in the context where candidates $U \setminus \{c'\}$ have been eliminated. Candidate $c'$ thus cannot have been eliminated at this point;
- $IRV(c', c, U \setminus \{c'\})$ for each candidate $c' \in U$ that has a higher tally than some candidate $c \in \pi' \cup V'$ in the context where candidates $U \setminus \{c'\}$ have been eliminated. Candidate $c'$ thus cannot have been eliminated at this point.

We assign to each child of $n$ the assertion $a$ from this set with the smallest $ASN(\{a\}, \alpha)$, and replace $n$ on our frontier with its children. If neither of the above two types of assertion can be created for a given child of $n$, the child is labelled with an EAE of $\infty$ ($\mathrm{EAE}[n'] = \infty$). We continue to expand nodes in this fashion until we reach a leaf node, $l = (\pi, V')$, where $\pi \cup V' = \mathcal{C}$ (all candidates are mentioned either in the elimination sequence $\pi$ or in the viable set $V'$).

We assign to $l$ an invalidating assertion of the above two kinds, if possible. We consider all the nodes in the branch that $l$ sits on, and select the node $n_b$ associated with the least cost assertion $a$. We add $a$ to our set of assertions to audit $\mathcal{A}$, prune $n_b$ and all of its descendants from the tree, and update our lower bound on audit cost $LB$ to $\max(LB, \mathrm{EAE}[a])$. We then look at all nodes on our frontier that can be pruned with an assertion that has an $\mathrm{EAE} \le LB$. We add those assertions to $\mathcal{A}$, and prune the nodes from the frontier. The algorithm stops when the frontier is empty.
If we discover a branch whose best assertion has an EAE of $\infty$, the algorithm stops in failure, indicating that a manual recount of the election is required.

This branch-and-bound algorithm is a variation of that described by [1,4] for generating an audit specification for an IRV election. It has been altered for the context where the ultimate outcome is a set of winning candidates (the viable candidates) and not one winner, left standing after all others are eliminated.

The Hamilton method for proportional representation is used to assign delegates to viable candidates. It might appear that auditing the Hamilton method requires checking some delicate results, for instance, whether candidate A received at least 2 delegate quotas when candidate A actually received 2.001 delegate quotas. However, this is not necessary, because candidate A can receive 2 delegates without having at least 2 delegate quotas. For example, if A receives 1.999 quotas A may still end up with 2 delegates. Our auditing method avoids checking such things.

The audit instead examines all pairs of viable candidates, including those receiving no delegates. For each pair of viable candidates $n$ and $m$ we check whether $(q_n - (a_n - 1)) < (q_m - (a_m - 1)) + 1$, that is, that the quota for $n$ is not 1 more than the quota for $m$, after removing all received delegates but the last. This can be equivalently rewritten as

\[
p_m > p_n + \frac{a_m - a_n - 1}{D}, \qquad n, m \in V,\ n \neq m. \tag{1}
\]

In the case that $q_m$ was rounded up and $q_n$ was rounded down, this captures that the remainder for $m$ was greater than the remainder for $n$: $p_m D - (a_m - 1) > p_n D - a_n$.

We show that if the delegates are wrong with respect to the true votes, then one of these assertions is violated.

Theorem 1.

Suppose the number of assigned delegates $a_c$ to each viable candidate $c$ is incorrect; then one of the assertions of Equation 1 will be violated.

Proof. Suppose $a'_c$ is the true number of delegates that should have been awarded to each candidate $c$. Since $\sum_{c \in V} a_c = D$ and $\sum_{c \in V} a'_c = D$, and they differ, there must be at least one candidate $m \in V$, where $a_m \ge a'_m + 1$, who was awarded too many delegates, and at least one $n \in V$, where $a_n \le a'_n - 1$, who was awarded too few.

Since $a'_m$ is the true number of delegates awarded to $m$, we know that the (true) proportion of the vote for $m$, $p_m$, must satisfy (a) $p_m D < a'_m \le a_m - 1$ if $m$ was rounded up, or (b) $p_m D < a'_m + 1 \le a_m$ if $m$ was rounded down. Similarly, since $a'_n$ is the true number of delegates awarded to $n$, we know that either (c) $p_n D > a'_n - 1 \ge a_n$ if $n$ was rounded up, or (d) $p_n D \ge a'_n \ge a_n + 1$ if $n$ was rounded down.

If we add these two inequalities for combinations (a)+(c) or (b)+(d) we get $p_m D + a_n < p_n D + a_m - 1$. For the combination (a)+(d) we get $p_m D + a_n + 1 < p_n D + a_m - 1$. Any of these cause the assertion $p_m > p_n + \frac{a_m - a_n - 1}{D}$ to be falsified. For the last case (b)+(c) we need a stricter comparison, which we obtain by comparing the remainders. Since $m$ was rounded down and $n$ was rounded up, we know that the remainder for $m$ was less than the remainder for $n$, i.e., $p_m D - a'_m < p_n D - (a'_n - 1)$, or equivalently $p_m D < p_n D + a'_m - (a'_n - 1) \le p_n D + (a_m - 1) - a_n$. Again the assertion $p_m > p_n + \frac{a_m - a_n - 1}{D}$ is falsified. $\square$

Example 5.

Consider the delegate allocation of Example 3. Recall that the proportions of the qualified vote are p ann = 0 . p bob = 0 . p ann p bob + 4 / p bob p ann − / 5. These facts require much less work to prove than, for example, auditing that p bob > / 5. The margins associated with the above pairwise difference assertions are 1.1 and 0.12, respectively. Assuming an error rate of 0.002 and a risk limit of α = 5%, the ASNs associated with these assertions are 5 and 59 ballots.

We consider the set of Hamiltonian elections conducted as part of the selection process for the 2020 Democratic National Convention (DNC) presidential nominee. Most of these primaries determine candidate viability via a plurality vote.

Several states, including Wyoming and Alaska, use IRV. We estimate the number of ballots we would need to check in a comparison audit of these primaries. For each of these primaries, we audit the viability of candidates on the basis of the statewide vote, and that each viable candidate deserved the delegates that were awarded to them. We consider only the delegates that are awarded on the basis of statewide vote totals (PLEO, that is, Party Leaders and Elected Officials, and at-large) as these are readily available. In each proportional DNC primary, viable candidates must attain at least 15% of the total votes cast. A small number of DNC 2020 primaries that did not use proportional allocation of delegates were not considered, in addition to those for which we could not obtain data.

The full code used to generate the assertions for each DNC primary, and estimate the ASN for each audit, is located at: https://github.com/michelleblom/primaries

Table 1 reports the expected number of ballot samples required to perform three levels of audit in each plurality and IRV-based primary conducted for the 2020 DNC. Level 1 checks only that the reportedly viable candidates have at least 15% of the vote, and all other candidates do not. Level 2 checks candidate viability and that each viable candidate $c$, with $a_c$ allocated delegates, deserved at least $a_c - 1$ of them. Level 3 checks candidate viability and that each viable candidate deserved all of their allocated delegates.
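The pieces described so far (a 15% viability threshold, Hamilton allocation among viable candidates, and the pairwise assertions of Equation (1)) can be sketched in a few lines of Python. This is a minimal illustration only, not the code from the repository above: the function names and the toy vote counts are hypothetical, and we assume that $p_c$ is the share of the qualified vote (votes for viable candidates) and that ties in remainders are broken alphabetically.

```python
from fractions import Fraction
from itertools import permutations
from math import floor


def viable_candidates(votes, threshold=Fraction(15, 100)):
    """Candidates with at least `threshold` of all votes cast (plurality viability)."""
    total = sum(votes.values())
    return {c for c, v in votes.items() if Fraction(v, total) >= threshold}


def hamilton(votes, viable, delegates):
    """Hamilton (largest remainder) allocation among the viable candidates."""
    qualified = sum(votes[c] for c in viable)
    quota = {c: Fraction(votes[c] * delegates, qualified) for c in viable}
    alloc = {c: floor(quota[c]) for c in viable}
    leftover = delegates - sum(alloc.values())
    # Assumption: ties in remainders are broken alphabetically.
    order = sorted(sorted(viable), key=lambda c: quota[c] - alloc[c], reverse=True)
    for c in order[:leftover]:
        alloc[c] += 1
    return alloc


def assertions_hold(votes, viable, alloc, delegates):
    """Check p_m > p_n + (a_m - a_n - 1)/D for every ordered pair n != m."""
    qualified = sum(votes[c] for c in viable)
    p = {c: Fraction(votes[c], qualified) for c in viable}
    return all(
        p[m] > p[n] + Fraction(alloc[m] - alloc[n] - 1, delegates)
        for n, m in permutations(viable, 2)
    )


if __name__ == "__main__":
    votes = {"ann": 600, "bob": 300, "cal": 100}  # made-up toy data
    viable = viable_candidates(votes)
    alloc = hamilton(votes, viable, 5)
    print(alloc, assertions_hold(votes, viable, alloc, 5))
```

In this toy election the reported allocation passes every pairwise check, while shifting one delegate from one viable candidate to the other makes at least one assertion fail, as Theorem 1 predicts. Exact rational arithmetic (`Fraction`) is used so that near-ties in remainders are not decided by floating-point error.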

Table 1. Estimated sample size required to audit viability and delegate distribution (PLEO and at-large) in all proportional (plurality or IRV-based) DNC primaries in 2020 for which data was available. Levels 1, 2, and 3 audit candidate viability, that each viable candidate deserved almost all of their allocated delegates, and that they deserved all of their delegates, respectively. An error rate of 0.002 (an expectation of 2 errors per 1,000 ballots) was used in the estimation of sample sizes. A '–' indicates that a full recount is required. The number of candidates ($|C|$) and total number of cast ballots ($|B|$) are stated for each election. All ASNs are for a risk limit of α = 5%.

Plurality-based primaries

| State | Candidates | Ballots cast | Level 1 | Level 2 | Level 3 |
|-------|-----------:|-------------:|--------:|--------:|--------:|
| AL | 15 | 452,093 | 182 | 182 | 1,352 |
| AR | 18 | 229,122 | 121 | 121 | 1,154 |
| AZ | 12 | 536,509 | 71 | 71 | 120 |
| CA | 21 | 5,784,364 | 395 | 1,258 | 3,187,080 |
| CO | 13 | 960,128 | 42 | 42 | – |
| CT | 4 | 260,750 | 174 | 174 | 174 |
| DC | 5 | 110,688 | 334 | 334 | 334 |
| DE | 3 | 91,682 | 80 | 80 | 80 |
| FL | 16 | 1,739,214 | 91 | 208 | 766 |
| GA | 12 | 1,086,729 | 107 | 218 | 218 |
| ID | 14 | 1,323,509 | 143 | 143 | 143 |
| IL | 12 | 1,674,133 | 44 | 140 | 620 |
| IN | 9 | 474,800 | 391 | 391 | 391 |
| KY | 11 | 537,905 | 209 | 209 | 209 |
| LA | 14 | 267,286 | 79 | 98 | 98 |
| MA | 18 | 1,417,498 | 185 | 185 | 832 |
| MD | 15 | 1,050,773 | 83 | 170 | 170 |
| ME | 13 | 205,937 | 189 | 189 | 189 |
| MI | 16 | 1,587,679 | 57 | 118 | – |
| MN | 16 | 744,198 | 309 | 309 | 6,195 |
| MO | 23 | 666,112 | 44 | 130 | – |
| MS | 10 | 274,391 | – | – | – |
| MT | 4 | 149,973 | 5,159 | 5,159 | 5,159 |
| NC | 16 | 1,332,382 | 350 | 350 | 808 |
| NE | 4 | 164,582 | 925 | 925 | 925 |
| NH | 34 | 298,377 | 104 | 104 | 155 |
| NJ | 3 | 958,202 | 4,514 | 4,515 | 4,514 |
| NM | 7 | 247,880 | 1,812 | 1,812 | 1,812 |
| NY | 11 | 752,515 | 56 | 731 | 486,495 |
| OH | 11 | 894,383 | 61 | 334 | – |
| OK | 14 | 304,281 | 649 | 649 | 649 |
| OR | 5 | 618,711 | 111 | 111 | 191 |
| PA | 3 | 1,595,508 | 48 | 167 | 642 |
| PR | 11 | 7,022 | 412 | 412 | 412 |
| RI | 7 | 103,982 | – | – | – |
| SC | 12 | 539,263 | 165 | 165 | 34,546 |
| SD | 2 | 52,661 | 13 | 13 | 216 |
| TN | 16 | 516,250 | 235 | 235 | 1,203 |
| TX | 17 | 2,094,428 | 1,282 | 1,282 | 2,133 |
| UT | 15 | 220,582 | 262 | 262 | 781 |
| VA | 14 | 1,323,509 | 143 | 204 | 1,309 |
| VT | 17 | 158,032 | 289 | 289 | 508 |
| WA | 15 | 1,558,776 | 103 | 127 | 617 |
| WI | 14 | 925,065 | 44 | 144 | 878 |
| WV | 12 | 187,482 | 213 | 213 | 213 |

IRV-based primaries

| State | Candidates | Ballots cast | Level 1 | Level 2 | Level 3 |
|-------|-----------:|-------------:|--------:|--------:|--------:|
| AK | 9 | 19,811 | 88 | 88 | 88 |
| WY | 9 | 15,428 | 66 | 87 | 452 |

divided by the number of available delegates; and the estimated auditing effort (ASN) for the primary. For the first four primaries in the table, the last awarded at-large delegate is the hardest to audit.

The use of IRV for determining candidate viability does not make a Hamiltonian election more difficult to audit. While more assertions are created to audit an IRV-based primary, the difficulty of any audit is based on the cost (ballot samples required) of its most expensive assertion. Since all assertions are tested on each ballot examined, the principal cost is retrieving the correct ballot. The audit specifications generated for the Wyoming and Alaskan primaries contain 78 and 89 assertions, respectively. The number of assertions formed for a plurality-based primary is proportional to the number of candidates. NH, involving the most candidates at 34, has 48 assertions to audit.

Table 2. Hard (top four rows) and relatively easy (bottom three rows) primaries for which to audit the last assigned at-large delegate to each candidate. The number of at-large delegates $D$; the delegate quotas for Biden and Sanders; and the difference between the remainders of their quotas (divided by $D$) are reported, since this corresponds to the tightness of Equation (1).

| State | $D$ | Biden quota | Sanders quota | Rem. diff. / $D$ | ASN |
|-------|----:|------------:|--------------:|-----------------:|----:|
| CA | 90 | 50.688 | 39.312 | 0.004 | $3.2 \times 10^6$ |
| MO | 15 | 9.524 | 5.476 | 0.003 | – |
| NY | 61 | 47.629 | 13.371 | 0.004 | 486,495 |
| SC | 12 | 8.533 | 3.467 | 0.006 | 34,546 |
| ME | 5 | 2.050 | 1.993 | 0.19 | 189 |
| AZ | 14 | 8.010 | 5.990 | 0.07 | 120 |
| OR | 13 | 9.948 | 3.052 | 0.07 | 191 |
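The "Rem. diff. / D" column of Table 2 can be recomputed from the quotas alone, which shows why small values correspond to expensive audits: the column is the slack, in units of vote share, in the remainder comparison behind Equation (1). The following is a small sketch under that reading; the function name `remainder_gap` is hypothetical, and the quotas are the rounded values printed in the table, so results agree with the table only up to rounding.

```python
# Rows of Table 2: (state, at-large delegates D, Biden quota, Sanders quota).
TABLE_2 = [
    ("CA", 90, 50.688, 39.312),
    ("MO", 15, 9.524, 5.476),
    ("NY", 61, 47.629, 13.371),
    ("SC", 12, 8.533, 3.467),
    ("ME", 5, 2.050, 1.993),
    ("AZ", 14, 8.010, 5.990),
    ("OR", 13, 9.948, 3.052),
]


def remainder_gap(quota_a, quota_b, delegates):
    """Absolute difference of the fractional parts of two quotas, divided by D."""
    return abs(quota_a % 1 - quota_b % 1) / delegates


if __name__ == "__main__":
    for state, d, biden, sanders in TABLE_2:
        print(f"{state}: {remainder_gap(biden, sanders, d):.4f}")
```

For California, for example, the remainders 0.688 and 0.312 differ by 0.376, and dividing by $D = 90$ gives a gap of roughly 0.004 of the qualified vote, which is why the last delegate there is so costly to confirm.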

The computational cost of generating these audit specifications is not significant. On a machine with an Intel Xeon Platinum 8176 chip (2.1GHz) and 1TB of RAM, the generation of an audit specification for Wyoming and Alaska takes 0.3s and 0.4s, respectively. The time required to generate an audit for each of the plurality-based primaries in Table 1 ranges from 0.2ms to 0.24s (and 0.03s on average).

We provide an effective method for auditing delegate allocation by proportional representation (the Hamilton method), the first we know of for elections of this kind. Usually the audit only requires examining a small number of ballots. This could be used for primary elections in the USA and other elections in Russia, Ukraine, Tunisia, Taiwan, Namibia and Hong Kong.

We provide a version suitable for Democratic primaries in Alaska, Hawaii, Kansas, and Wyoming, which use a modified form where viability is decided using IRV.

To audit these elections we defined a new assertion for pairwise differences and corresponding assorter, which may be useful for auditing other methods.

References

1. Blom, M., Stuckey, P.J., Teague, V.: Ballot-polling risk limiting audits for IRV elections. In: Krimmer, R., Volkamer, M., Cortier, V., Goré, R., Hapsara, M., Duenas-Cid, U.S.D. (eds.) Proceedings of the E-Vote-ID 2018: Third International Joint Conference on Electronic Voting. LNCS, vol. 11143, pp. 17–34. Springer (2018)
2. Blom, M., Stuckey, P.J., Teague, V.: Computing the margin of victory in preferential parliamentary elections. In: Proceedings of the E-Vote-ID 2018: Third International Joint Conference on Electronic Voting. LNCS, vol. 11143, pp. 1–16. Springer (2018)
3. Blom, M., Teague, V., Stuckey, P.J., Tidhar, R.: Efficient computation of exact IRV margins. In: European Conference on Artificial Intelligence (ECAI). pp. 480–488 (2016)
4. Blom, M.L., Stuckey, P.J., Teague, V.: Risk-limiting audits for IRV elections. CoRR abs/1903.08804 (2019), http://arxiv.org/abs/1903.08804
5. Kroll, J.A., Halderman, J.A., Felten, E.W.: Efficiently auditing multi-level elections. In: Krimmer, R., Volkamer, M. (eds.) Proceedings of Electronic Voting 2014 (EVOTE2014). pp. 93–101. TUT Press (2014)
6. Lindeman, M., Stark, P.: A gentle introduction to risk-limiting audits. IEEE Security and Privacy, 42–49 (2012)
7. Lindeman, M., Stark, P., Yates, V.: BRAVO: Ballot-polling risk-limiting audits to verify outcomes. In: Proceedings of the 2011 Electronic Voting Technology Workshop / Workshop on Trustworthy Elections (EVT/WOTE '11). USENIX (2012)
8. Sarwate, A., Checkoway, S., Shacham, H.: Risk-limiting audits and the margin of victory in nonplurality elections. Politics, and Policy (3), 29–64 (2013)
9. Stark, P.: Conservative statistical post-election audits. Annals of Applied Statistics (2008)
10. Stark, P.: Super-simple simultaneous single-ballot risk-limiting audits. In: Proceedings of the 2010 Electronic Voting Technology Workshop / Workshop on Trustworthy Elections (EVT/WOTE '10). USENIX (2010)
11. Stark, P.B.: Sets of half-average nulls generate risk-limiting audits: Shangrla. In: Bernhard, M., Bracciali, A., Camp, L.J., Matsuo, S., Maurushat, A., Rønne, P.B., Sala, M. (eds.) Financial Cryptography and Data Security. pp. 319–336. Springer International Publishing, Cham (2020)
12. Stark, P.B., Teague, V.: Verifiable European elections: Risk-limiting audits for D'Hondt and its relatives. USENIX Journal of Election Technology and Systems (JETS) 1