Risk-limiting audit

A risk-limiting audit (RLA) is a post-election tabulation auditing procedure which can limit the risk that the reported outcome in an election contest is incorrect. It generally involves (1) storing voter-verified paper ballots securely until they can be checked, and (2) manually examining a statistical sample of the paper ballots until enough evidence is gathered to meet the risk limit.

Advantages of an RLA include: samples can be small and inexpensive if the margin of victory is large; there are options for the public to watch and verify each step; and errors found in any step lead to corrective actions, including larger samples, up to a 100% hand count if needed. Disadvantages include: if any contest is close, the sample needs to be a large fraction of all ballots to minimize the chance of missing mistakes; and it is hard to check computer totals publicly, except by releasing computer records to the public. If examining sampled ballots shows flaws in ballot storage, the usual approach cannot recover correct results, and researchers recommend a re-vote if the number of ballots held in flawed storage is enough to change winners. An alternative to re-votes is to create and verify backups of the paper ballots soon after they are voted, so that the backups can be used if storage of the original ballots proves flawed.

As with other election audits, the goal is to identify not only intentional alterations of ballots and tallies, but also bugs in election machines, such as software errors, scanners with blocked sensors or scanners skipping some ballots. The approach does not assume that all ballots, contests or machines were handled the same way, in which case spot checks could suffice. The sample sizes are designed to have a high chance of catching even a brief period when a scratch or fleck of paper blocks one sensor of one scanner, or a bug or hack switches votes in one precinct or one contest, if these problems affect enough ballots to change the result.

Comparisons can be done ballot-by-ballot or precinct-by-precinct, though the latter is more expensive.

Categories of audits
There are three general types of risk-limiting audits. Depending on the circumstances of the election and the auditing method, different numbers of ballots need to be hand-checked. For example, in a jurisdiction with 64,000 ballots tabulated in batches of 500 ballots each, an 8% margin of victory, and a risk limit allowing no more than 10% of mistaken outcomes to go undetected, method 1, ballot comparison, needs on average 80 ballots; method 2, ballot polling, needs 700 ballots; and method 3, batch comparison, needs 13,000 ballots (in 26 batches). The methods are usually used to check computer counts, but methods 2 and 3 can also be used to check accuracy when the original results were hand-counted. The steps in each type of risk-limiting audit are:
 * 1) Ballot comparison. Election computers provide their interpretation of each ballot ("cast vote record"); humans check computers' "cast vote records" against stored physical ballots in a random sample of ballots; an independent computer tabulates all "cast vote records" independently of earlier tabulations to get new totals; humans report any differences in interpretations and total tallies.
 * 2) Ballot polling. Humans count a random sample of ballots; humans report any difference between the manual percentage for the sample and the computer percentage for the election.
 * 3) Batch comparison. Election results provide a total for each batch of ballots (e.g. precinct); in a random sample of batches humans hand-count all ballots; for 100% of batches humans check, by manual addition or independent computer, whether the election's initial summation of batches was correct; humans report any difference between original tallies and audit tallies.

All methods require:
 * Procedure to re-count all paper ballots more accurately if errors are detected. This is usually planned as a 100% manual count, but could involve fixing or replacing erroneous computers, doing a new computer count, and auditing that, until an audit shows no problem.
 * Auditing all types of ballots, including military, absentee, provisional, etc.
 * Clarifying which contests were audited and which were not, or auditing all contests or a large enough random sample of contests so the chance of missing erroneous results is acceptably low.
 * Auditing a large enough random sample of ballots so the chance of missing mistakes is acceptably low.
 * Selecting a random sample after initial results are public, because telling hackers in advance which contests and ballots will be in the sample lets them freely hack other contests and ballots.
 * Selecting the random sample before results are final, so errors can be fixed.
 * Doing the manual check immediately when the sample is selected; if insiders have altered computer files, they could use any delay to change sampled ballots to match the erroneous computer files, thus hiding the errors.
 * Having enough security on the ballots during transportation and storage, so neither insiders nor outsiders can change them.
 * Having enough independent participants select different digits of the random number seed, so no one can control the seed and hence the random number series which selects the random sample.
 * Having the public see all steps, including the content of ballots and computer records while officials examine them, to know they are counted accurately.

The last three items are hard in one-party states, where all participants may be swayed by the ruling party.

Hand-checking ballots (method 1) identifies bugs and hacks in how election computers interpret each ballot, so computer processing can be improved for future elections. Hand-counting ballots (methods 2 and 3) bypasses bugs and hacks in computer counts, so it does not identify exactly what mistakes were made. Independently totaling cast vote records (method 1) or batch totals (method 3) identifies bugs and hacks in how election computers calculate totals. Method 2 does not need this independent totaling step, since it has a large enough sample to identify winners directly.

Colorado uses method 1 in most counties, and method 2 in a few counties which use election machines which do not record and store "cast vote records". Colorado uses no audit method in two counties which hand-count ballots in the first place.

A risk-limiting audit is a results audit, which determines whether votes were tabulated accurately, not a process audit, which determines whether good procedures were followed.

Implementation
The process starts by selecting a "risk limit", such as 9% in Colorado, meaning that if there are any erroneous winners in the initial results, the audit will catch at least 91% of them and let up to 9% stay undetected and take office. Another initial step is to decide whether to audit: all contests; a random sample of contests, allowing a known risk that erroneous winners will take office; or a non-random sample, so no statistical confidence is available on the non-audited contests. Based on a formula, a sample size is determined for each contest being audited. The size of the sample depends primarily on the margin of victory in the targeted contest.
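As a rough illustration of how the sample size depends on the risk limit and the margin, the sketch below uses two approximations that are commonly cited for risk-limiting audits but are not given in this article. For ballot comparison it assumes that no discrepancies are found and uses an inflation constant of 1.03905. For ballot polling it assumes a two-candidate contest with no invalid votes and uses a Wald-style expected sample size. The function names are hypothetical, and real audit tools add allowances for discrepancies, so their figures need not match these.

```python
import math

GAMMA = 1.03905  # assumed inflation constant; not specified in the article


def comparison_sample_size(risk_limit, margin):
    """Approximate ballots to check in a ballot-comparison audit,
    assuming the sample turns up no discrepancies."""
    return math.ceil(2 * GAMMA * math.log(1 / risk_limit) / margin)


def polling_sample_size(risk_limit, margin):
    """Approximate expected ballots for a ballot-polling audit of a
    two-candidate contest with no invalid votes (Wald approximation)."""
    winner_share = 0.5 + margin / 2
    # Average growth of the log test statistic per sampled ballot.
    drift = (winner_share * math.log(2 * winner_share)
             + (1 - winner_share) * math.log(2 * (1 - winner_share)))
    return math.ceil(math.log(1 / risk_limit) / drift)
```

Calling both functions with a risk limit of 0.10 and margins of 0.08, 0.02 and 0.001 shows the pattern described here: the comparison sample grows roughly in proportion to 1/margin, while the polling sample grows roughly in proportion to 1/margin squared.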

A random starting point (seed) is chosen by combining information from multiple independent people, to create a series of random numbers identifying specific ballots to pull from storage, such as the 23rd, 189th, 338th, 480th ballots in precinct 1, and other random numbers in other precincts.
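The seed-to-sample step can be sketched as follows. This is a minimal illustration, not the procedure any particular state uses: the function names (`combine_seed`, `locate`, `draw_sample`), the manifest format, and the use of SHA-256 to stretch the seed into a series of numbers are all assumptions made for the example.

```python
import hashlib


def combine_seed(contributions):
    """Join the digits chosen by each independent participant into one seed."""
    return "".join(contributions)


def locate(index, manifest):
    """Translate a 0-based overall ballot index into (precinct, 1-based position)."""
    for precinct, count in manifest.items():
        if index < count:
            return precinct, index + 1
        index -= count
    raise IndexError("index outside manifest")


def draw_sample(seed, manifest, sample_size):
    """Return sample_size distinct (precinct, position) pairs.

    manifest maps each precinct name to the number of ballots stored for it.
    The same seed and manifest always give the same sample, so observers
    can re-run the selection themselves.
    """
    total = sum(manifest.values())
    if sample_size > total:
        raise ValueError("sample larger than number of ballots")
    chosen, picks, counter = set(), [], 0
    while len(picks) < sample_size:
        counter += 1
        digest = hashlib.sha256(f"{seed},{counter}".encode()).hexdigest()
        # A 256-bit hash makes the bias from the modulo negligible.
        index = int(digest, 16) % total
        if index in chosen:
            continue  # skip repeats so each ballot is pulled at most once
        chosen.add(index)
        picks.append(locate(index, manifest))
    return picks
```

For example, `draw_sample(combine_seed(["4", "1", "7", "2"]), {"precinct 1": 500, "precinct 2": 300}, 10)` returns ten ballot positions spread across the two precincts. Because no single participant controls every digit, none of them can steer which ballots are drawn.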

When storage is opened, records are checked to see whether each sampled precinct still has the same number of ballots recorded during the election, whether correct numbers appear on seals, whether machines or containers have been tampered with in any way, and/or whether other evidence shows that ballots have avoided intrusion. If ballots have not been stored successfully, advocates of risk-limiting audits say that there should be a re-vote; that no result should be declared, which usually requires a re-vote; or that results can be declared if "the number of questionable or missing audit records is small enough that they cannot alter the outcome of the contest." However, if storage or records are flawed, laws may require initial results to be accepted without audit.

To provide an alternative to a re-vote, seven Florida counties back up the paper ballots by copying them the day after they are voted, with machines independent of election machines. While any copy can have flaws, comparing cast vote records to these independent backup copies would give an alternative to re-voting or skipping the audit when storage is not trustworthy. Florida does not hand-check this backup, as a risk-limiting audit would require. Instead, Florida machine-audits 100% of votes and contests. These audits have found discrepancies of 1-2 ballots from official machines.

Maryland has a less safe alternate approach. Maryland's election machines, like most election machines, create and store ballot images during the election, separate from the cast vote records. Maryland compares cast vote records to these ballot images from the same election machines. Unlike Florida's, this approach is not an independent backup or check: a hack or bug in the election machine can alter, skip, or double-count both image and cast vote record simultaneously. Maryland's semi-independent checking is still better than no checking, since it has found and resolved discrepancies, such as folded ballots leaving fold lines on the images, which computers interpreted as write-in votes; sensor flaws which left lines on the images, interpreted as overvotes; and double-feeds where two ballots overlap in the scanner, and one is uncounted.

When an audit produces the same result as initial election results, the outcome is confirmed, subject to the risk limit, and the audit is complete. If the audit sample shows enough discrepancies to call the outcome into question, a larger sample is selected and counted. This process can continue until the sample confirms the original winner, or a different winner is determined by hand-counting all ballots.
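The stop-or-expand decision can be illustrated for ballot polling with a sequential test. The sketch below follows the style of the BRAVO ballot-polling method for a two-candidate contest; the article does not name a specific test, so the choice of test, the function name and the "W"/"L" encoding are assumptions made for the example.

```python
def ballot_polling_test(votes, reported_winner_share, risk_limit):
    """Run a two-candidate sequential test over sampled ballots.

    votes: sequence of "W" (vote for the reported winner) or "L" (vote for
    the reported loser), in the order the ballots were drawn. Any other
    value (for example an undervote) leaves the statistic unchanged.
    Returns ("confirmed", n) once the evidence meets the risk limit after
    n ballots, or ("escalate", n) if the sample runs out first.
    """
    if not 0.5 < reported_winner_share < 1:
        raise ValueError("reported winner share must be between 0.5 and 1")
    ratio = 1.0  # evidence for the reported share versus a tie
    n = 0
    for n, vote in enumerate(votes, start=1):
        if vote == "W":
            ratio *= reported_winner_share / 0.5
        elif vote == "L":
            ratio *= (1 - reported_winner_share) / 0.5
        if ratio >= 1 / risk_limit:
            return "confirmed", n
    return "escalate", n
```

An "escalate" result corresponds to drawing a larger sample and continuing. In practice the process ends either with confirmation or with a full hand count.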

Sample size
Sample sizes rise rapidly for narrow margins of victory, with all methods. In a small city or county, with 4,000 ballots, method 1, ballot comparison, would need 300 ballots (300–600 minutes, as discussed in Cost below) for a contest with a 2% margin of victory. It would need 3,000 ballots (50-100 staff hours in the city or county) for a 0.1% margin of victory. Method 2 or 3, ballot polling or batch comparison, would need a full hand count of the 4,000 ballots (70-130 staff hours). Margins under 0.1% occur in one in sixty to one in 460 contests.

Large numbers of contests on a ballot raise the chances that these small margins and large samples will occur in a jurisdiction, which is why no place does risk-limiting audits on all contests, leaving most local government races unaudited, though millions of dollars are at stake in local spending and land use decisions. Colorado picks contests with wider margins to avoid large samples. California's rules for 2019–2021 require any RLA to audit all contests, and no election offices have chosen to use RLAs under these rules.

The power of the sample also depends on staff expanding the audit after any discrepancy, rather than dismissing it as a clerical error, or re-scanning problematic ballots to fix just them.

When Maryland evaluated audit methods, it noted that local boards of elections could not budget, or plan staffing, for risk-limiting audits, since the sample "is highly dependent on the margin of victory in any given audited contest... A very close margin of victory could... require days of staff work, possibly compromising the local certification deadline."

An alternative to large samples is to audit an affordable sample size, and let the risk limit vary instead of the sample size. For a fixed sample, closer margins of victory would have more risk of letting erroneous winners take office, but any substantial sample would still have a known substantial chance of catching errors. Election managers would announce the level of confidence provided by the sample, and would have procedures to follow up if the sample finds one or more errors.
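To make the trade-off concrete, the sketch below estimates the risk limit that a fixed ballot-comparison sample could support at different margins. It rests on assumptions not stated in this article: the sample is clean (no discrepancies), and the risk falls off approximately as exp(-n·m/(2γ)) with γ = 1.03905. The function name is hypothetical.

```python
import math

GAMMA = 1.03905  # assumed inflation constant; not specified in the article


def achieved_risk_limit(sample_size, margin):
    """Approximate risk limit supported by a clean ballot-comparison
    sample of the given size, for a contest with the given margin."""
    return min(1.0, math.exp(-sample_size * margin / (2 * GAMMA)))


if __name__ == "__main__":
    for margin in (0.08, 0.02, 0.001):
        print(f"margin {margin:.3f}: risk limit about "
              f"{achieved_risk_limit(200, margin):.3f}")
```

Under these assumptions a fixed sample of 200 ballots supports a very small risk limit at an 8% margin, a moderate one at 2%, and almost no assurance at 0.1%, which is the pattern described above.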

The sample sizes presented will be enough to confirm a result, subject to the risk limit, when the apparent winner is the actual winner. If the sample does not confirm the win, more ballots are sampled, up to a 100% hand count to confirm a different winner.

Ballot transport and storage
Ballots are at risk when being transported from drop boxes and polling places to central locations, and may be protected by GPS tracking, guards, security systems, and/or a convoy of the public.

No US state has adequate laws on physical security of the ballots. Security recommendations for elections include: starting audits as soon as possible after the election, regulating access to ballots and equipment, having risks identified by people other than those who design or manage the storage, and using background checks and tamper-evident seals. However, seals on plastic surfaces can typically be removed and reapplied without damage.

Experienced testers can usually bypass all physical security systems. Security equipment is vulnerable before and after delivery. Insider threats and the difficulty of following all security procedures are usually under-appreciated, and most organizations do not want to learn their vulnerabilities.

Method 1 requires the ballots to be kept in strict order so one can compare the computer interpretations of sampled ballots with those exact physical ballots. If the correct ballots are present, but out of order, method 2 can be used. Maryland, like other states, randomizes the order of paper ballots and cast vote records to protect ballot secrecy, so method 1 cannot be done there, since paper ballots and cast vote records cannot be compared.

Public monitoring
All the methods, when done for a state-wide election, involve manual work throughout the state, wherever ballots are stored, so the public and candidates need observers at every location to be sure procedures are followed. However, in Colorado and most states the law does not require any of the audit work to be done in public.

Software dependence
All methods are designed to be independent of the election software, to ensure that an undetected error in the election software can be found by the audit. The audit in practice is dependent on its own software, separate from the election system. Election staff examine ballots and enter staff interpretations into an online software tool, which is supposed to handle the comparison to the voting system interpretation, report discrepancies, and tell staff whether to sample further. It is also hard to prepare the list of ballots to sample from (ballot manifest) without using information from the election system.

Independent totals
Method 1, ballot comparison, requires a second step, besides checking the sample of ballots: 100% of the computer interpretations of ballots ("cast vote records") need to be re-tabulated by computers independent of the original election computers. This re-tabulation checks whether election computers tallied the cast vote records correctly. Like any computer step this independent tally is subject to hacks and bugs, especially when voting rules are complex, such as variations in the number of candidates from different districts to vote for. The reason for the re-tabulation step is that independently programming a different kind of machine provides an independent check on official election machines.
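A minimal sketch of such a re-tabulation is shown below. The record format (one dictionary per ballot, mapping contest to the chosen candidate) and the function names are hypothetical, and the sketch deliberately ignores the complex cases mentioned above, such as contests where voters may choose more than one candidate.

```python
from collections import Counter


def tally(cast_vote_records):
    """Re-total cast vote records; each record maps contest -> chosen candidate."""
    totals = Counter()
    for record in cast_vote_records:
        for contest, choice in record.items():
            totals[(contest, choice)] += 1
    return totals


def differences(reported_totals, cast_vote_records):
    """Return {(contest, candidate): (reported, recomputed)} wherever the
    election system's reported total disagrees with the independent tally."""
    recomputed = tally(cast_vote_records)
    keys = set(reported_totals) | set(recomputed)
    return {
        key: (reported_totals.get(key, 0), recomputed.get(key, 0))
        for key in keys
        if reported_totals.get(key, 0) != recomputed.get(key, 0)
    }
```

An empty result from `differences` means the independent tally reproduced the reported totals. Any entry in the result is a discrepancy for the audit to report.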

While all methods require physical security on the paper ballots, method 1 also requires enough security on the cast vote records so no one can change them. This can be accomplished by computer-calculating, storing and comparing a hash code for each file of cast vote records: (a) right after the election, (b) when independent tabulation is done, and (c) when ballot comparison is done.
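The hash check can be sketched as follows, assuming the cast vote records are stored as ordinary files and that SHA-256 is an acceptable hash. Both points, along with the function names, are assumptions for illustration rather than details given in this article.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unchanged(path, recorded_hash):
    """True if the file still matches the hash recorded at an earlier step."""
    return file_sha256(path) == recorded_hash
```

The hash recorded right after the election, at step (a), would be published or stored separately. Calling `unchanged` at steps (b) and (c) then reveals any alteration of the file in between.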

Colorado says it has a system to do the independent count of cast vote records, but it is not yet publicly documented, so the chance of bugs or hacks affecting this independent computer at the Secretary of State's office along with one or more of the election machines is unknown.

California's process for risk-limiting audits omits the step of independent totals. When it did a pilot, independent totals were calculated by a student on a university computer.

Cost
Cost depends on pay levels and staff time needed, recognizing that staff generally work in teams of two or three (one to read and one or two to record votes). Teams of four, with two to read and two to record, are more secure and would increase costs.

Each minute per vote checked means 25 cents per vote at $15/hour, or $250 per thousand votes. Checking random ballots can take more time, because individual ballots must be pulled from boxes and returned to the same spot. This extra handling applies to methods 1 and 2.
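The arithmetic can be written out as a short calculation. The function name and the `team_size` parameter are assumptions added for illustration: the figures above are read here as the cost of one staff member's time, so a team multiplies the cost accordingly.

```python
def audit_labor_cost(votes_checked, minutes_per_vote, hourly_wage=15.0, team_size=1):
    """Labor cost in dollars for checking votes by hand.

    minutes_per_vote is elapsed time per vote; team_size is the number of
    staff paid for that time (an assumption, not a figure from the article).
    """
    staff_minutes = votes_checked * minutes_per_vote * team_size
    return staff_minutes * hourly_wage / 60


if __name__ == "__main__":
    print(audit_labor_cost(1, 1))       # one vote, one minute, one person
    print(audit_labor_cost(1000, 1))    # a thousand votes
    print(audit_labor_cost(1000, 2, team_size=3))
```

With the defaults, one minute per vote works out to $0.25 per vote and $250 per thousand votes, matching the figures above.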

State variations
As of early 2017, about half the states require some form of results audit. Typically, these states prescribe audits that check only a small flat percentage, such as 1%, of voting machines. As a result, few jurisdictions have samples large or timely enough to detect and correct tabulation errors before election results are declared final.

In 2017, Colorado became the first state to implement ballot comparison audits, auditing one contest, not randomly chosen, in each of 50 of its 64 counties, several days after the election. Following the 2018 General Election, Colorado will conduct audits in the 62 of its 64 counties that use automated vote counting equipment (the two remaining counties hand count the ballots).

Rhode Island passed legislation requiring that state's Board of Elections to implement risk-limiting audits beginning in 2018. Individual jurisdictions elsewhere may be using the method on the local election clerks' initiative.

Endorsements
In 2018 the American Statistical Association, Brennan Center for Justice, Common Cause, Public Citizen and several election integrity groups endorsed all three methods of risk-limiting audits. Their first five criteria are:
 * 1) EXAMINATION OF VOTER-VERIFIABLE PAPER BALLOTS: Audits require human examination of voter-marked paper ballots –  the ground truth of the election. Voter-marked paper ballots may be marked by hand or by ballot marking device. Audits cannot rely on scanned images or machine interpretations of the ballots to accurately reflect voter intent.
 * 2) TRANSPARENCY: Elections belong to the public. The public must be able to observe the audit and verify that it has been conducted correctly, without interfering with the process.
 * 3) SEPARATION OF RESPONSIBILITIES: Neither the policy and regulation setting for the audit, nor the authority to judge whether an audit has satisfied those regulations, shall be solely in the hands of any entity directly involved with the tabulation of the ballots or the examination of ballots during the audit.
 * 4) BALLOT PROTECTION: All the ballots being tabulated and audited must be verifiably protected from loss, substitution, alteration or addition.
 * 5) COMPREHENSIVENESS: All jurisdictions and all validly cast ballots, including absentee, mail-in and accepted provisional ballots, must be taken into account. No contest should be excluded a priori from auditing, although some contests may be prioritized.

In 2014, the Presidential Commission on Election Administration recommended the methods in broad terms:
 * "Commission endorses both risk-limiting audits that ensure the correct winner has been determined according to a sample of votes cast, and performance audits that evaluate whether the voting technology performs as promised and expected."

By selecting samples of varying sizes dictated by statistical risk, risk-limiting audits eliminate the need to count all the ballots to obtain a rapid test of the outcome (that is, who won?), while providing some level of statistical confidence.

In 2011, the federal Election Assistance Commission initiated grants for pilot projects to test and demonstrate the method in actual elections.

Professor Philip Stark of the University of California at Berkeley has posted tools for the conduct of risk-limiting audits on the university's website.