User:Gouvernathor/Looking for the best apportionment

fr:Utilisateur:Gouvernathor/À_la_recherche_de_la_meilleure_répartition

What is the best apportionment?

Apportionment methods
For those who don't know, an apportionment is a way to fairly distribute a fixed, small number of tokens among several entities, based upon a score attributed to each entity. Generally, the tokens are seats in a governing body, and the score is the number of votes received in an election: the apportionment is the step where each party gets a number of seats after a list-based election. It is also used with the population head count as the score, to distribute seats among the districts of a federal state.

Usually, apportionment revolves around the notion of proportionality, so that the number of tokens is proportional to the score, but several factors can, sometimes intentionally, bias or hamper proportionality. The American Electoral College, for example, is intentionally biased: each state is given at least 3 electors, and the number of electors follows (more or less) an affine function of the population count, namely a proportional value + 2.

In this article, the goal will be to try and find the best, the fairest, the most proportional apportionment method.

Proportional apportionment
A house of parliament needs to be filled with representatives. An election took place, giving votes to a series of parties. It would seem fair to allocate the seats such that the percentage of the seats in the House, for a given party, matches (as closely as possible) the percentage of votes it received. By the way, the percentage of the votes received by party A (over all counted votes for all parties) will be written $$t_A$$.

However, you can't divide seats, so at some point you will need to use approximations.

There is a theoretical, fractional number of seats for each party, which is its percentage of the votes multiplied by the total number of seats. This number will be called the fair number. For a party receiving one third of all the votes and a House of 200 seats, that number is 66.666... seats; for a House of 100 seats, the fair number is 33.333... For party A, it will be written $$q_A$$. Since it is generally not an integer, it's usually unreachable. The whole difficulty lies in how to fairly round that fair number for each party. The number of seats finally allotted will be written $$a_A$$, and the total number of seats, $$h$$. So, $$q_A = t_A \cdot h$$.
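As a minimal sketch of that formula (the function name `fair_numbers` and the vote counts are made up for illustration), the fair number can be computed with exact fractions so that no rounding sneaks in early:

```python
from fractions import Fraction

def fair_numbers(votes, house_size):
    """Return q_A = t_A * h for each party, as exact fractions."""
    total = sum(votes.values())
    return {party: Fraction(v, total) * house_size for party, v in votes.items()}

# Hypothetical election: three parties with one third of the votes each.
votes = {"A": 1000, "B": 1000, "C": 1000}
for h in (100, 200):
    q = fair_numbers(votes, h)
    print(h, {party: float(x) for party, x in q.items()})
```

Whatever the votes, the fair numbers always sum to exactly $$h$$; it is only the rounding that breaks things.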

One initial idea could be to round each fair value to the closest integer, but it only works really well when you have two parties to seat, otherwise it glitches. Take the previous example with a House of 100 seats, and three parties receiving (more or less) one-third of the votes. Each party receives 33 seats... and one seat remains unallocated. More problematic, for the same votes and 200 seats total, each party receives 67 seats... and we end up with 201 seats. Ouch.
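The two failures above can be reproduced with a short sketch. The helper name `round_each` is hypothetical, and it rounds halves up explicitly because Python's built-in `round` rounds halves to the nearest even integer:

```python
from fractions import Fraction
from math import floor

def round_each(votes, house_size):
    """Round every fair number to the closest integer, independently."""
    total = sum(votes.values())
    return {
        party: floor(Fraction(v, total) * house_size + Fraction(1, 2))
        for party, v in votes.items()
    }

votes = {"A": 1, "B": 1, "C": 1}  # three equal parties
for h in (100, 200):
    seats = round_each(votes, h)
    print(h, seats, "total:", sum(seats.values()))
# 100 seats requested -> 99 allotted; 200 requested -> 201 allotted
```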

A slightly less constraining idea would be to at least require that the number of seats always lie between the fair value rounded down and the fair value rounded up. That's called the quota rule.

The Quota rule
The method which was built to implement that rule is called the largest remainder, or the Hare or Hamilton method, and it has the feature of being one of the simplest to calculate by hand in real conditions - which is an undeniable democratic advantage, you have to admit.

The Quota rule is particularly intuitive, but I will explain how it actually is not fair at all, and really doesn't make much sense math-wise. To depict that more clearly, let's take the example of taxes.
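For readers who prefer code to prose, here is a rough sketch of the largest-remainder method: every party gets its fair number rounded down, and the leftover seats go to the largest fractional parts. The function name, the vote counts and the tie-breaking (ties fall to whichever party is listed first) are my own assumptions, not part of any official rule.

```python
from fractions import Fraction
from math import floor

def largest_remainder(votes, house_size):
    total = sum(votes.values())
    # Fair numbers q_A, kept exact.
    quotas = {p: Fraction(v, total) * house_size for p, v in votes.items()}
    # Step 1: everybody gets the fair number rounded down.
    seats = {p: floor(q) for p, q in quotas.items()}
    # Step 2: remaining seats go to the largest remainders.
    leftover = house_size - sum(seats.values())
    by_remainder = sorted(votes, key=lambda p: quotas[p] - seats[p], reverse=True)
    for p in by_remainder[:leftover]:
        seats[p] += 1
    return seats

# Illustrative numbers only.
votes = {"A": 47000, "B": 16000, "C": 15800, "D": 12000, "E": 6100, "F": 3100}
print(largest_remainder(votes, 10))
# {'A': 5, 'B': 2, 'C': 1, 'D': 1, 'E': 1, 'F': 0}
```

Every party ends up with its fair number rounded either down or up, which is exactly what the quota rule asks for.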

Let's assume that whatever formula is chosen to determine the amount you owe the country, there will be some error, within a certain margin. Some people owe the state millions, some owe thousands, some owe nothing at all. Would it seem fair if the margin of error were ±$100, or ±$1,000, i.e. based upon a fixed monetary value? No, because when you owe $5,000, a +$1,000 error is much more serious and unfair than the same +$1,000 error for someone owing $50,000. It's much fairer to have a percentage margin of error, say ±1% of the total of what you owe. That way, for someone owing around $5,000 the error is ±$50, and for someone owing $50,000 it's ±$500. That's still not ideal (the best would be no margin of error at all), but at least it's spread more fairly between bigger and smaller payers.

It's the same idea for apportionment. The Quota rule implies that the rounding error will be at worst ±1 seat, but 1 seat doesn't represent the same thing for bigger and for smaller parties. This rule is therefore particularly unfair in the way it settles roundings, and it's often described as settling them "randomly" - which is not strictly true, but in a way, sort of is.

A symptom of this quasi-randomness is that the largest-remainder method, the best-known way of implementing the Quota rule, breaks another rule, called House monotonicity. This other rule states that when keeping the same number of votes for each party and adding one seat to the total, no party should lose a seat. Yet that frequently happens with the largest-remainder method, and it's called the Alabama paradox.

The big family of methods freeing themselves from the Quota rule in order to respect house monotonicity are the highest averages methods, with numerous variants, of which d'Hondt/Jefferson, Webster/Sainte-Laguë and Adams are the three best-known.
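The Alabama paradox is easy to trigger with small numbers. The sketch below uses a compact largest-remainder implementation (names and tie-breaking are my own choices) and a made-up three-party example in which party C loses a seat when the House grows from 10 to 11 seats:

```python
from fractions import Fraction
from math import floor

def largest_remainder(votes, house_size):
    total = sum(votes.values())
    quotas = {p: Fraction(v, total) * house_size for p, v in votes.items()}
    seats = {p: floor(q) for p, q in quotas.items()}
    leftover = house_size - sum(seats.values())
    # Leftover seats go to the largest fractional parts.
    for p in sorted(votes, key=lambda p: quotas[p] - seats[p], reverse=True)[:leftover]:
        seats[p] += 1
    return seats

votes = {"A": 6, "B": 6, "C": 2}  # same votes in both runs
print(largest_remainder(votes, 10))  # {'A': 4, 'B': 4, 'C': 2}
print(largest_remainder(votes, 11))  # {'A': 5, 'B': 5, 'C': 1}
```

With 10 seats, C has the largest remainder (3/7 against 2/7) and gets the last seat; with 11 seats, its remainder (4/7) is now the smallest, so both extra seats go to A and B.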

The trio at the top
I'm going to quickly describe these three methods, because it's interesting, before showing how all three are bad. So, if you only want the best apportionment, you can skip this section which is a bit more hairy math-wise.

These three methods use (in the way I chose to explain them, at least) a metric sometimes called the "advantage ratio", which is the percentage of the seats controlled by a party ($$a_A / h$$), divided by the percentage of the votes it received ($$t_A$$). The further it is above 1, the more "extraneous" seats the party received compared to its fair number. The advantage ratio is in fact equal to $$a_A / q_A$$.
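A small sketch of the advantage ratio, computed exactly as defined above (seat share divided by vote share). The function name and the numbers are hypothetical, and the sketch assumes every party received at least one vote:

```python
from fractions import Fraction

def advantage_ratios(votes, seats):
    """Return (a_A / h) / t_A for each party, i.e. a_A / q_A."""
    total_votes = sum(votes.values())
    house_size = sum(seats.values())
    return {
        p: Fraction(seats[p], house_size) / Fraction(votes[p], total_votes)
        for p in votes
    }

votes = {"A": 60, "B": 30, "C": 10}
seats = {"A": 7, "B": 2, "C": 1}
for party, ratio in advantage_ratios(votes, seats).items():
    print(party, ratio)
# A 7/6  (over-represented)
# B 2/3  (under-represented)
# C 1    (exactly its fair number)
```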

The d'Hondt method gives each party the fair number of seats rounded down, then allocates the remaining seats such that the largest advantage ratio among the parties is as small as possible. It results in a big advantage for larger parties: in fact, it was demonstrated that among the divisor methods (a math category of methods which these three are a part of, I won't go into further details), this one favors larger parties the most.

The Adams method sort of does the opposite: it gives each party the fair number rounded up, then takes seats away such that the smallest advantage ratio among the parties is as large as possible. An interesting side-effect is that when a party receives even a single vote, its value rounded up is always at least one seat... this method is therefore usually combined with a vote threshold or a math shenanigan for the first seat, which automatically moves it further away from the idea of proportionality.

The Webster method is the fairest : each party will be given its fair number, rounded... to the closest integer. Then, by adding or removing seats depending on whether the rounding made us overshoot the target number of seats ($$h$$) or not, the mean of each party's integer ratio will be minimized or maximized (respectively).
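Here is a hedged sketch of the three methods. It does not follow the round-then-adjust description I used above, but the usual sequential formulation of highest-averages methods, which gives the same results: seats are handed out one at a time to the party with the largest quotient votes / divisor, where the divisor depends on the seats already held (a+1 for d'Hondt, a+1/2 for Webster, a for Adams). The names, the votes and the tie-breaking (first listed party wins) are my own assumptions.

```python
from fractions import Fraction

DIVISORS = {
    "dhondt": lambda a: a + 1,
    "webster": lambda a: a + Fraction(1, 2),
    "adams": lambda a: a,  # divisor 0 for the first seat
}

def highest_averages(votes, house_size, method):
    divisor = DIVISORS[method]
    seats = {p: 0 for p in votes}

    def quotient(p):
        d = divisor(seats[p])
        # Adams: a party without any seat has an "infinite" quotient,
        # encoded here as a tuple that outranks every finite quotient.
        return (1, 0) if d == 0 else (0, Fraction(votes[p]) / d)

    for _ in range(house_size):
        winner = max(votes, key=quotient)
        seats[winner] += 1
    return seats

votes = {"A": 5300, "B": 2500, "C": 1400, "D": 800}  # illustrative only
for method in DIVISORS:
    print(method, highest_averages(votes, 10, method))
# dhondt  {'A': 6, 'B': 3, 'C': 1, 'D': 0}
# webster {'A': 5, 'B': 3, 'C': 1, 'D': 1}
# adams   {'A': 5, 'B': 2, 'C': 2, 'D': 1}
```

On this example the fair numbers are 5.3, 2.5, 1.4 and 0.8: d'Hondt pushes the largest party above its fair number rounded up, while Adams hands a seat to every party first.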

The built-in issue
The issue with these three methods, along with most if not all known highest-averages methods, is precisely that advantage ratio formula, which in a way perpetuates the same underlying issue as the Quota rule. The rule is not dumb enough anymore to say "no more than one missing or extraneous seat", but it still has a built-in error, inherited from the fact that simply comparing the percentage of controlled seats is not fair.

For party A with $$q_A = 49.999$$ fair theoretical seats and receiving $$a_A = 49$$ seats, and party B with $$q_B = 2.8$$ fair theoretical seats and receiving $$a_B = 2$$ seats, the rounding is more serious in the case of B than in the case of A. A lost $$q_A - a_A = 0.999$$ seat, but that is only about $$1/50 = 2\%$$ of its theoretical fair share, whereas the $$q_B - a_B = 0.8$$ seat lost by B represents $$2/7 \approx 28.57\%$$ of its theoretical fair share, which is considerably worse.

In this example as in the taxes example above, I showed smaller entities being bullied by the calculation method, but it goes the other way around as well: if the parties were allocated more than their fair share, the calculation would allow for a much larger overshoot for small parties than for bigger ones, which would be unfair for the bigger ones this time. In the end, it's not that the method favors bigger or smaller parties, it's that it rounds smaller parties in a much clumsier way than it does the bigger ones - be it by rounding up or down - which is unfair for everyone.
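The arithmetic of this example can be checked in a couple of lines (the helper name `relative_loss` is made up for the occasion):

```python
def relative_loss(fair, allotted):
    """Share of its fair number that a party lost to rounding."""
    return (fair - allotted) / fair

print(f"A: {relative_loss(49.999, 49):.2%}")  # A: 2.00%
print(f"B: {relative_loss(2.8, 2):.2%}")      # B: 28.57%
```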

The fairest method would therefore be to optimize not on the percentage of seats in the assembly, but on the percentage of seats you got over your fair number.

Proportionality in the error
Let's introduce another metric: the raw error is the number of seats you get, minus your fair theoretical number of seats. It will be written $$e_A = a_A - q_A = a_A - t_A \cdot h$$. If negative, it represents the number of seats you were owed but lack; if positive, the number of seats you have in excess of your theoretical entitlement. Up to a factor of $$h$$, which is fixed, it is proportional to (and therefore equivalent to optimize as) $$e'_A = a_A/h - t_A$$, which is the percentage of the seats allocated to party A minus the percentage of the votes it got.

The raw error is roughly akin to the advantage ratio, in that it's still expressed as a number of seats, but as we saw, one seat or even one tenth of a seat does not mean the same thing to a party entitled to 150 as it does to a party entitled to 20.

Even when taking the second version of the raw error I expressed above ($$e'_A = a_A/h - t_A$$), it's still expressed as a percentage of the whole number of seats, and by the same reasoning, 1% of the whole assembly does not mean the same thing to someone controlling 45% of it as to someone controlling 5%. To return to the taxes example, expressing the margin of error as a number of dollars or as a percentage of the total number of dollars in circulation is exactly the same thing, as opposed to expressing it as a percentage of what you owe / are entitled to.

So, what would actually be fair is to take something proportional to that: to $$q_A$$, the number of seats a party is entitled to (or to $$t_A$$, the percentage equivalent). So, let's make a new metric, the proportional error, written $$p_A = e_A / q_A = (a_A - q_A) / q_A = (a_A - t_A \cdot h) / (t_A \cdot h) = (a_A/h - t_A) / t_A = e'_A / t_A$$ (the second version using $$e'_A$$ is usually faster to compute when implementing it). The rationale of this proportional error is this: the more seats you are entitled to, the less it matters to give you one more or take one away from you.
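As a sketch of these definitions (function name and numbers are hypothetical, and every party is assumed to have at least one vote), the raw and proportional errors of a given seat allocation can be computed like this, using both expressions of $$p_A$$ to check that they agree:

```python
from fractions import Fraction

def errors(votes, seats):
    """Return (e_A, p_A) for each party: raw and proportional error."""
    total_votes = sum(votes.values())
    h = sum(seats.values())
    result = {}
    for party, v in votes.items():
        t = Fraction(v, total_votes)             # t_A
        q = t * h                                # q_A = t_A * h
        e = seats[party] - q                     # e_A = a_A - q_A
        e_prime = Fraction(seats[party], h) - t  # e'_A = a_A/h - t_A
        p = e / q                                # p_A = e_A / q_A
        assert p == e_prime / t                  # both forms coincide
        result[party] = (e, p)
    return result

votes = {"A": 5300, "B": 2500, "C": 1400, "D": 800}
seats = {"A": 5, "B": 3, "C": 1, "D": 1}
for party, (e, p) in errors(votes, seats).items():
    print(party, float(e), f"{float(p):+.1%}")
```

In this made-up allocation, A and C have close raw errors (-0.3 and -0.4 seat), but the proportional error shows C losing about 29% of its entitlement against less than 6% for A.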

Optimizing this value for all parties - either the mean value across all parties or the sum across all parties, the only difference is the number of parties which is a constant after the vote takes place - is the way forward if you want the optimal, fairest representation of the will the people expressed through their vote.