User:Swpb/sandbox

=Ideal division of a disambiguation page=
The purpose of a disambiguation page is to let readers find their target article with as little reading as possible. How many sections, then, should a dab page have, and how long should those sections be?

Suppose we have a dab page with a total of t entries, which we can divide into n sections. Section headers average a words in length, and entries average b words in length. We want to find the value of n that minimizes the average number of words a reader has to read.

Questionable assumptions

 * 1) The disambiguation page will be divided into equal-sized sections, with no sub-sections.
 * 2) Readers will first read section headers until they find the one they want, then read entries in that section until they find the one they want.
 * 3) Each entry is equally likely to be the one the reader is looking for. The position of the desired section, and of the desired entry within that section, are random.
 * 4) Section names and entries are clear and unambiguous. Once a reader reads a section name or entry, they know with 100% certainty whether it is what they want or not.

How questionable are these assumptions?

 * 1) This is not a very realistic assumption, but it serves as a workable average, and the effect of different-sized sections on n is not large.
 * 2) This is a good assumption.
 * 3) This is a good assumption.
 * 4) The strength of this assumption depends on how well subject areas are selected, and how well headers and entries are written, but it should be near 100%.

Solve
Given n sections, the average reader will have to read (n+1)/2 headers to find the one they want. They will then have to read ((t/n)+1)/2 entries to find the one they want. Thus, the average number of words that must be read is w = a*((n+1)/2) + b*(((t/n)+1)/2). To find the value of n that minimizes w, we take the derivative of w with respect to n and see where it equals 0.
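As a sanity check on the expression for w, here is a minimal Python sketch. The function names expected_words and enumerated_words are illustrative, not part of the original analysis. The first function evaluates the formula directly. The second averages the reading cost over every equally likely target entry, assuming n divides t evenly so that all sections are the same size.

```python
def expected_words(t, n, a, b):
    """Average words read according to w = a*((n+1)/2) + b*(((t/n)+1)/2)."""
    return a * ((n + 1) / 2) + b * (((t / n) + 1) / 2)


def enumerated_words(t, n, a, b):
    """Average the cost over every possible target entry.

    Assumes equal-sized sections, so n must divide t. A reader whose target
    is entry number `position` in section number `section` reads `section`
    headers and `position` entries.
    """
    if t % n != 0:
        raise ValueError("n must divide t for equal-sized sections")
    per_section = t // n
    total = 0
    for section in range(1, n + 1):
        for position in range(1, per_section + 1):
            total += a * section + b * position
    return total / t
```

With the arbitrary values t = 12, n = 3, a = 2, b = 5, both functions return 16.5, which suggests the closed-form expression matches the reading model when sections are equal in size.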

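The derivative of w with respect to n is a/2 - bt/(2n^2). Setting this expression equal to zero and rearranging, we find n = sqrt(bt/a). The second derivative, bt/n^3, is positive for n > 0, so this value of n is a minimum.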
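The stationary point can also be checked numerically. The sketch below uses hypothetical helper names and compares the closed-form value sqrt(bt/a) with a brute-force search over whole numbers of sections. For values of n that do not divide t, the formula for w is only an approximation, since sections cannot then be exactly equal in size.

```python
import math


def words_read(t, n, a, b):
    """Average words read with n sections: a*((n+1)/2) + b*(((t/n)+1)/2)."""
    return a * ((n + 1) / 2) + b * (((t / n) + 1) / 2)


def optimal_sections(t, a, b):
    """Closed-form minimizer n = sqrt(b*t/a); generally not an integer."""
    return math.sqrt(b * t / a)


def best_integer_sections(t, a, b):
    """Whole number of sections from 1 to t that gives the smallest w."""
    return min(range(1, t + 1), key=lambda n: words_read(t, n, a, b))
```

In practice the closed-form value has to be rounded to a whole number of sections, and best_integer_sections is one way to pick between the two nearest candidates.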

Let's plug in some realistic numbers:
 * Section headers average a = 3 words in length
 * Entries average b = 10 words in length

Now n = sqrt(10/3)*sqrt(t) ~ 1.8*sqrt(t)

Suppose our disambiguation page has t = 30 entries. In that case, n = sqrt((10/3)*30) = sqrt(100) = 10. If we divide the dab page into 10 sections of 3 entries each, the reader will have to read an average of w = 3*(11/2) + 10*(4/2) = 36.5 words.
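To see how sensitive the result is to the choice of n, the following sketch tabulates w for a 30-entry page with a = 3 and b = 10. The function name words_read and the particular values of n tried are illustrative choices.

```python
def words_read(n, t=30, a=3, b=10):
    """Average words read with n sections, using the example values above."""
    return a * ((n + 1) / 2) + b * (((t / n) + 1) / 2)


# Compare a few ways of dividing the page, from no division (n = 1)
# to one section per entry (n = 30).
for n in (1, 3, 5, 10, 15, 30):
    print(f"n = {n:2d}: w = {words_read(n):.1f} words")
```

Under this model an undivided page (n = 1) costs 158 words on average, against 36.5 words for 10 sections, while moving a few sections either side of the optimum changes w only slightly.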