Talk:Howland will forgery trial

Original Research
I have added the original research template to this page because the section on a modern Bayesian perspective is phrased so that it sounds as if it is original research. If it is not, it should specifically mention which work the analysis is from. Starlord 13:28, 21 November 2006 (UTC)

I didn't think of it as such when I wrote it, as it's a straightforward application of Bayesian principles, but I suspect you're right. Can we move it to this talk page - I still think it's of interest. Blaise 20:45, 1 December 2006 (UTC)

A modern Bayesian analysis
In attempting to understand Peirce's argument more deeply it is helpful to try to replicate it using a modern statistical analysis. It is interesting to consider how his figure of "1 divided by 2,666,000,000,000,000,000,000" or :$$3.75 \times 10^{-22}$$ was arrived at. One possibility is that it is the joint probability of 30 independent events (the downstroke matches), each of which has probability 1 in 5 (taken from the proportion found in the sample). However, 1/5 to the 30th power is not :$$3.75 \times 10^{-22}$$ but :$$1.07 \times 10^{-21}$$ or 1 in :$$9.31 \times 10^{20}$$, a discrepancy of nearly a factor of three, which seems to be a substantial error. If Peirce simply used the argument above, raising 1/5 to the 30th power (which seems unlikely), then it is an approximate calculation of a Bayes factor, with the approximation being made that the proportion of downstroke matches in a collection of true signatures is exactly 1 in 5. However, this proportion was found from a sample of 42 signatures and is thus subject to some sampling error. A modern Bayesian analysis will take this uncertainty into account, yielding a slightly different answer.
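The arithmetic in the paragraph above can be checked directly. The following is a minimal Python sketch; the variable names are illustrative only and do not come from any source.

```python
from fractions import Fraction

# Probability of 30 independent downstroke matches, each with probability 1/5.
p_match = Fraction(1, 5)
n_downstrokes = 30
p_all_match = p_match ** n_downstrokes

# Peirce's reported figure as quoted above: 1 in 2,666,000,000,000,000,000,000.
peirce_reported = Fraction(1, 2_666_000_000_000_000_000_000)

print(f"(1/5)^30        = {float(p_all_match):.3e}")
print(f"odds form       = 1 in {float(1 / p_all_match):.3e}")
print(f"Peirce reported = {float(peirce_reported):.3e}")
# How many times larger the naive figure is than the reported one.
print(f"ratio           = {float(p_all_match / peirce_reported):.2f}")
```

Exact rational arithmetic is used here only so that rounding cannot be blamed for the gap between the two figures.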

We start by considering what we know about the downstroke match proportion before we see the data, i.e. we first capture any relevant contextual information that is available. This contextual information has to be represented as a probability distribution, which we call the prior distribution. In this case we have no such contextual information, so we assign a vague prior distribution to the downstroke match proportion. This means that everything we know about the downstroke match proportion will come from the sample of 42 signatures.

A suitable prior distribution would be a beta distribution, which is a distribution that sits on the unit interval [0,1] and thus is useful for representing proportions. The beta distribution is defined by two parameters, alpha and beta. For setting up a vague prior there are three choices of alpha and beta that are widely used. We will set alpha = beta = 1, which corresponds to a uniform distribution, that is, all possible values of the match proportion from 0 to 1 are considered equally likely before we see the data from the 42 signatures. Other possibilities would be the improper prior alpha = beta = 0, or the Jeffreys prior alpha = beta = 0.5. (The result does not depend greatly on which is chosen.) The evidence is equivalent to saying that, of the 30 times 42 = 1260 downstroke events, 1 in 5 of them are matches, i.e. there are 252 matches and 1008 non-matches.

The next stage is to multiply the prior by the likelihood, then normalise the result so that it integrates to 1 over the interval [0,1]. The result is another probability distribution, called the posterior distribution. This will be the distribution which tells us everything we know so far about the match proportion. The beta distribution is the natural conjugate prior to the binomial, which means that the posterior is another beta distribution. In this case, for a binomial likelihood with 252 matches and 1008 non-matches, the posterior will be a beta with parameters (252 plus alpha) and (1008 plus beta). See the figure for a plot of this posterior distribution. It has a fairly sharp peak near 0.2, but it is not of zero width. Assuming that the match proportion was exactly 1 in 5 would be to approximate the peak by a spike of zero width at x = 0.2.
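To give a rough sense of how narrow that peak is, the conjugate update and the standard closed-form mean and standard deviation of a beta distribution can be sketched as follows. This is only an illustration; the function names are hypothetical, and the three priors are the ones listed earlier.

```python
from math import sqrt

def beta_posterior(matches, non_matches, alpha=1.0, beta=1.0):
    """Conjugate update: Beta(alpha, beta) prior plus binomial counts."""
    return matches + alpha, non_matches + beta

def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, sqrt(var)

# Compare the three vague priors mentioned in the text.
for name, prior in [("uniform", 1.0), ("Jeffreys", 0.5), ("improper", 0.0)]:
    a, b = beta_posterior(252, 1008, prior, prior)
    mean, sd = beta_mean_sd(a, b)
    print(f"{name:9s} Beta({a}, {b}): mean = {mean:.4f}, sd = {sd:.4f}")
```

Under all three priors the posterior mean stays close to 0.2 and the standard deviation is on the order of 0.01, which is the sense in which the choice of vague prior matters little here.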

Having obtained this posterior distribution, the second stage of the calculation is to compute the probability of observing r = 30 matches, assuming a binomial distribution with N = 30 and a success probability which is unknown, but which follows the previously calculated beta posterior distribution. This is given by averaging over all possible values of the match proportion, but with the probabilities found from the previous posterior. This is



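:$$p(r = 30 \mid 252 \text{ matches}, 1008 \text{ non-matches}) = \frac{1}{B(252.5, 1008.5)}\int_0^1 \theta^{30}\, \theta^{251.5}(1 - \theta)^{1007.5}\, d\theta = 4.153092037700561 \times 10^{-21},$$ where B(a, b) is the beta function. Note that the parameters 252.5 and 1008.5 used here correspond to the Jeffreys prior alpha = beta = 0.5 rather than the uniform prior alpha = beta = 1; the two choices give results of the same order of magnitude.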

This gives the probability that 30 matches would be observed, given that the signature on the codicil is genuine. It can be expressed in the form of odds, as Peirce did, as



:$$\frac{1}{4.153092037700561 \times 10^{-21}} = 2.40784454311 \times 10^{20} .$$
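The integral above has a closed form, B(a + n, b) / B(a, b), which reduces to a finite product when all n trials are matches. The following Python sketch evaluates it under that assumption; the function name is hypothetical, and the uniform-prior and point-estimate values are included only for comparison.

```python
from math import prod

def prob_all_match(n, a, b):
    """P(all n trials are matches) when theta ~ Beta(a, b).

    Equals B(a + n, b) / B(a, b), which simplifies to the product
    of (a + i) / (a + b + i) for i = 0 .. n - 1.
    """
    return prod((a + i) / (a + b + i) for i in range(n))

p_jeffreys = prob_all_match(30, 252.5, 1008.5)  # parameters used in the formula
p_uniform = prob_all_match(30, 253.0, 1009.0)   # alpha = beta = 1
p_point = 0.2 ** 30                             # proportion fixed at exactly 1/5

for label, p in [("Jeffreys", p_jeffreys), ("uniform", p_uniform), ("point", p_point)]:
    print(f"{label:9s} p = {p:.4e}, odds = 1 in {1 / p:.4e}")
```

Averaging over the posterior gives a larger probability than the fixed 1/5 calculation, because the mean of the 30th power of the proportion exceeds the 30th power of its mean.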

In view of the similarity of this result to Peirce's reported result, it is likely that he did a Bayesian calculation similar to this one. He may have used a prior other than those listed above. Alternatively, given the computing tools available in 1868, he may have approximated the integral, or he may simply have made an error in its calculation.

As it stands, however, this argument misses the point. It is an example of what is often regarded as the prosecutor's fallacy. The Peirces' analysis attempts to calculate the probability that two signatures would display such a degree of similarity given that they were genuine:


 * $$P_{\text{Peirce}} = P(\text{30 coincident downstrokes} \mid \text{genuine signature}).$$

However, what is relevant to the court is the probability that the signatures are genuine given their similarity:


 * $$P_{\text{Genuine signature}} = P(\text{genuine signature} \mid \text{30 coincident downstrokes}).$$

To relate the two probabilities requires the use of Bayes' theorem:


 * $$P_{\text{Genuine signature}} \propto P_{\text{Peirce}} \times P(\text{genuine signature}),$$

where P(genuine signature) is the probability that the signature is genuine given all the other evidence in the case.

Under the alternative hypothesis that the signature was a traced copy of the signature on the first page, the number of downstroke matches would be 30, i.e. the probability of 30 matches is 1.

Bayesian statistics uses a measure of evidence called the Bayes factor, which is the probability of seeing the observed data if the hypothesis of interest is true, divided by the probability of seeing the observed data if the hypothesis is false. Thus the numerator and the denominator of the Bayes factor are known. The posterior odds are obtained by multiplying the Bayes factor by the prior odds.
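The multiplication described above can be sketched in a few lines of Python. This is an illustration only: the names are hypothetical, the probability under the genuine hypothesis is the value computed earlier in this section, and the probability under the traced-copy hypothesis is taken to be 1 as stated above.

```python
def posterior_odds(bayes_factor, prior_odds):
    """Posterior odds = Bayes factor x prior odds."""
    return bayes_factor * prior_odds

# Probability of 30 matches under each hypothesis (values from the text).
p_data_given_genuine = 4.153092037700561e-21
p_data_given_traced = 1.0

# Bayes factor against the signature being genuine.
bf_against_genuine = p_data_given_traced / p_data_given_genuine

# Prior odds against genuineness: even odds, then one million to one in favour.
for prior_odds_against in (1.0, 1e-6):
    odds = posterior_odds(bf_against_genuine, prior_odds_against)
    print(f"prior odds against = {prior_odds_against:g} -> posterior = {odds:.4e} to one")
```

Working in odds keeps the prior visible as an explicit multiplier, so the effect of a different prior belief is a single rescaling.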

Suppose we assume that the two hypotheses are equally likely a priori (odds of 1:1). Then the odds against the hypothesis that the signature is genuine are

:$$1 \times 2.40784454311 \times 10^{20}$$

to one.

The factor of 1:1 is our prior estimate (before seeing the data) about the provenance of the document, which can never be entirely removed from the problem. In this case the evidence is very powerful indeed. If our prior belief was that the odds against the codicil being fake were one million to one, we would still arrive at the posterior conclusion, after factoring in the downstroke evidence, that the odds were :$$ 2.40784454311 \times 10^{14} $$ to one against the codicil being genuine. —The preceding unsigned comment was added by BlaiseFEgan (talk • contribs) 09:44, 9 February 2007 (UTC).

In my opinion there are 2 problems with this research. Number 1 is that it doesn't take account of the probability of a particular downstroke overlapping (presumably some are more likely to overlap than others - and therefore more likely to coincide). Number 2 is that it doesn't take into account that if the deceased signed two pages one after another then the signatures could be more likely to be similar to each other than two random signatures made at different times. —Preceding unsigned comment added by 88.212.16.30 (talk) 20:08, 7 March 2008 (UTC)