Resolution proof compression by splitting

In mathematical logic, proof compression by splitting is an algorithm that operates as a post-process on resolution proofs. It was proposed by Scott Cotton in his paper "Two Techniques for Minimizing Resolution Proof".

The Splitting algorithm is based on the following observation:

Given a proof of unsatisfiability $$\pi$$ and a variable $$x$$, it is easy to re-arrange (split) the proof in a proof of $$x$$ and a proof of $$\neg x $$ and the recombination of these two proofs (by an additional resolution step) may result in a proof smaller than the original.

Note that applying Splitting in a proof $$\pi$$ using a variable $$x$$ does not invalidates a latter application of the algorithm using a differente variable $$y$$. Actually, the method proposed by Cotton generates a sequence of proofs $$\pi_1 \pi_2 \ldots$$, where each proof $$\pi_{i+1}$$ is the result of applying Splitting to $$\pi_i$$. During the construction of the sequence, if a proof $$\pi_j$$ happens to be too large, $$\pi_{j+1}$$ is set to be the smallest proof in $$\{\pi_1, \pi_2, \ldots, \pi_j\}$$.

For achieving a better compression/time ratio, a heuristic for variable selection is desirable. For this purpose, Cotton defines the "additivity" of a resolution step (with antecedents $$p$$ and $$n$$ and resolvent $$r$$):


 * $$\operatorname{add}(r) := \max(|r|-\max(|p|, |n|), 0) $$

Then, for each variable $$v$$, a score is calculated summing the additivity of all the resolution steps in $$\pi$$ with pivot $$v$$ together with the number of these resolution steps. Denoting each score calculated this way by $$add(v, \pi)$$, each variable is selected with a probability proportional to its score:


 * $$p(v) = \frac{\operatorname{add}(v, \pi_i)}{\sum_x{\operatorname{add}(x, \pi_i)}}$$

To split a proof of unsatisfiability $$\pi$$ in a proof $$\pi_x$$ of $$x$$ and a proof $$\pi_{\neg x}$$ of $$\neg x$$, Cotton proposes the following:

Let $$l$$ denote a literal and $$p \oplus _x n$$ denote the resolvent of clauses $$p$$ and $$n$$ where $$x \in p$$ and $$\neg x \in n$$. Then, define the map $$\pi_l$$ on nodes in the resolution dag of $$\pi$$:


 * $$\pi_l(c) := \begin{cases}

c, & \text{if } c \text{ is an input} \\ \pi_l(p), & \text{if } c = p \oplus_x n \text{ and } (l = x \text{ or } x \notin \pi_l(p)) \\ \pi_l(n), & \text{if } c = p \oplus_x n \text{ and } (l = \neg x \mbox{ or } \neg x \notin \pi_l(n)) \\ \pi_l(p) \oplus_x \pi_l(p), & \text{if } x \in \pi_l(p) \text{ and } \neg x \in \pi_l(n) \end{cases} $$

Also, let $$o$$ be the empty clause in $$\pi$$. Then, $$\pi_x$$ and $$\pi_{\neg x}$$ are obtained by computing $$\pi_x(o)$$ and $$\pi_{\neg x}(o)$$, respectively.