Benders decomposition

Benders decomposition (or Benders' decomposition) is a technique in mathematical programming that allows the solution of very large linear programming problems that have a special block structure. This block structure often occurs in applications such as stochastic programming as the uncertainty is usually represented with scenarios. The technique is named after Jacques F. Benders.

The strategy behind Benders decomposition can be summarized as divide-and-conquer. The variables of the original problem are divided into two subsets: a first-stage master problem is solved over the first set of variables, and the values of the second set are determined in a second-stage subproblem for a given first-stage solution. If the subproblem shows that the fixed first-stage decisions are infeasible or suboptimal, so-called Benders cuts are generated and added to the master problem, which is then re-solved until no cuts can be generated. Since Benders decomposition adds new constraints as it progresses towards a solution, the approach is called "row generation". In contrast, Dantzig–Wolfe decomposition uses "column generation".

Methodology
Assume a problem that occurs in two or more stages, where the decisions for the later stages rely on the results from the earlier ones. An attempt at first-stage decisions can be made without prior knowledge of optimality according to later stage decisions. This first-stage decision is the master problem. Further stages can then be analyzed as separate subproblems. Information from these subproblems is passed back to the master problem. If constraints for a subproblem were violated, they can be added back to the master problem. The master problem is then re-solved.

The master problem represents an initial convex set which is further constrained by information gathered from the subproblems. Because the feasible space only shrinks as information is added, the objective value of the master problem provides a lower bound on the optimal objective value of the overall problem.

Benders decomposition is applicable to problems with a largely block-diagonal structure.

Mathematical Formulation
Assume a problem of the following structure:


 * $$ \begin{align}

& \text{minimize} && \mathbf{c}^\mathrm{T}\mathbf{x} + \mathbf{d}^\mathrm{T}\mathbf{y} \\ & \text{subject to} && A \mathbf{x} + B \mathbf{y} \geq \mathbf{b} \\ & && \mathbf{y} \in Y \\ & && \mathbf{x} \geq \mathbf{0} \end{align}$$

Here $$A, B$$ define the constraints that couple the two sets of variables, and $$Y$$ represents the feasible set for $$\mathbf{y}$$. Notice that for any fixed $$\mathbf{\bar{y}} \in Y$$, the residual problem is


 * $$ \begin{align}

& \text{minimize} && \mathbf{c}^\mathrm{T}\mathbf{x} + \mathbf{d}^\mathrm{T}\mathbf{\bar{y}} \\ & \text{subject to} && A \mathbf{x} \geq \mathbf{b} - B \mathbf{\bar{y}} \\ & && \mathbf{x} \geq \mathbf{0} \end{align} $$

The dual of the residual problem is


 * $$ \begin{align}

& \text{maximize} && (\mathbf{b} - B \mathbf{\bar{y}})^\mathrm{T} \mathbf{u} + \mathbf{d}^\mathrm{T}\mathbf{\bar{y}} \\ & \text{subject to} && A^\mathrm{T} \mathbf{u} \leq \mathbf{c} \\ & && \mathbf{u} \geq \mathbf{0} \end{align} $$

Using the dual representation of the residual problem, the original problem can be rewritten as an equivalent minimax problem



 * $$ \min_{\mathbf{y} \in Y} \left[ \mathbf{d}^\mathrm{T}\mathbf{y} + \max_{\mathbf{u} \geq \mathbf{0}} \left\{ (\mathbf{b} - B \mathbf{y})^\mathrm{T} \mathbf{u} \mid A^\mathrm{T} \mathbf{u} \leq \mathbf{c} \right\} \right]. $$

Benders decomposition relies on an iterative procedure that chooses successive values of $$\mathbf{y}$$ without considering the inner problem except through a set of cut constraints that are created through a pass-back mechanism from the maximization problem. Although the minimax formulation is written in terms of $$(\mathbf{u}, \mathbf{y})$$, for an optimal $$\mathbf{\bar{y}}$$ the corresponding $$\mathbf{\bar{x}}$$ can be found by solving the original problem with $$\mathbf{\bar{y}}$$ fixed.
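The cut constraints can be made concrete as follows, assuming the dual feasible set $$\{\mathbf{u} \geq \mathbf{0} : A^\mathrm{T} \mathbf{u} \leq \mathbf{c}\}$$ is nonempty. It is then a pointed polyhedron with finitely many extreme points $$\mathbf{u}^1, \dots, \mathbf{u}^P$$ and extreme rays $$\mathbf{r}^1, \dots, \mathbf{r}^Q$$ (the index names are introduced here only for illustration). The inner maximum is finite exactly when $$(\mathbf{b} - B \mathbf{y})^\mathrm{T} \mathbf{r}^q \leq 0$$ for every extreme ray, and it is then attained at an extreme point, so the minimax problem can be written with an auxiliary scalar variable $$z$$ as

 * $$ \begin{align} & \text{minimize} && z \\ & \text{subject to} && z \geq \mathbf{d}^\mathrm{T}\mathbf{y} + (\mathbf{b} - B \mathbf{y})^\mathrm{T} \mathbf{u}^p, && p = 1, \dots, P \\ & && (\mathbf{b} - B \mathbf{y})^\mathrm{T} \mathbf{r}^q \leq 0, && q = 1, \dots, Q \\ & && \mathbf{y} \in Y \end{align} $$

The number of extreme points and rays is typically far too large to enumerate, which is why the iterative procedure keeps only a subset of these constraints and generates further ones as needed.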

Master Problem Formulation
The decisions for the first stage problem can be described by the smaller minimization problem


 * $$ \begin{align}

& \text{minimize} && z \\ & \text{subject to} && \{\text{cuts}\} \\ & && \mathbf{y} \in Y \end{align}$$

Initially the set of cuts is empty. Solving this master problem will constitute a "first guess" at an optimal solution to the overall problem, with the value of $$z$$ unbounded below and $$\mathbf{y}$$ taking on any feasible value.

The set of cuts will be filled in a sequence of iterations by solving the inner maximization problem of the minimax formulation. The cuts both guide the master problem towards an optimal $$\mathbf{y}$$, if one exists, and ensure that $$\mathbf{y}$$ is feasible for the full problem. The set of cuts defines the relationship between $$\mathbf{y}$$, $$z$$, and implicitly $$\mathbf{x}$$.

Since the value of $$z$$ starts unconstrained and constraints are only added at each iteration, the feasible space can only shrink, so the value of the master problem at any iteration provides a lower bound on the optimal value of the overall problem. If for some $$\mathbf{\bar{y}}$$ the objective value of the master problem equals the optimal value of the inner problem, then by duality theory the solution is optimal.

Subproblem Formulation
The subproblem considers the suggested solution $$\mathbf{\bar{y}}$$ to the master problem and solves the inner maximization problem from the minimax formulation. The inner problem is formulated using the dual representation


 * $$ \begin{align}

& \text{maximize} && (\mathbf{b} - B \mathbf{\bar{y}})^\mathrm{T} \mathbf{u} + \mathbf{d}^\mathrm{T}\mathbf{\bar{y}} \\ & \text{subject to} && A^\mathrm{T} \mathbf{u} \leq \mathbf{c} \\ & && \mathbf{u} \geq \mathbf{0} \end{align} $$

While the master problem provides a lower bound on the value of the problem, the subproblem is used to get an upper bound. The result of solving the subproblem for any given $$\mathbf{\bar{y}}$$ can either be a finite optimal value for which an extreme point $$\mathbf{\bar{u}}$$ can be found, an unbounded solution for which an extreme ray $$\mathbf{\bar{u}}$$ in the recession cone can be found, or a finding that the subproblem is infeasible.

Procedure
At a high level, the procedure will iteratively consider the master problem and subproblem. Each iteration provides an updated upper and lower bound on the optimal objective value. The result of the subproblem either provides a new constraint to add to the master problem or a certificate that no finite optimal solution exists for the problem. The procedure terminates when it is shown that no finite optimal solution exists or when the gap between the upper and lower bound is sufficiently small. In such a case, the value of $$\mathbf{\bar{x}}$$ is determined by solving the primal residual problem fixing $$\mathbf{\bar{y}}$$.

Formally, the procedure begins with the lower bound set to $$-\infty$$, the upper bound set to $$\infty$$, and the cuts in the master problem empty. An initial solution is produced by selecting any $$\mathbf{\bar{y}} \in Y$$. Then the iterative procedure begins and continues until the gap between the upper and lower bound is at most $$\epsilon$$ or it is shown that no finite optimal solution exists.

The first step of each iteration begins by updating the upper bound by solving the subproblem given the most recent value of $$\mathbf{\bar{y}}$$. There are three possible outcomes from solving the subproblem.

In the first case, the objective value of the subproblem is unbounded above. By duality theory, when a dual problem has an unbounded objective, the corresponding primal problem is infeasible. This means that the choice of $$\mathbf{\bar{y}}$$ does not satisfy $$A \mathbf{x} + B \mathbf{\bar{y}} \geq \mathbf{b}$$ for any $$\mathbf{x} \geq \mathbf{0}$$. This solution can be removed from the master problem by taking an extreme ray $$\mathbf{\bar{u}}$$ that certifies the subproblem has an unbounded objective and adding a constraint to the master asserting that $$(\mathbf{b} - B \mathbf{y})^\mathrm{T} \mathbf{\bar{u}} \leq 0$$.
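A short derivation, given here as a sketch, shows why this constraint removes no feasible solution. A ray $$\mathbf{\bar{u}}$$ of the dual feasible set satisfies $$\mathbf{\bar{u}} \geq \mathbf{0}$$ and $$A^\mathrm{T} \mathbf{\bar{u}} \leq \mathbf{0}$$. For any $$(\mathbf{x}, \mathbf{y})$$ with $$A \mathbf{x} + B \mathbf{y} \geq \mathbf{b}$$ and $$\mathbf{x} \geq \mathbf{0}$$,

 * $$ (\mathbf{b} - B \mathbf{y})^\mathrm{T} \mathbf{\bar{u}} \leq (A \mathbf{x})^\mathrm{T} \mathbf{\bar{u}} = \mathbf{x}^\mathrm{T} (A^\mathrm{T} \mathbf{\bar{u}}) \leq 0. $$

The first inequality uses $$\mathbf{\bar{u}} \geq \mathbf{0}$$ and the second uses $$\mathbf{x} \geq \mathbf{0}$$. Every $$\mathbf{y}$$ that admits a feasible $$\mathbf{x}$$ therefore satisfies the cut, while the current $$\mathbf{\bar{y}}$$ violates it because $$(\mathbf{b} - B \mathbf{\bar{y}})^\mathrm{T} \mathbf{\bar{u}} > 0$$.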

In the second case, the subproblem is infeasible. Since the dual feasible space to the problem is empty, either the original problem is not feasible or there is a ray in the primal problem that certifies the objective value is unbounded below. In either case, the procedure terminates.

In the third case, the subproblem has a finite optimal solution. By duality theory for linear programs, the optimal value of the subproblem is equal to the optimal value of the original problem restricted to the choice of $$\mathbf{\bar{y}}$$. This allows the upper bound to be updated to the value of the optimal solution of the subproblem, if it is better than the current upper bound. Given an optimal extreme point $$\mathbf{\bar{u}}$$, it also yields a new constraint that requires the master problem to consider the objective value under this particular solution by asserting that $$ z \geq (\mathbf{b} - B \mathbf{y})^\mathrm{T} \mathbf{\bar{u}} + \mathbf{d}^\mathrm{T}\mathbf{y} $$. This will strictly increase the value of $$z$$ at the solution $$\mathbf{\bar{y}}$$ in the master problem if the choice of $$\mathbf{\bar{y}}$$ was suboptimal.
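To pass such a constraint to a solver, it is usually rearranged into a constant plus a linear term in $$\mathbf{y}$$, namely $$z \geq \mathbf{b}^\mathrm{T} \mathbf{\bar{u}} + (\mathbf{d} - B^\mathrm{T} \mathbf{\bar{u}})^\mathrm{T} \mathbf{y}$$. The following is a minimal sketch of that rearrangement in plain Python. The function name `optimality_cut` and the numeric data are hypothetical and chosen only for illustration.

```python
def optimality_cut(b, B, d, u):
    """Return (const, coeffs) so that the cut reads z >= const + coeffs . y.

    b: right-hand side (length m), B: m x n matrix as a list of rows,
    d: cost vector of y (length n), u: dual solution (length m).
    """
    # Constant term b^T u.
    const = sum(bi * ui for bi, ui in zip(b, u))
    # Coefficient of y_j is d_j - (B^T u)_j.
    coeffs = [
        d[j] - sum(B[i][j] * u[i] for i in range(len(u)))
        for j in range(len(d))
    ]
    return const, coeffs


# Hypothetical data: three constraints, one y variable.
const, coeffs = optimality_cut(
    b=[4.0, 8.0, 1.0],
    B=[[1.0], [3.0], [1.0]],
    d=[3.0],
    u=[0.0, 2.0, 0.0],
)
print(const, coeffs)  # cut: z >= 16 - 3*y
```

A feasibility cut can be rearranged in the same way by replacing $$z$$ with $$0$$ and dropping $$\mathbf{d}$$.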

Finally, the last part of each iteration is creating a new solution to the master problem by solving the master problem with the new constraint. The new solution $$(\mathbf{\bar{y}}, z)$$ is used to update the lower bound. If the gap between the best upper and lower bound is at most $$\epsilon$$ then the procedure terminates and the value of $$\mathbf{\bar{x}}$$ is determined by solving the primal residual problem fixing $$\mathbf{\bar{y}}$$. Otherwise, the procedure continues on to the next iteration.
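The whole loop can be sketched in plain Python on a toy instance. Everything specific here is an assumption made for illustration: the instance (minimize $$2x + 3y$$ subject to $$x + y \geq 4$$, $$x + 3y \geq 8$$, $$y \geq 1$$, $$x \geq 0$$, $$y \in \{0, \dots, 5\}$$), the function names, and the two shortcuts that replace real solvers. Because $$Y$$ is a small finite set, the master problem is solved by enumeration, and because the dual feasible set is tiny, its extreme points and its extreme ray are listed by hand and the subproblem is solved by scanning them. A practical implementation would call a linear or mixed-integer programming solver in both places. The dual feasible set is nonempty in this instance, so the second case of the procedure does not arise.

```python
import math

# Toy data: rows are x + y >= 4, x + 3y >= 8, 0*x + y >= 1.
b = [4.0, 8.0, 1.0]
B = [1.0, 3.0, 1.0]  # column of B for the single y variable
d = 3.0              # cost of y; the cost c = 2 of x enters via the dual set
Y = range(6)

# Hand-written description of the dual set {u >= 0 : u1 + u2 <= 2}.
DUAL_POINTS = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
DUAL_RAYS = [(0.0, 0.0, 1.0)]
TOL = 1e-9


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def residual(y):
    """Right-hand side b - B*y of the residual problem."""
    return [bi - Bi * y for bi, Bi in zip(b, B)]


def solve_subproblem(y_bar):
    """Return ("ray", r) if the dual is unbounded, else ("point", u, value)."""
    rhs = residual(y_bar)
    for r in DUAL_RAYS:
        if dot(rhs, r) > TOL:
            return ("ray", r)
    u = max(DUAL_POINTS, key=lambda p: dot(rhs, p))
    return ("point", u, dot(rhs, u) + d * y_bar)


def solve_master(feas_cuts, opt_cuts):
    """Minimize z over y in Y subject to the cuts; None if no y remains."""
    best = None
    for y in Y:
        rhs = residual(y)
        if any(dot(rhs, r) > TOL for r in feas_cuts):
            continue  # y violates a feasibility cut
        # Smallest z allowed by the optimality cuts (-inf if there are none).
        z = max((dot(rhs, u) + d * y for u in opt_cuts), default=-math.inf)
        if best is None or z < best[1]:
            best = (y, z)
    return best


def benders(y_start=0, eps=1e-9, max_iter=50):
    lower, upper = -math.inf, math.inf
    feas_cuts, opt_cuts = [], []
    y_bar, incumbent = y_start, None
    for _ in range(max_iter):
        result = solve_subproblem(y_bar)
        if result[0] == "ray":
            feas_cuts.append(result[1])       # first case: feasibility cut
        else:
            _, u, value = result
            opt_cuts.append(u)                # third case: optimality cut
            if value < upper:
                upper, incumbent = value, y_bar
        master = solve_master(feas_cuts, opt_cuts)
        if master is None:
            return None                       # all of Y has been cut off
        y_bar, lower = master
        if upper - lower <= eps:
            return incumbent, upper
    raise RuntimeError("no convergence within max_iter")


print(benders())
```

On this instance the starting point $$\mathbf{\bar{y}} = 0$$ produces a feasibility cut, and the following iterations add optimality cuts until the bounds meet at $$y = 2$$ with objective value $$10$$. Solving the residual problem with $$y = 2$$ fixed then gives $$x = 2$$.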