Gradient-based repair differential evolution

G-DEmi, an acronym for Gradient-based Differential Evolution for Mixed-Integer nonlinear programming, is a variant of differential evolution designed to solve mixed-integer nonlinear programming (MINLP) problems. The presence of both continuous and discrete variables, along with constraints, produces disjoint feasible regions of varying sizes in the search space. Traditional evolutionary algorithms face difficulties with these problems because their search operators are insensitive to the constraints, which leads to the generation of many infeasible solutions. G-DEmi addresses this limitation by integrating a gradient-based repair method within the differential evolution framework. The aim of the repair method is to fix promising infeasible solutions in different subproblems using the gradient information of the constraint set.

G-DEmi
G-DEmi iteratively improves a population of candidate solutions through a cycle of generation, evaluation, and selection of trial vectors. In each iteration, new trial vectors are generated by combining existing solutions; they are evaluated based on their performance and repaired as necessary to satisfy the constraints.

Initial Population
The initial population $$\textbf{P}_g$$ is generated by sampling random values: random real values for the real variables and random integer values for the integer variables, corresponding to the solution vector $$[\textbf{x}, \textbf{y}]$$. Subsequently, the objective function $$f(\textbf{x}_g)$$ and the degree of constraint violation $$G(\textbf{x}_g)$$ are evaluated.
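A minimal sketch of this initialization step with NumPy follows; the population size and the variable bounds are illustrative assumptions, not values from the source:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def init_population(NP, x_low, x_high, y_low, y_high):
    """Random initial population: uniform reals for x, uniform integers for y."""
    x_low, x_high = np.asarray(x_low), np.asarray(x_high)
    y_low, y_high = np.asarray(y_low), np.asarray(y_high)
    x = rng.uniform(x_low, x_high, size=(NP, x_low.size))       # real variables
    y = rng.integers(y_low, y_high + 1, size=(NP, y_low.size))  # integer variables (inclusive bounds)
    return np.hstack([x, y.astype(float)])                      # solution vectors [x, y]

# Illustrative bounds: two reals in [-5, 5], one integer in [0, 3]
P = init_population(NP=20, x_low=[-5, -5], x_high=[5, 5], y_low=[0], y_high=[3])
```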

Mutation and Crossover
For each target vector $$\textbf{x}_{g,i}$$, a trial vector $$\textbf{u}_{g,i}$$ is generated using mutation and binomial crossover ($$rand/1/bin$$). The integer variables in $$\textbf{u}_{g,i}$$ are rounded before the vector is evaluated in the objective function and the constraints.
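The following sketch illustrates the $$rand/1/bin$$ operator for a mixed vector whose last components are integers; the control-parameter values $$F$$ and $$CR$$ and the variable layout are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def trial_vector(P, i, F=0.7, CR=0.9, n_int=1):
    """DE/rand/1/bin: v = x_r1 + F*(x_r2 - x_r3), then binomial crossover with x_i.
    The last n_int components are integer variables and are rounded afterwards."""
    NP, D = P.shape
    candidates = [r for r in range(NP) if r != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    v = P[r1] + F * (P[r2] - P[r3])        # mutant vector
    u = P[i].copy()
    j_rand = rng.integers(D)               # guarantees at least one component from v
    for j in range(D):
        if rng.random() < CR or j == j_rand:
            u[j] = v[j]                    # binomial crossover
    u[-n_int:] = np.round(u[-n_int:])      # round integers before evaluating f and G
    return u
```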

Evaluation and Selection
The trial vector is compared with its corresponding target vector, and the better one is selected according to the following feasibility rules:


 1. Between two infeasible solutions, the one with the lower constraint violation is preferred.
 2. If one solution is infeasible and the other is feasible, the feasible solution is preferred.
 3. Between two feasible solutions, the one with the better objective function value is preferred.
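In code, these rules reduce to a small comparator. A minimal sketch, assuming minimization and a scalar aggregated violation degree such as $$G(\textbf{x})$$ above:

```python
def is_better(f_u, G_u, f_x, G_x):
    """Feasibility rules: G is the aggregated constraint violation (0 = feasible)."""
    if G_u > 0 and G_x > 0:          # rule 1: both infeasible -> lower violation wins
        return G_u < G_x
    if (G_u == 0) != (G_x == 0):     # rule 2: feasible beats infeasible
        return G_u == 0
    return f_u < f_x                 # rule 3: both feasible -> better objective wins
```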

However, if the trial vector fails to improve on its target but has a lower objective function value ($$f(\textbf{u}_{g,i}) < f(\textbf{x}_{g,i})$$), and no other vector of the same subproblem has been repaired in the current generation ($$\mathbf{y}_{g,i}^\text{u} \notin \mathbf{Y}$$), the trial vector is repaired.

Repair and Improvement
The better of the repaired vector and its target vector is passed to the population of the next generation $$\textbf{P}_{g+1}$$. Through these steps, G-DEmi generates a new population in each generation.

The following pseudocode illustrates the algorithm:

algorithm G-DEmi Framework
    input: $$NP$$, $$CR$$, $$F$$, $$k_{max}$$, $$T_{min}$$, $$Eval_{max}$$
    output: the best solution so far
    initialize the population $$P_g$$
    evaluate $$f(x_g)$$ and $$G(x_g)$$ for each individual in $$P_g$$
    $$Eval \gets 1$$
    $$g \gets 1$$
    while $$Eval < Eval_{max}$$ do
        $$Y \gets \emptyset$$
        for each individual $$x_{g,i}$$ in $$P_g$$ do
            generate a trial vector $$u_{g,i}$$
            round the integer variables in $$u_{g,i}$$
            evaluate $$f(u_{g,i})$$ and $$G(u_{g,i})$$
            if $$u_{g,i}$$ is better than $$x_{g,i}$$ then
                store $$u_{g,i}$$ into $$P_{g+1}$$
            else if $$f(u_{g,i}) < f(x_{g,i})$$ and $$y_{g,i}^u \notin Y$$ then
                repair $$u_{g,i}$$
                evaluate $$f(u_{g,i})$$ and $$G(u_{g,i})$$
                if $$u_{g,i}$$ is better than $$x_{g,i}$$ then
                    store $$u_{g,i}$$ into $$P_{g+1}$$
                end if
                $$Y \gets Y \cup \{y_{g,i}^u\}$$
            end if
            update $$Eval$$
        end for
        $$g \gets g + 1$$
    end while

Gradient-based Repair Method
The gradient-based repair method is a crucial component of G-DEmi, designed to address infeasibility in trial vectors generated by differential evolution operators. This method focuses on independently exploring subproblems defined by integer variables. Specifically, to repair a vector with mixed variables $$[\textbf{x}, \textbf{y}]$$, only the real variables $$\textbf{x}$$ are modified while the integer variables $$\textbf{y}$$ remain fixed.

The method repairs only those trial vectors $$\textbf{u}$$ that satisfy two conditions: (i) they lost the tournament against their target vectors but have a better objective function value, and (ii) they belong to a subproblem $$\textbf{y}$$ where no solution has been repaired in the current generation. These two conditions aim to promote the repair of trial vectors with higher potential and ensure that each subproblem is explored independently, avoiding the repair of similar solutions multiple times.

Definition of Constraint Violation
The constraint violation $$ \mathbf{V}(\mathbf{x}) $$ is defined as a vector that contains the violation degree for each inequality $$ g $$ and equality $$ h $$ constraint in a given problem, for a particular solution vector $$ \mathbf{x} $$. Parameters $$ n $$ and $$ m $$ denote the number of inequality and equality constraints, respectively, and $$ \epsilon $$ specifies the tolerance for equality constraints. The sign function $$ \text{sgn}(h) $$ preserves the sign of the equality violation.

$$ \mathbf{V}(\mathbf{x}) = \begin{bmatrix} \max(g_1(\mathbf{x}), 0) \\ \vdots \\ \max(g_n(\mathbf{x}), 0) \\ \text{sgn}(h_1) \cdot \max(|h_1(\mathbf{x})| - \epsilon, 0) \\ \vdots \\ \text{sgn}(h_m) \cdot \max(|h_m(\mathbf{x})| - \epsilon, 0) \end{bmatrix} $$
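A direct translation of this definition, assuming the constraints are supplied as Python callables (the function and parameter names are illustrative):

```python
import numpy as np

def violation_vector(g_list, h_list, z, eps=1e-4):
    """Build V(z): clipped inequality violations, then sign-preserving
    epsilon-tolerant equality violations."""
    V_g = [max(g(z), 0.0) for g in g_list]                             # max(g_i(z), 0)
    V_h = [np.sign(h(z)) * max(abs(h(z)) - eps, 0.0) for h in h_list]  # sgn(h_j)*max(|h_j|-eps, 0)
    return np.array(V_g + V_h)
```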

The Gradient Matrix of Constraints
The gradient matrix of these constraints with respect to the $$N$$ components of $$\mathbf{x}$$, denoted as $$ \nabla \mathbf{V}(\mathbf{x}) $$, is defined as:

$$ \nabla \mathbf{V}(\mathbf{x}) = \begin{bmatrix} \frac{\partial g_1(\mathbf{x})}{\partial x_1} & \frac{\partial g_1(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial g_1(\mathbf{x})}{\partial x_N} \\ \vdots & \vdots & & \vdots \\ \frac{\partial g_n(\mathbf{x})}{\partial x_1} & \frac{\partial g_n(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial g_n(\mathbf{x})}{\partial x_N} \\ \frac{\partial h_1(\mathbf{x})}{\partial x_1} & \frac{\partial h_1(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial h_1(\mathbf{x})}{\partial x_N} \\ \vdots & \vdots & & \vdots \\ \frac{\partial h_m(\mathbf{x})}{\partial x_1} & \frac{\partial h_m(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial h_m(\mathbf{x})}{\partial x_N} \end{bmatrix} $$

Finite Difference Approximation
The forward finite difference approximation provides an estimate of these derivatives:

$$ \nabla \mathbf{V}(\mathbf{x})_{i,j} \approx \frac{f_i(\mathbf{x} + \Delta x \cdot \mathbf{e}_j) - f_i(\mathbf{x})}{\Delta x} $$

where $$ \Delta x $$ represents the step size, $$ f_i $$ denotes the $$ i $$-th constraint function, and $$ \mathbf{e}_j $$ is a unit vector of the same dimension as $$ \mathbf{x} $$, with a value of 1 in the $$ j $$-th component and 0 elsewhere.
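A generic forward-difference Jacobian along these lines; the function name and default step size are illustrative choices:

```python
import numpy as np

def jacobian_fd(funcs, x, dx=1e-6):
    """Forward finite differences: row i is the gradient of funcs[i] w.r.t. x."""
    x = np.asarray(x, dtype=float)
    J = np.empty((len(funcs), x.size))
    for i, f in enumerate(funcs):
        f0 = f(x)
        for j in range(x.size):
            e = np.zeros_like(x)
            e[j] = dx                       # perturb only the j-th component (e_j scaled by dx)
            J[i, j] = (f(x + e) - f0) / dx
    return J
```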

Repair Procedure
This repair method aims to transform $$ \mathbf{x} $$ into a feasible solution, which involves driving the elements of the vector $$ \mathbf{V}(\mathbf{x}) $$ to zero. Iteratively, a repaired vector $$ \mathbf{x}_{k+1} $$ can be obtained using the Newton-Raphson method through the following equation, which follows a linear approximation of $$ \mathbf{V}(\mathbf{x}_k) $$ toward the origin:

$$ \mathbf{x}_{k+1} = \mathbf{x}_k - \nabla \mathbf{V}(\mathbf{x}_k)^{-1} \mathbf{V}(\mathbf{x}_k) $$

However, it is common that the number of variables differs from the number of constraints. In this case, the matrix $$ \nabla \mathbf{V}(\mathbf{x}_k) $$ is not invertible, and the Moore-Penrose pseudoinverse must be used:

$$ \mathbf{x}_{k+1} = \mathbf{x}_k - \nabla \mathbf{V}(\mathbf{x}_k)^+ \mathbf{V}(\mathbf{x}_k) $$

where $$ \nabla \mathbf{V}(\mathbf{x}_k)^+ $$ represents the pseudoinverse of the gradient matrix $$ \nabla \mathbf{V}(\mathbf{x}_k) $$. A computationally efficient way of finding $$ \nabla \mathbf{V}(\mathbf{x}_k)^+ $$ is to employ singular value decomposition (SVD).
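As a minimal sketch, one such update step can be written with NumPy, whose `numpy.linalg.pinv` computes the Moore-Penrose pseudoinverse via SVD:

```python
import numpy as np

def repair_step(x_k, V_k, grad_V_k):
    """One Newton-like update: x_{k+1} = x_k - pinv(grad_V(x_k)) @ V(x_k).
    np.linalg.pinv computes the Moore-Penrose pseudoinverse via SVD."""
    return x_k - np.linalg.pinv(grad_V_k) @ V_k
```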

Mixed Variables Repair
A mixed trial vector $$ \mathbf{u} = [\mathbf{x}^\text{u}_k, \mathbf{y}^\text{u}]^\text{T} $$ is defined, where only the $$ \mathbf{x}^\text{u}_k $$ component is updated during the iterative repair process. As a result, the constraint violation degree vector and the gradient matrix can be written as:

$$\mathbf{V}(\mathbf{u}) = \mathbf{V}(\mathbf{x}^\text{u}_k, \mathbf{y}^\text{u})$$

$$ \nabla \mathbf{V}(\mathbf{u}) = \frac{\partial \mathbf{V}(\mathbf{x}^\text{u}_k, \mathbf{y}^\text{u})}{\partial \mathbf{x}^\text{u}_k} $$

The repair method follows these steps:

Algorithm: Gradient-based repair method
    Input: $$\mathbf{u}$$, $$k_{max}$$, $$T_{min}$$
    Output: $$\mathbf{u}$$
    Initialize $$k \gets 1$$.
    While none of the stopping criteria is fulfilled:
        Calculate $$\mathbf{V}(\mathbf{x}^\text{u}_k, \mathbf{y}^\text{u})$$.
        Calculate $$\nabla \mathbf{V}(\mathbf{x}^\text{u}_k, \mathbf{y}^\text{u})$$.
        Remove zero elements of $$\mathbf{V}(\mathbf{x}^\text{u}_k, \mathbf{y}^\text{u})$$ and their corresponding rows in $$\nabla \mathbf{V}(\mathbf{x}^\text{u}_k, \mathbf{y}^\text{u})$$.
        Calculate the pseudoinverse $$\nabla \mathbf{V}(\mathbf{x}^\text{u}_k, \mathbf{y}^\text{u})^{+}$$.
        Calculate $$\mathbf{x}^\text{u}_{k+1}$$.
        Update $$\mathbf{x}^\text{u}_k \leftarrow \mathbf{x}^\text{u}_{k+1}$$.
        Update $$\mathbf{u} = [\mathbf{x}^\text{u}_k, \mathbf{y}^\text{u}]^\text{T}$$.
        Increment $$k \leftarrow k+1$$.
    End While

Stopping criteria:
 * $$k \geq k_{max}$$: the maximum number of iterations is reached.
 * $$\mathbf{V} = 0$$: all elements of $$\mathbf{V}$$ are equal to zero.
 * $$T_u \leq T_{min}$$: the maximum absolute difference between $$\mathbf{x}^\text{u}_{k+1}$$ and $$\mathbf{x}^\text{u}_k$$ is equal to or lower than $$T_{min}$$.
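A runnable sketch of this loop, assuming a hypothetical `constraints(x, y)` callable that returns the violation vector $$\mathbf{V}$$ (the finite-difference step size is an illustrative choice; on the active constraints, the finite differences of the violations approximate the raw constraint gradients):

```python
import numpy as np

def gradient_repair(x, y, constraints, k_max=10, T_min=1e-8, dx=1e-6):
    """Repair the real part x while the integer part y stays fixed.
    constraints(x, y) must return the violation vector V (all zeros when feasible)."""
    x = np.asarray(x, dtype=float)
    for _ in range(k_max):                            # stop: k >= k_max
        V = constraints(x, y)
        active = V != 0.0                             # remove zero (satisfied) elements
        if not active.any():
            break                                     # stop: V = 0, fully repaired
        J = np.empty((int(active.sum()), x.size))     # gradient w.r.t. the real variables only
        for j in range(x.size):
            e = np.zeros_like(x)
            e[j] = dx
            J[:, j] = (constraints(x + e, y)[active] - V[active]) / dx
        x_new = x - np.linalg.pinv(J) @ V[active]     # pseudoinverse update
        T_u = np.max(np.abs(x_new - x))
        x = x_new
        if T_u <= T_min:
            break                                     # stop: negligible change in x
    return x
```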

Example of Mixed Variables Repair
This repair procedure can be illustrated by the following example. Consider a scenario with one inequality constraint and one equality constraint, as shown below:

$$ g_1(\mathbf{x},y) = {x_1}^2+ {x_2}^2+ {y}^2- 12 \le 0 $$

$$ h_1(\mathbf{x},y)= x_1+x_2+ {y}- 5.5=0 $$

Suppose $$ \mathbf{u} = [2 \ 1 \ 1]^\text{T} $$ and an equality tolerance $$ \epsilon = 1 \times 10^{-4} $$. In the first iteration ($$ k = 1 $$), $$ \mathbf{x}^\text{u}_1 = [2 \ 1]^\text{T} $$ and $$ \mathbf{y}^\text{u} = 1 $$. Therefore, the vector $$ \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) $$ and the matrix $$ \nabla \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) $$ are computed as follows:

$$ \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) = \begin{bmatrix} \max(-6, 0) \\ -\max(1.5 - \epsilon, 0) \end{bmatrix} = \begin{bmatrix} 0 \\ -1.4999 \end{bmatrix} $$

$$ \nabla \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) = \begin{bmatrix} \frac{\partial g_1(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u})}{\partial x_1} & \frac{\partial g_1(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u})}{\partial x_2} \\[1ex] \frac{\partial h_1(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u})}{\partial x_1} & \frac{\partial h_1(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u})}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ 1 & 1 \end{bmatrix} $$

Only $$ h_1(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) $$ is violated. Therefore, the element of $$ g_1(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) $$ is removed from $$ \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) $$, along with its corresponding row in $$ \nabla \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) $$. This leaves $$ \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) = -1.4999 $$. The reduced matrix $$ \nabla \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) $$ and its pseudoinverse $$ \nabla \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u})^{+} $$ are then computed as follows:

$$ \nabla \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u}) = \begin{bmatrix} 1 & 1 \end{bmatrix} \Rightarrow \nabla \mathbf{V}(\mathbf{x}^\text{u}_1, \mathbf{y}^\text{u})^{+} = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} $$

Subsequently, the vector $$ \mathbf{x}^\text{u}_2 $$ is obtained as follows:

$$ \mathbf{x}^\text{u}_2 = \begin{bmatrix} 2 \\ 1 \end{bmatrix} - \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} \begin{bmatrix} -1.4999 \end{bmatrix} = \begin{bmatrix} 2.75 \\ 1.75 \end{bmatrix} $$

The updated vector $$ \mathbf{V}(\mathbf{x}^\text{u}_2, \mathbf{y}^\text{u}) $$ results in:

$$ \mathbf{V}(\mathbf{x}^\text{u}_2, \mathbf{y}^\text{u}) = \begin{bmatrix} \max(-0.3750, 0) \\ \max(0 - \epsilon, 0) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$

The values $$ (\mathbf{x}^\text{u}_2, \mathbf{y}^\text{u}) $$ satisfy all the constraints. Therefore, the trial vector has been successfully repaired, and its new value is $$ \mathbf{u} = [2.75 \ 1.75 \ 1]^\text{T} $$.
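The example can be reproduced numerically; the following sketch mirrors the computation above using NumPy (the `constraints` helper is a hypothetical name for this example's constraint set):

```python
import numpy as np

eps = 1e-4

def constraints(x, y):
    g1 = x[0]**2 + x[1]**2 + y**2 - 12               # g1(x, y) <= 0
    h1 = x[0] + x[1] + y - 5.5                       # h1(x, y) = 0
    return np.array([max(g1, 0.0),
                     np.sign(h1) * max(abs(h1) - eps, 0.0)])

x, y = np.array([2.0, 1.0]), 1.0
V = constraints(x, y)                # [0, -1.4999]: only h1 is violated
J = np.array([[1.0, 1.0]])           # gradient of h1 w.r.t. (x1, x2)
x = x - np.linalg.pinv(J) @ V[[1]]   # pseudoinverse update on the reduced system
print(x)                             # ~[2.75, 1.75]
print(constraints(x, y))             # ~[0, 0] (within floating-point tolerance)
```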