User:Tiezhongyu2010/sandbox

In global optimization, the state transition algorithm (STA) is an iterative method that generates a sequence of improving approximate solutions to an optimization problem. Owing to its design, STA can find a global optimal solution in probability and is intended to guarantee that the returned solution is at least locally optimal.

The state transition algorithm was first proposed by Zhou et al. It is a stochastic global optimization method that aims to find a global or approximately optimal solution in a reasonable amount of time. In STA, a solution to an optimization problem is regarded as a state, and an update of a solution is regarded as a state transition. Using the state-space representation, STA describes the updating of solutions in a unified framework, and the execution operators that update solutions are expressed as state transition matrices, which makes the method easy to understand and flexible to implement:
 * $$ \mathbf{x}_{k+1} = A_k \mathbf{x}_k + B_k \mathbf{u}_k$$
 * $$ \mathbf{y}_{k+1} = f(\mathbf{x}_{k+1})$$

where:
 * $$ \mathbf{x}_k $$ stands for a current state, corresponding to a solution to an optimization problem;
 * $$ \mathbf{u}_k $$ is a function of $$ \mathbf{x}_{k} $$ and historical states;
 * $$ \mathbf{y}_k $$ is the fitness value at $$ \mathbf{x}_{k} $$;
 * $$ A_k, B_k $$ are state transition matrices, which can be considered as execution operators;
 * $$ f(\cdot) $$ is the objective function or evaluation function.
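
The unified framework above can be sketched in a few lines of Python. This is only an illustration of the two equations: the helper names `mat_vec` and `state_transition`, the sphere objective, and the particular matrices are assumptions chosen for the example and do not come from the STA literature.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(m: Matrix, v: Vector) -> Vector:
    """Multiply the matrix m by the column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def state_transition(
    x: Vector, a: Matrix, b: Matrix, u: Vector, f: Callable[[Vector], float]
) -> Tuple[Vector, float]:
    """Return (x_next, y_next) with x_next = A x + B u and y_next = f(x_next)."""
    ax = mat_vec(a, x)
    bu = mat_vec(b, u)
    x_next = [p + q for p, q in zip(ax, bu)]
    return x_next, f(x_next)


def sphere(x: Vector) -> float:
    """Hypothetical objective used only for this illustration."""
    return sum(v * v for v in x)


# Illustrative (assumed) operators: A halves the state, B is the identity.
half = [[0.5, 0.0], [0.0, 0.5]]
identity = [[1.0, 0.0], [0.0, 1.0]]
x_next, y_next = state_transition([2.0, -4.0], half, identity, [0.0, 1.0], sphere)
```

In STA itself the matrices are random, so repeated calls from the same state would give different candidates; fixed matrices are used here only to keep the example deterministic.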

As a stochastic global optimization method, STA has the following properties:
 * globality: STA is able to search the whole space;
 * optimality: STA is guaranteed to find an optimal solution, at least a local one;
 * convergence: the sequence generated by STA is convergent;
 * rapidity: STA has inherent features that reduce the computational complexity;
 * controllability: STA can control the search space flexibly.

Continuous state transition algorithm (CSTA)
In continuous STA, $$ \mathbf{x}_k \in \mathbb{R}^n $$ is a continuous variable, and four special state transformation operators are designed to generate new candidate solutions.

State transformation operators
(1) Rotation transformation (RT)
 * $$ \mathbf{x}_{k+1} = \mathbf{x}_k + \alpha \frac{1}{n\|\mathbf{x}_k\|_2} R_r \mathbf{x}_k $$

where $$ \alpha $$ is a positive constant, called the rotation factor, $$ R_r \in \mathbb{R}^{n \times n} $$ is a random matrix whose entries are uniformly distributed on the interval [-1,1], and $$ \|\cdot\|_2 $$ is the 2-norm of a vector. The rotation transformation searches within a hypersphere of maximal radius $$ \alpha $$ centred at $$ \mathbf{x}_k $$, that is, $$ \|\mathbf{x}_{k+1} - \mathbf{x}_{k}\|_2 \leq \alpha $$. This holds because every entry of $$ R_r $$ has magnitude at most 1, so $$ \|R_r \mathbf{x}_k\|_2 \leq n \|\mathbf{x}_k\|_2 $$.
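
A minimal Python sketch of the rotation transformation follows, together with an empirical check of the radius bound. The function name `rotation`, the test point, and the handling of the zero vector (where the formula is undefined) are assumptions made for this illustration.

```python
import math
import random


def rotation(x, alpha, rng):
    """One candidate x + alpha / (n * ||x||_2) * R_r x, with R_r uniform on [-1, 1]."""
    n = len(x)
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        # Assumption: the formula divides by ||x||_2, so the origin is left unchanged.
        return list(x)
    # Random n-by-n matrix with independent entries uniform on [-1, 1].
    r = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    rx = [sum(r[i][j] * x[j] for j in range(n)) for i in range(n)]
    scale = alpha / (n * norm)
    return [x[i] + scale * rx[i] for i in range(n)]


rng = random.Random(0)
x = [3.0, -4.0, 12.0]
alpha = 0.5
# Distance moved by each of 1000 independent candidates.
steps = [math.dist(rotation(x, alpha, rng), x) for _ in range(1000)]
```

Every value in `steps` should stay at or below `alpha`, which is the hypersphere property stated above.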

(2) Translation transformation (TT)
 * $$ \mathbf{x}_{k+1} = \mathbf{x}_k + \beta R_t \frac{\mathbf{x}_k - \mathbf{x}_{k-1}}{\|\mathbf{x}_k - \mathbf{x}_{k-1}\|_2} $$

where $$ \beta $$ is a positive constant, called the translation factor, and $$ R_t \in \mathbb{R} $$ is a random variable uniformly distributed on the interval [0,1]. The translation transformation searches along the line from $$ \mathbf{x}_{k-1} $$ to $$ \mathbf{x}_k $$, starting at $$ \mathbf{x}_k $$ and moving a distance of at most $$ \beta $$.
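
The translation transformation can be sketched as follows. The function name `translation` and the choice to return the point unchanged when $$ \mathbf{x}_k = \mathbf{x}_{k-1} $$ (where the direction is undefined) are assumptions for this example.

```python
import math
import random


def translation(x, x_prev, beta, rng):
    """One candidate x + beta * R_t * (x - x_prev) / ||x - x_prev||_2."""
    diff = [a - b for a, b in zip(x, x_prev)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        # Assumption: with no previous move the direction is undefined, so stay put.
        return list(x)
    step = beta * rng.random() / norm  # rng.random() plays the role of R_t
    return [a + step * d for a, d in zip(x, diff)]


rng = random.Random(1)
x_prev, x, beta = [0.0, 0.0], [1.0, 1.0], 2.0
candidates = [translation(x, x_prev, beta, rng) for _ in range(100)]
```

With these inputs every candidate lies on the diagonal through `x_prev` and `x`, on the far side of `x`, and no further than `beta` from it.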

(3) Expansion transformation (ET)
 * $$ \mathbf{x}_{k+1} = \mathbf{x}_k + \gamma R_e \mathbf{x}_k $$

where $$ \gamma $$ is a positive constant, called the expansion factor, and $$ R_e \in \mathbb{R}^{n \times n} $$ is a random diagonal matrix whose entries follow the Gaussian distribution. The expansion transformation can expand each entry of $$ \mathbf{x}_k $$ to any value in the range $$ (-\infty, +\infty) $$, and thus searches the whole space.
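
A short sketch of the expansion transformation is given below. The text does not state the parameters of the Gaussian distribution, so a standard normal (mean 0, standard deviation 1) is assumed here; the function name `expansion` is also an assumption.

```python
import random


def expansion(x, gamma, rng):
    """One candidate x + gamma * R_e x, with R_e a random diagonal matrix."""
    # R_e is diagonal, so each coordinate is scaled by its own Gaussian draw.
    # Assumption: the draws are standard normal.
    return [v + gamma * rng.gauss(0.0, 1.0) * v for v in x]


rng = random.Random(2)
x = [1.0, -2.0, 0.0]
candidates = [expansion(x, 1.0, rng) for _ in range(5)]
```

Because the update is multiplicative, a coordinate that is exactly zero stays zero under this operator, which is visible in the last entry of each candidate.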

(4) Axesion transformation (AT)
 * $$ \mathbf{x}_{k+1} = \mathbf{x}_k + \delta R_a \mathbf{x}_k $$

where $$ \delta $$ is a positive constant, called the axesion factor, and $$ R_a \in \mathbb{R}^{n \times n} $$ is a random diagonal matrix with exactly one nonzero entry, located at a random position and following the Gaussian distribution. The axesion transformation aims to search along the coordinate axes.
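
The axesion transformation differs from the expansion transformation only in that a single coordinate is perturbed. The sketch below again assumes a standard normal draw and a uniformly chosen position, neither of which is specified in the text; the name `axesion` for the function is likewise illustrative.

```python
import random


def axesion(x, delta, rng):
    """One candidate x + delta * R_a x, where R_a has a single nonzero diagonal entry."""
    i = rng.randrange(len(x))  # position of the nonzero entry (assumed uniform)
    candidate = list(x)
    # Assumption: the nonzero entry is a standard normal draw.
    candidate[i] += delta * rng.gauss(0.0, 1.0) * x[i]
    return candidate


rng = random.Random(3)
x = [1.0, 2.0, 3.0, 4.0]
candidates = [axesion(x, 1.0, rng) for _ in range(50)]
# Number of coordinates that differ from x in each candidate.
changed = [sum(a != b for a, b in zip(c, x)) for c in candidates]
```

Each candidate differs from `x` in at most one coordinate, so the search moves parallel to one axis at a time.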

Regular neighbourhood and sampling
For a given solution $$ \mathbf{x}_k $$, a candidate solution $$ \mathbf{x}_{k+1} $$ is generated by applying one of the aforementioned state transformation operators once. Since the state transition matrix in each state transformation is random, the generated candidate solution is not unique. Around a given point, a regular neighbourhood is therefore formed automatically when a particular state transformation operator is applied repeatedly.

Since the entries of the state transition matrix follow a given probability distribution, for any given solution the new candidate is a random vector, and each realization of it can be regarded as a sample. Because the random state transition matrices drawn for a state transformation operator are mutually independent, the operator can be applied several times to the same given solution. The number of applications is called the degree of search enforcement ($$ SE $$ for short), and it yields $$ SE $$ samples.
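
Sampling amounts to calling one operator $$ SE $$ times on the same point. The sketch below illustrates this with the expansion transformation; the helper name `sample`, the operator signature, and the standard normal draw are assumptions for the example.

```python
import random


def expansion(x, gamma, rng):
    """Expansion transformation with an assumed standard normal diagonal."""
    return [v + gamma * rng.gauss(0.0, 1.0) * v for v in x]


def sample(best, operator, se, rng):
    """Apply the operator se times to the same point, giving se independent samples."""
    return [operator(best, rng) for _ in range(se)]


rng = random.Random(4)
best = [1.0, -1.0]
se = 30
samples = sample(best, lambda x, r: expansion(x, 1.0, r), se, rng)
```

Every call starts from `best` itself rather than from the previous sample, so the samples are independent draws from the neighbourhood of `best`.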

An update strategy
As mentioned above, a total of $$ SE $$ candidate solutions are sampled from the incumbent best solution. The best of these candidates according to the evaluation function is denoted $$ newBest $$. An update strategy based on the greedy criterion is then used to update the incumbent best solution:


 * $$ \text{Best} = \text{newBest} $$, if $$ f(\text{newBest}) < f(\text{Best}) $$;
 * $$ \text{Best} = \text{Best} $$, otherwise.
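
The greedy update can be written as a small function. The name `greedy_update` and the sphere objective are assumptions used only to make the example concrete; minimization is assumed, as in the rule above.

```python
def sphere(x):
    """Hypothetical objective used only for this illustration."""
    return sum(v * v for v in x)


def greedy_update(best, candidates, f):
    """Return the best candidate if it strictly improves on best, else best itself."""
    new_best = min(candidates, key=f)
    if f(new_best) < f(best):
        return new_best
    return best


best = [1.0, 1.0]
improved = greedy_update(best, [[2.0, 0.0], [0.5, 0.5]], sphere)
unchanged = greedy_update(best, [[2.0, 2.0], [1.0, -1.0]], sphere)
```

In the first call the candidate `[0.5, 0.5]` has a lower objective value and replaces the incumbent. In the second call no candidate is strictly better, so the incumbent is kept.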

Algorithm procedure of the basic continuous STA
With the state transformation operators, sampling technique and update strategy, the basic continuous STA can be described as follows:

Step 1: Initialize a random solution $$ Best $$ and set $$ \alpha = \alpha_{\max} = 1, \alpha_{\min} = 10^{-4}, $$ $$ \beta = 1, \gamma = 1, \delta = 1, fc = 2, k = 0; $$

Step 2: Generate $$ SE $$ samples from the incumbent $$ Best $$ using the expansion transformation, and denote the best of them by $$ newBest $$. If $$ f(newBest) < f(Best) $$, replace $$ Best $$ with $$ newBest $$ according to the greedy criterion, and then apply the translation transformation in the same way (sample, then update greedily) to try to improve $$ Best $$ further;

Step 3: Generate $$ SE $$ samples from the incumbent $$ Best $$ using the rotation transformation, and denote the best of them by $$ newBest $$. If $$ f(newBest) < f(Best) $$, replace $$ Best $$ with $$ newBest $$ according to the greedy criterion, and then apply the translation transformation in the same way to try to improve $$ Best $$ further;

Step 4: Generate $$ SE $$ samples from the incumbent $$ Best $$ using the axesion transformation, and denote the best of them by $$ newBest $$. If $$ f(newBest) < f(Best) $$, replace $$ Best $$ with $$ newBest $$ according to the greedy criterion, and then apply the translation transformation in the same way to try to improve $$ Best $$ further;

Step 5: Set $$ k = k + 1 $$. If $$ \alpha < \alpha_{\min} $$, set $$ \alpha = \alpha_{\max} $$; otherwise set $$ \alpha = \alpha / fc $$. Return to Step 2 and repeat until the maximum number of iterations is reached.
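
The five steps can be combined into a compact Python sketch. It is a simplified reading of the procedure rather than a reference implementation: the function names, the sphere test objective, the initial sampling box [-5, 5], the standard normal draws, and the defaults `se=30` and `max_iter=200` are all assumptions made for illustration.

```python
import math
import random


def sphere(x):
    """Hypothetical test objective, not taken from the text."""
    return sum(v * v for v in x)


def rotation(x, alpha, rng):
    n, norm = len(x), math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        return list(x)  # assumption: the formula is undefined at the origin
    scale = alpha / (n * norm)
    # Row i of R_r x uses fresh uniform draws on [-1, 1] for each entry.
    return [x[i] + scale * sum(rng.uniform(-1.0, 1.0) * v for v in x)
            for i in range(n)]


def translation(x, x_prev, beta, rng):
    diff = [a - b for a, b in zip(x, x_prev)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        return list(x)
    step = beta * rng.random() / norm
    return [a + step * d for a, d in zip(x, diff)]


def expansion(x, gamma, rng):
    return [v + gamma * rng.gauss(0.0, 1.0) * v for v in x]


def axesion(x, delta, rng):
    i = rng.randrange(len(x))
    cand = list(x)
    cand[i] += delta * rng.gauss(0.0, 1.0) * x[i]
    return cand


def search(f, best, operator, se, beta, rng):
    """Steps 2 to 4: sample, update greedily, then translate if improved."""
    new_best = min((operator(best) for _ in range(se)), key=f)
    if f(new_best) < f(best):
        old, best = best, new_best
        # Translation along the line from the old incumbent to the new one.
        moved = min((translation(best, old, beta, rng) for _ in range(se)), key=f)
        if f(moved) < f(best):
            best = moved
    return best


def basic_sta(f, n, se=30, max_iter=200, seed=0):
    rng = random.Random(seed)
    best = [rng.uniform(-5.0, 5.0) for _ in range(n)]  # Step 1, assumed start box
    alpha_max, alpha_min = 1.0, 1e-4
    alpha, beta, gamma, delta, fc = alpha_max, 1.0, 1.0, 1.0, 2.0
    history = [f(best)]
    for _ in range(max_iter):
        best = search(f, best, lambda x: expansion(x, gamma, rng), se, beta, rng)
        best = search(f, best, lambda x: rotation(x, alpha, rng), se, beta, rng)
        best = search(f, best, lambda x: axesion(x, delta, rng), se, beta, rng)
        alpha = alpha_max if alpha < alpha_min else alpha / fc  # Step 5
        history.append(f(best))
    return best, history


best, history = basic_sta(sphere, n=3)
```

The list `history` records the objective value of the incumbent after each iteration. Because of the greedy update it never increases, whatever the objective function is.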

Philosophy behind the continuous STA

 * The expansion transformation contributes to the globality since it has the functionality to search the whole space;
 * The rotation transformation contributes to optimality, since when $$ \alpha $$ is sufficiently small and no better candidate is found, the incumbent best solution can be regarded as a local optimal solution;
 * The update strategy based on the greedy criterion contributes to convergence: the sequence $$ \{f(\text{Best}_k)\}_{k=1}^\infty $$ is non-increasing because $$ f(\text{Best}_{k+1}) \leq f(\text{Best}_k) $$, so it converges by the monotone convergence theorem whenever $$ f $$ is bounded below;
 * The sampling technique (which avoids complete enumeration) and the alternating use of the state transformation operators help to reduce the computational complexity;
 * The parameters like $$ \alpha, \beta, \gamma, \delta $$ can be adjusted to control the search space.

Applications of STA
STA has found a variety of applications, including image segmentation, wind power prediction, energy consumption in the alumina evaporation process, resolution of overlapping linear sweep voltammetric peaks, PID controller design, feature selection, system modeling, and dynamic optimization. In these studies, STA is reported to be comparable to most existing global optimization methods.