Yoshimine sort

The Yoshimine sort is an algorithm that is used in quantum chemistry to order lists of two electron repulsion integrals. It is implemented in the IBM Alchemy program suite and in the UK R-matrix package for electron and positron scattering by molecules which is based on the early versions of the IBM Alchemy program suite.

Use of basis set expansions in quantum chemistry
In quantum chemistry, it is common practice to represent one electron functions in terms of an expansion over a basis set, $$\chi_i$$. The most common choice for this basis set is Gaussian orbitals (GTOs) however for linear molecules Slater orbitals (STOs) can be used.

The Schrödinger equation, for a system with two or more electrons, includes the Coulomb repulsion operator. In the basis set expansion approach this leads to the requirement to compute two electron repulsion integrals involving four basis functions. Any given basis set may be ordered so that each function can assigned a unique index. So, for any given basis set, each two electron integral can be described by four indices, that is the indices of the four basis functions involved. It is customary to denote these indices as p,q,r and s and the integral as (pq|rs). Assuming that $$\chi_i$$ are real functions, the (pq|rs) are defined by



(pq|rs) = \int\int \frac{ \chi_p( \mathbf{r}_1 ) \; \chi_q( \mathbf{r}_1 ) \; \chi_r( \mathbf{r}_2 ) \; \chi_s( \mathbf{r}_2 ) \; }                                   {                                       \mid \mathbf{r}_1  - \mathbf{r}_2 \mid }                                    \;\; d\mathbf{r}_1 \; d\mathbf{r}_2 $$

The number of two electron integrals that must be computed for any basis set depends on the number of functions in the basis set and on the symmetry point group of the molecule being studied.

Permutational symmetry of the indices
The computed two electron integrals are real numbers, $$(pq|rs)\in\mathbb{R}$$, and this implies certain permutational symmetry properties on the indices p,q,r and s. The exact details depend on whether the part of the basis function representing angular behavior is real or complex. For Gaussian orbitals real spherical harmonics are generally used whereas for Slater orbitals the complex spherical harmonics are used.

In the case of real orbitals, p can be swapped with q without changing the integral value, or independently r with s. in addition pq as a pair can be swapped with rs as a pair without changing the integral. Putting these interchanges together means that

\begin{matrix} (pq|rs) = & (qp|rs) \\ & (pq|sr) \\ & (qp|sr) \\ & (rs|pq) \\ & (sr|pq) \\ & (rs|qp) \\ & (sr|qp) \end{matrix} $$ which is eightfold symmetry. If the molecule has no spatial symmetry, in other words it belongs to the $$C_1$$ point group which has only one irreducible representation, then the permutational symmetry of the integrals indices is the only operation which can be applied. On the other hand, if the molecule has some symmetry operations, then further ordering is possible. The impact of the above symmetry relationship is that an integral can be computed once, but corresponds to eight different index combinations.

Point group symmetry of the system
The Schrödinger Hamiltonian commutes with the operations of the point symmetry group of the nuclear framework of the molecule. This means that a two electron integral can be non-zero only if the product of the four functions transforms, or contains a component which transforms, as the totally symmetric irreducible representation of the symmetry point group to which the molecule belongs.

This means that a computer program for two electron integral processing can precompute the list of basis function symmetry combinations (symmetry blocks) for which integrals may be non zero and ignore all other symmetry combinations. The list of symmetry blocks can also be ordered. Frequently, the totally symmetric irreducible representation is assigned the lowest index in the list, typically 1 in Fortran or 0 in the C programming language. Within any given symmetry block, the permutational symmetry of the integrals still applies and the integrals can be ordered within that block. For example if the molecule belongs to the $$C_2$$ point group, which has irreducible representations $$A$$ and $$B$$ then integral blocks for the following symmetry combinations are non-zero

\begin{matrix} ( \; A \; A \; | \; A \; A \; ) \\ ( \; B \; B \; | \; B \; B \; ) \\ ( \; A \; A \; | \; B \; B \; ) \\ ( \; A \; B \; | \; A \; B \; ) \\ \end{matrix} $$ and integrals blocks for any other symmetry combination are identically zero by group theory. Thus two types of ordering can be used:


 * the non-zero symmetry blocks of two electron integrals are ordered (the programmer is at liberty to define this order) the dimension of each block can be computed since the number of basis functions of each symmetry is known.
 * within each non-zero block, integrals are ordered according to the above symmetry of indices.

This means that given the four indices pqrs defining a two electron integral, a unique index may be computed. This is the essence of the Yoshimine procedure.

Yoshimine's sorting procedure
When the integrals are computed by the integrals program they are written out to a sequential file along with the p,q,r,s indices which define them. The order in which the integrals are computed is defined by the algorithm used in the integration program. The most efficient algorithms do not compute the integrals in order, that is such that the p,q,r and s indices are ordered.

This would not be a problem is all of the integrals could be held in CPU memory simultaneously. In that case the computed integral can be assigned into its position in the array of two electron integrals by computing the required index from the p,q,r and s indices. In the 1960s it was essentially impossible to hold all of the two electron integrals in memory simultaneously. Therefore, M Yoshimine developed a sorting algorithm for two-electron integrals which reads the unordered list of integrals from a files and transforms it into an ordered list which is then written to another file. A by-product of this is that the file storing the ordered integrals does not need to contain the p,q,r,s indices for each integral. The ordering process uses a direct access file but the input and output files of integrals are sequential.

At the start of the 21st century, computer memory is much larger and for small molecules and/or small basis sets it is sometimes possible to hold all two electron integrals in memory. In general however, the Yoshimine algorithm is still required.