Determination of equilibrium constants

Equilibrium constants are determined in order to quantify chemical equilibria. When an equilibrium constant $K$ is expressed as a concentration quotient,
 * $$K=\frac{\mathrm{[S]} ^\sigma \mathrm{[T]}^\tau \cdots } {\mathrm{[A]}^\alpha \mathrm{[B]}^\beta \cdots }$$

it is implied that the activity quotient is constant. For this assumption to be valid, equilibrium constants must be determined in a medium of relatively high ionic strength. Where this is not possible, consideration should be given to possible activity variation. The equilibrium expression above is a function of the concentrations [A], [B] etc. of the chemical species in equilibrium. The equilibrium constant value can be determined if any one of these concentrations can be measured. The general procedure is that the concentration in question is measured for a series of solutions with known analytical concentrations of the reactants. Typically, a titration is performed with one or more reactants in the titration vessel and one or more reactants in the burette. Knowing the analytical concentrations of reactants initially in the reaction vessel and in the burette, all analytical concentrations can be derived as a function of the volume (or mass) of titrant added.

The equilibrium constants may be derived by best-fitting of the experimental data with a chemical model of the equilibrium system.

Experimental methods
There are four main experimental methods. For less commonly used methods, see Rossotti and Rossotti. In all cases the range can be extended by using the competition method. An example of the application of this method can be found in palladium(II) cyanide.

Potentiometric measurements
A free concentration [A] or activity {A} of a species A is measured by means of an ion selective electrode such as the glass electrode. If the electrode is calibrated using activity standards it is assumed that the Nernst equation applies in the form
 * $$ E=E^0+\frac{RT}{nF}\ln\mathrm{\{A\}}$$

where $E^{0}$ is the standard electrode potential. When buffer solutions of known pH are used for calibration the meter reading will be a pH.
 * $$\mathrm{pH}=\frac{nF}{RT\ln 10}\left(E^0-E\right)$$

At 298 K, 1 pH unit is approximately equal to 59 mV.
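
The 59 mV figure can be checked by evaluating the factor $RT\ln 10/nF$ directly. The following is a minimal sketch; the function name `nernst_slope_mV` is illustrative only and is not taken from any program mentioned in this article.

```python
import math

R = 8.314462618   # molar gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1

def nernst_slope_mV(temperature_K, n=1):
    """Ideal electrode response, RT ln(10) / (nF), in mV per decade of activity."""
    return 1000.0 * R * temperature_K * math.log(10.0) / (n * F)

slope_298 = nernst_slope_mV(298.0)
print(f"{slope_298:.1f} mV per pH unit at 298 K")
```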

When the electrode is calibrated with solutions of known concentration, by means of a strong acid–strong base titration, for example, a modified Nernst equation is assumed.
 * $$E=E^0 + s\log_{10}\mathrm{[A]}$$

where $s$ is an empirical slope factor. A solution of known hydrogen ion concentration may be prepared by standardization of a strong acid against borax. Constant-boiling hydrochloric acid may also be used as a primary standard for hydrogen ion concentration.
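
A calibration of this kind amounts to a straight-line fit of the measured e.m.f. against $\log_{10}$ of the known concentration. The sketch below assumes hypothetical calibration readings (the numbers are invented for illustration) and uses NumPy's `polyfit` to estimate $E^{0}$ and $s$.

```python
import numpy as np

# Hypothetical calibration data: known [H+] in mol/L and measured e.m.f. in mV.
h_conc = np.array([1e-2, 1e-3, 1e-4, 1e-5])
emf_mV = np.array([280.1, 221.3, 162.0, 103.2])

# Fit E = E0 + s*log10[H]; polyfit returns the slope first, then the intercept.
s, e0 = np.polyfit(np.log10(h_conc), emf_mV, 1)
print(f"s = {s:.2f} mV per decade, E0 = {e0:.2f} mV")
```

A fitted slope close to the ideal Nernstian value is commonly taken as a sign that the electrode is behaving well over the calibrated range.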

Range and limitations
The most widely used electrode is the glass electrode, which is selective for the hydrogen ion. This is suitable for all acid–base equilibria. $\log_{10} β$ values between about 2 and 11 can be measured directly by potentiometric titration using a glass electrode. This enormous range of stability constant values (ca. $10^{2}$ to $10^{11}$) is possible because of the logarithmic response of the electrode. The limitations arise because the Nernst equation breaks down at very low or very high pH.

When a glass electrode is used to obtain the measurements on which the calculated equilibrium constants depend, the precision of the calculated parameters is limited by secondary effects such as variation of liquid junction potentials in the electrode. In practice it is virtually impossible to obtain a precision for log β better than ±0.001.

Absorbance
It is assumed that the Beer–Lambert law applies.


 * $$A=l \sum {\varepsilon c}$$

where $l$ is the optical path length, $ε$ is a molar absorbance at unit path length and $c$ is a concentration. More than one of the species may contribute to the absorbance. In principle absorbance may be measured at one wavelength only, but in present-day practice it is common to record complete spectra.

Range and limitations
An upper limit on $\log_{10} β$ of 4 is usually quoted, corresponding to the precision of the measurements, but it also depends on how intense the effect is. Spectra of contributing species should be clearly distinct from each other.

Fluorescence (luminescence) intensity
It is assumed that the emitted light intensity is a linear function of the species' concentrations.


 * $$I=\sum \varphi c $$

where $φ$ is a proportionality constant.

Range and limitations
The magnitude of the constant $φ$ may be higher than the value of the molar extinction coefficient, $ε$, for a species. When this is so, the detection limit for that species will be lower. At high solute concentrations, fluorescence intensity becomes non-linear with respect to concentration due to self-absorption of the emitted radiation.

NMR chemical shift measurements
Chemical exchange is assumed to be rapid on the NMR time-scale. Each observed chemical shift $\bar{δ}$ is the mole-fraction-weighted average of the shifts $δ_{i}$ of nuclei in the contributing species.
 * $$\bar {\delta} =\frac{\sum x_i \delta_i}{\sum x_i}$$

Example: the pKa of the hydroxyl group in citric acid has been determined from 13C chemical shift data to be 14.4. Neither potentiometry nor ultraviolet–visible spectroscopy could be used for this determination.

Range and limitations
Limited precision of chemical shift measurements also puts an upper limit of about 4 on $\log_{10} β$. The method is limited to diamagnetic systems. 1H NMR cannot be used with solutions of compounds in 1H2O.

Calorimetric measurements
Simultaneous measurement of $K$ and $ΔH$ for 1:1 adducts is routinely carried out using isothermal titration calorimetry. Extension to more complex systems is limited by the availability of suitable software.

Range and limitations
Insufficient evidence is currently available.

The competition method
The competition method may be used when a stability constant value is too large to be determined by a direct method. It was first used by Schwarzenbach in the determination of the stability constants of complexes of EDTA with metal ions.

For simplicity consider the determination of the stability constant $K_{AB}$ of a binary complex, AB, of a reagent A with another reagent B.
 * $$K_{AB}=\frac{[AB]}{[A][B]}$$

where the [X] represents the concentration, at equilibrium, of a species X in a solution of given composition.

A ligand C is chosen which forms a weaker complex with A. The stability constant, $K_{AC}$, is small enough to be determined by a direct method. For example, in the case of EDTA complexes A is a metal ion and C may be a polyamine such as diethylenetriamine.
 * $$K_{AC}=\frac{[AC]}{[A][C]}$$

The equilibrium constant, $K$, for the competition reaction
 * $$AC + B \leftrightharpoons AB +C$$

can be expressed as
 * $$K=\frac{[AB][C]}{[AC][B]}$$

It follows that
 * $$K_{AB}=K \times K_{AC}$$

Thus, the value of the stability constant $K_{AB}$ may be derived from the experimentally determined values of $K$, the constant for the competition reaction, and $K_{AC}$.
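
The relation $K_{AB} = K \times K_{AC}$ can be verified numerically. The sketch below uses invented equilibrium concentrations, chosen only to check the algebra; the function name is hypothetical.

```python
import math

def stability_constant_from_competition(k_competition, k_ac):
    """K_AB = K * K_AC for the competition reaction AC + B <=> AB + C."""
    return k_competition * k_ac

# Hypothetical equilibrium concentrations (mol/L), for illustration only.
a, b, c = 1.0e-9, 2.0e-4, 3.0e-4
ab, ac = 5.0e-4, 1.5e-4

k_ac = ac / (a * c)             # small enough to be measured directly
k_comp = (ab * c) / (ac * b)    # obtained from the competition experiment
k_ab = stability_constant_from_competition(k_comp, k_ac)

print(f"log10 K_AB = {math.log10(k_ab):.2f}")
```

The product agrees with the value of [AB]/([A][B]) computed from the same concentrations, although in a real experiment the very small free concentration [A] would not be directly measurable.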

Computational methods
It is assumed that the collected experimental data comprise a set of data points. At each $i$th data point, the analytical concentrations of the reactants, $T_{A}(i)$, $T_{B}(i)$ etc. are known along with a measured quantity, $y_{i}$, that depends on one or more of these analytical concentrations. A general computational procedure has four main components:
 * 1) Definition of a chemical model of the equilibria
 * 2) Calculation of the concentrations of all the chemical species in each solution
 * 3) Refinement of the equilibrium constants
 * 4) Model selection

The value of the equilibrium constant for the formation of a 1:1 complex, such as a host-guest species, may be calculated with a dedicated spreadsheet application, Bindfit. In this case step 2 can be performed with a non-iterative procedure and the pre-programmed routine Solver can be used for step 3.

The chemical model
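The chemical model consists of a set of chemical species present in solution, both the reactants added to the reaction mixture and the complex species formed from them. Denoting the reactants by A, B..., each complex species is specified by the stoichiometric coefficients that relate the particular combination of reactants forming them.
 * $$p\mathrm{A}+q\mathrm{B}\cdots \leftrightharpoons \mathrm{A}_p\mathrm{B}_q\cdots; \quad \beta_{pq\cdots}=\frac{[\mathrm{A}_p\mathrm{B}_q\cdots]}{[\mathrm A]^p[\mathrm B]^q\cdots}$$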

When using general-purpose computer programs, it is usual to use cumulative association constants, as shown above. Electrical charges are not shown in general expressions such as this and are often omitted from specific expressions, for simplicity of notation. In fact, electrical charges have no bearing on the equilibrium processes other than the requirement for overall electrical neutrality in all systems.

With aqueous solutions the concentrations of proton (hydronium ion) and hydroxide ion are constrained by the self-dissociation of water.
 * $$K_\mathrm{W}' = \frac{[\mathrm{H^+}][\mathrm{OH^-}]}{[\mathrm{H_2O}]}$$

With dilute solutions the concentration of water is assumed constant, so the equilibrium expression is written in the form of the ionic product of water.
 * $$K_\mathrm{W} = [\mathrm{H^+}][\mathrm{OH^-}]$$

When both H+ and OH− must be considered as reactants, one of them is eliminated from the model by specifying that its concentration be derived from the concentration of the other. Usually the concentration of the hydroxide ion is given by
 * $$[\mathrm{OH^-}] = \frac{K_\mathrm{W}}{[\mathrm{H^+}]}$$

In this case the equilibrium constant for the formation of hydroxide has the stoichiometric coefficients −1 in regard to the proton and zero for the other reactants. This has important implications for all protonation equilibria in aqueous solution and for hydrolysis constants in particular.

It is quite usual to omit from the model those species whose concentrations are considered negligible. For example, it is usually assumed that there is no interaction between the reactants and/or complexes and the electrolyte used to maintain constant ionic strength or the buffer used to maintain constant pH. These assumptions may or may not be justified. Also, it is implicitly assumed that there are no other complex species present. When complexes are wrongly ignored a systematic error is introduced into the calculations.

Equilibrium constant values are usually estimated initially by reference to data sources.

Speciation calculations
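A speciation calculation is one in which concentrations of all the species in an equilibrium system are calculated, knowing the analytical concentrations, $T_{A}$, $T_{B}$ etc. of the reactants A, B etc. With $β_{pq\cdots}$ denoting the cumulative association constant of the complex $A_{p}B_{q}\cdots$ and the sums taken over all complexes, this means solving a set of nonlinear equations of mass-balance
 * $$T_\mathrm{A}=[\mathrm A]+\sum p\,\beta_{pq\cdots}[\mathrm A]^p[\mathrm B]^q\cdots$$
 * $$T_\mathrm{B}=[\mathrm B]+\sum q\,\beta_{pq\cdots}[\mathrm A]^p[\mathrm B]^q\cdots$$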

for the free concentrations [A], [B] etc. When the pH (or the equivalent e.m.f., $E$) is measured, the free concentration of hydrogen ions, [H], is obtained from the measured value as $[\mathrm H]=10^{-\mathrm{pH}}$ or $[\mathrm H]=e^{\frac{nF}{RT}(E-E^0)}$, and only the free concentrations of the other reactants are calculated. The concentrations of the complexes are derived from the free concentrations via the chemical model.

Some authors include the free reactant terms in the sums by declaring identity (unit) $β$ constants for which the stoichiometric coefficients are 1 for the reactant concerned and zero for all other reactants. For example, with 2 reagents, the mass-balance equations assume the simpler form
 * $$T_\mathrm{A}=\sum p\,\beta_{pq}[\mathrm A]^p[\mathrm B]^q$$
 * $$T_\mathrm{B}=\sum q\,\beta_{pq}[\mathrm A]^p[\mathrm B]^q$$

In this manner, all chemical species, including the free reactants, are treated in the same way, having been formed from the combination of reactants that is specified by the stoichiometric coefficients.

In a titration system the analytical concentrations of the reactants at each titration point are obtained from the initial conditions, the burette concentrations and volumes. The analytical (total) concentration of a reactant R at the $i$th titration point is given by
 * $$T_\mathrm{R}=\frac{R_0+v_i[\mathrm R]}{v_0+v_i}$$
where $R_{0}$ is the initial amount of R in the titration vessel, $v_{0}$ is the initial volume, [R] is the concentration of R in the burette and $v_{i}$ is the volume added. The burette concentration of a reactant not present in the burette is taken to be zero.
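
As an illustration, the analytical concentration $(R_{0} + v_{i}[\mathrm R])/(v_{0} + v_{i})$ can be evaluated at each titration point with a one-line function. The titration conditions below are hypothetical and the function name is an assumption made for this sketch.

```python
def analytical_concentration(r0_mol, v0, burette_conc, v_added):
    """T_R = (R0 + v_i [R]) / (v0 + v_i); volumes in L, R0 in mol, [R] in mol/L."""
    return (r0_mol + v_added * burette_conc) / (v0 + v_added)

# Hypothetical titration: 1.0 mmol of acid in 50 mL, titrated with 0.1 M base.
v0 = 0.050
added = [0.000, 0.005, 0.010]   # litres of titrant added
# The acid is absent from the burette, so its burette concentration is zero.
t_acid = [analytical_concentration(1.0e-3, v0, 0.0, v) for v in added]
# The base is absent from the vessel initially, so its initial amount is zero.
t_base = [analytical_concentration(0.0, v0, 0.1, v) for v in added]
```

The acid is progressively diluted while the total base concentration rises, which is the behaviour expected of the formula.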

In general, solving these nonlinear equations presents a formidable challenge because of the huge range over which the free concentrations may vary. At the beginning, values for the free concentrations must be estimated. Then, these values are refined, usually by means of Newton–Raphson iterations. The logarithms of the free concentrations may be refined rather than the free concentrations themselves. Refinement of the logarithms of the free concentrations has the added advantage of automatically imposing a non-negativity constraint on the free concentrations. Once the free reactant concentrations have been calculated, the concentrations of the complexes are derived from them and the equilibrium constants.
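
The procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of any particular program: free reactants are included as species with unit constants, the initial estimate is simply free = total, and large Newton–Raphson shifts are clipped as a crude safeguard. The function name `speciate` and the example system are hypothetical.

```python
import numpy as np

def speciate(log_beta, stoich, totals, max_iter=200, tol=1e-12):
    """Solve the mass-balance equations for one solution.

    log_beta : log10 cumulative constants, one per species (0 for free reactants)
    stoich   : (n_species, n_reactants) stoichiometric coefficients
    totals   : analytical concentrations T_A, T_B, ... (all > 0 here)
    Returns the concentrations of all species.
    """
    ln_beta = np.asarray(log_beta, dtype=float) * np.log(10.0)
    stoich = np.asarray(stoich, dtype=float)
    totals = np.asarray(totals, dtype=float)
    ln_free = np.log(totals)              # crude initial estimate: free = total
    for _ in range(max_iter):
        conc = np.exp(ln_beta + stoich @ ln_free)   # all species, always positive
        resid = stoich.T @ conc - totals            # mass-balance errors
        if np.max(np.abs(resid) / totals) < tol:
            break
        # d(T_k calc)/d(ln[free_m]) = sum_j p_jk p_jm c_j
        jac = stoich.T @ (stoich * conc[:, None])
        step = np.linalg.solve(jac, -resid)
        ln_free += np.clip(step, -2.0, 2.0)         # damp large shifts
    return conc

# Hypothetical system: A + B <=> AB with log10 beta = 4, T_A = T_B = 1 mM.
species = [[1, 0], [0, 1], [1, 1]]      # A, B, AB
conc = speciate([0.0, 0.0, 4.0], species, [1.0e-3, 1.0e-3])
```

Because the refinement is carried out on the logarithms, every concentration returned is positive by construction.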

Note that the free reactant concentrations can be regarded as implicit parameters in the equilibrium constant refinement process. In that context the values of the free concentrations are constrained by forcing the conditions of mass-balance to apply at all stages of the process.

Equilibrium constant refinement
The objective of the refinement process is to find equilibrium constant values that give the best fit to the experimental data. This is usually achieved by minimising an objective function, $U$, by the method of non-linear least-squares. First the residuals are defined as
 * $$r_i=y_i^\mathrm{obs}-y_i^\mathrm{calc}$$

Then the most general objective function is given by
 * $$U=\sum_i\sum_j r_iW_{ij}r_j$$

The matrix of weights, $W$, should be, ideally, the inverse of the variance-covariance matrix of the observations. It is rare for this to be known. However, when it is, the expectation value of $U/(n_{d} − n_{p})$, where $n_{d}$ is the number of data points and $n_{p}$ is the number of refined parameters, is one, which means that the data are fitted within experimental error. Most often only the diagonal elements are known, in which case the objective function simplifies to
 * $$U=\sum_i W_{ii}r_i^2$$

with $W_{ij} = 0$ when $j ≠ i$. Unit weights, $W_{ii} = 1$, are often used but, in that case, the expectation value of $U/(n_{d} − n_{p})$ is the mean square of the experimental errors.
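
The three forms of the objective function (unit weights, diagonal weights and a full weight matrix) can be evaluated with a single helper. This is a sketch with invented data; the function name `objective` is an assumption.

```python
import numpy as np

def objective(y_obs, y_calc, weights=None):
    """U = r^T W r, with r = y_obs - y_calc.

    weights=None -> unit weights; 1-D array -> diagonal of W; 2-D array -> full W.
    """
    r = np.asarray(y_obs, dtype=float) - np.asarray(y_calc, dtype=float)
    if weights is None:
        return float(r @ r)
    w = np.asarray(weights, dtype=float)
    if w.ndim == 1:
        return float(np.sum(w * r**2))
    return float(r @ w @ r)

# Hypothetical observed and calculated values.
y_obs = [1.0, 2.0, 3.0]
y_calc = [1.1, 1.9, 3.2]
u_unit = objective(y_obs, y_calc)
u_diag = objective(y_obs, y_calc, weights=[1.0, 2.0, 3.0])
u_full = objective(y_obs, y_calc, weights=np.diag([1.0, 2.0, 3.0]))
```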

The minimization may be performed using the Gauss–Newton method. Firstly the objective function is linearised by approximating it as a first-order Taylor series expansion about an initial parameter set, $p$.
 * $$U=U^0+\sum_i\frac{\partial U}{\partial p_i}\delta p_i$$

The increments $δp_{i}$ are added to the corresponding initial parameters such that $U$ is less than $U^{0}$. At the minimum the derivatives $∂U⁄∂p_{i}$, which are simply related to the elements of the Jacobian matrix, $J$


 * $$J_{jk}=\frac{\partial y_j^\mathrm{calc}}{\partial p_k}$$

where $p_{k}$ is the $k$th parameter of the refinement, are equal to zero. One or more equilibrium constants may be parameters of the refinement. However, the measured quantities (see above) represented by $y$ are not expressed in terms of the equilibrium constants, but in terms of the species concentrations, which are implicit functions of these parameters. Therefore, the Jacobian elements must be obtained using implicit differentiation.

The parameter increments $δp$ are calculated by solving the normal equations, derived from the conditions that $∂U⁄∂p = 0$ at the minimum.
 * $$\left(J^{T}WJ\right)\delta p=J^{T}Wr$$

The increments $δp$ are added iteratively to the parameters
 * $$p^{n+1}=p^{n}+\delta p$$

where $n$ is an iteration number. The species concentrations and $y^{calc}$ values are recalculated at every data point. The iterations are continued until no significant reduction in $U$ is achieved, that is, until a convergence criterion is satisfied. If, however, the updated parameters do not result in a decrease of the objective function, that is, if divergence occurs, the increment calculation must be modified. The simplest modification is to use a fraction, $f$, of the calculated increment, so-called shift-cutting.
 * $$p^{n+1}=p^{n}+f\,\delta p$$

In this case, the direction of the shift vector, $δp$, is unchanged. With the more powerful Levenberg–Marquardt algorithm, on the other hand, the shift vector is rotated towards the direction of steepest descent, by modifying the normal equations,
 * $$\left(J^{T}WJ+\lambda I\right)\delta p=J^{T}Wr$$

where $λ$ is the Marquardt parameter and $I$ is an identity matrix. Other methods of handling divergence have been proposed.
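
A minimal sketch of a Gauss–Newton refinement with shift-cutting is given here for the simplest case, a single constant for a 1:1 complex with unit weights. Several simplifications are assumptions of this sketch rather than features of the programs discussed: the measured quantity is taken to be [AB] itself, the speciation is obtained from the quadratic solution of the mass balances, the data are synthetic and noise-free, and the Jacobian is obtained by finite differences rather than by implicit differentiation.

```python
import numpy as np

def complex_conc(log_k, t_a, t_b):
    """[AB] for A + B <=> AB, from the quadratic solution of the mass balances."""
    k = 10.0 ** log_k
    s = t_a + t_b + 1.0 / k
    return 0.5 * (s - np.sqrt(s * s - 4.0 * t_a * t_b))

def refine_log_k(y_obs, t_a, t_b, log_k, max_iter=50, tol=1e-10):
    """Gauss-Newton refinement of log10 K (unit weights) with shift-cutting."""
    resid = y_obs - complex_conc(log_k, t_a, t_b)
    u = float(resid @ resid)
    for _ in range(max_iter):
        # Jacobian column d(y_calc)/d(log K) by central differences.
        h = 1.0e-5
        jac = ((complex_conc(log_k + h, t_a, t_b)
                - complex_conc(log_k - h, t_a, t_b)) / (2.0 * h))[:, None]
        # Normal equations (J^T J) dp = J^T r
        dp = np.linalg.solve(jac.T @ jac, jac.T @ resid)[0]
        f, improved = 1.0, False
        while f > 1.0e-6:
            trial = log_k + f * dp
            trial_resid = y_obs - complex_conc(trial, t_a, t_b)
            trial_u = float(trial_resid @ trial_resid)
            if trial_u < u:
                improved = True
                break
            f *= 0.5            # shift-cutting: same direction, shorter step
        if not improved:
            break               # no decrease in U could be found: stop
        shift = abs(f * dp)
        log_k, resid, u = trial, trial_resid, trial_u
        if shift < tol:
            break
    return log_k, u

# Synthetic, noise-free data generated with log10 K = 4 (hypothetical system).
t_a = np.full(10, 1.0e-3)
t_b = np.linspace(2.0e-4, 2.0e-3, 10)
y_obs = complex_conc(4.0, t_a, t_b)

log_k_fit, u_min = refine_log_k(y_obs, t_a, t_b, log_k=3.0)
```

Starting from an initial estimate one log unit away, the refinement should recover the value used to generate the data. A Levenberg–Marquardt variant would differ only in adding $λI$ to the left-hand side of the normal equations.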

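A particular issue arises with NMR and spectrophotometric data. For the latter, the observed quantity is absorbance, $A$, and the Beer–Lambert law can be written as
 * $$A_\lambda=l\sum_k \varepsilon_{k,\lambda}\,c_k$$

It can be seen that, assuming that the concentrations, $c$, are known, the absorbance, $A$, at a given wavelength, $λ$, and path length, $l$, is a linear function of the molar absorptivities, $ε$. With 1 cm path-length, in matrix notation
 * $$\mathbf{A}=\mathbf{C}\boldsymbol{\varepsilon}$$
where each row of $\mathbf{A}$ and $\mathbf{C}$ corresponds to one solution and each column of $\mathbf{A}$ and $\boldsymbol{\varepsilon}$ to one wavelength.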

There are two approaches to the calculation of the unknown molar absorptivities:
 * (1) The $ε$ values are considered parameters of the minimization and the Jacobian is constructed on that basis. However, the $ε$ values themselves are calculated at each step of the refinement by linear least-squares:
 * $$\boldsymbol{\varepsilon}=\left(\mathbf{C}^\mathrm{T}\mathbf{C}\right)^{-1}\mathbf{C}^\mathrm{T}\mathbf{A}$$
 * using the refined values of the equilibrium constants to obtain the speciation. The matrix $\left(\mathbf{C}^\mathrm{T}\mathbf{C}\right)^{-1}\mathbf{C}^\mathrm{T}$ is an example of a pseudo-inverse.
 * Golub and Pereyra showed how the pseudo-inverse can be differentiated so that parameter increments for both molar absorptivities and equilibrium constants can be calculated by solving the normal equations.
 * (2) The Beer–Lambert law is written separately for each wavelength, $λ$, as
 * $$\mathbf{a}_\lambda=\mathbf{C}\boldsymbol{\varepsilon}_\lambda$$
 * The unknown molar absorptivities of all "coloured" species are found by using the non-iterative method of linear least-squares, one wavelength at a time. The calculations are performed once every refinement cycle, using the stability constant values obtained at that refinement cycle to calculate the species' concentration values in the matrix $\mathbf{C}$.
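
The linear least-squares step for the molar absorptivities can be sketched with NumPy. The layout used here is an assumption of the sketch: absorbances are stored with one row per solution and one column per wavelength, so that the absorbance matrix equals the concentration matrix multiplied by the absorptivity matrix for a 1 cm path. All numbers are invented.

```python
import numpy as np

# Hypothetical speciation: 5 solutions, 2 absorbing species (mol/L).
conc = np.array([[1.0e-4, 0.0],
                 [7.5e-5, 2.5e-5],
                 [5.0e-5, 5.0e-5],
                 [2.5e-5, 7.5e-5],
                 [0.0,    1.0e-4]])

# Hypothetical molar absorptivities: 2 species x 3 wavelengths.
eps_true = np.array([[1000.0, 5000.0,  200.0],
                     [4000.0,  800.0, 2500.0]])

absorbance = conc @ eps_true          # simulated, noise-free absorbances

# Least-squares solution for all wavelengths at once (each column is independent).
eps_fit, *_ = np.linalg.lstsq(conc, absorbance, rcond=None)

# The same result via the explicit pseudo-inverse (C^T C)^-1 C^T.
eps_pinv = np.linalg.pinv(conc) @ absorbance
```

In a refinement the matrix `conc` would be recomputed from the current stability constants at every cycle before this step is repeated. Note that nothing in this calculation prevents a fitted absorptivity from being negative when the data are noisy.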

Parameter errors and correlation
In the region close to the minimum of the objective function, $U$, the system approximates to a linear least-squares system, for which
 * $$p=\left(J^{T}WJ\right)^{-1}J^{T}Wy^{obs}$$

Therefore, the parameter values are (approximately) linear combinations of the observed data values and the errors on the parameters, $p$, can be obtained by error propagation from the observations, $y^{obs}$, using the linear formula. Let the variance-covariance matrix for the observations be denoted by $Σ^{y}$ and that of the parameters by $Σ^{p}$. Then,
 * $$\Sigma^{p}=\left(J^{T}WJ\right)^{-1}J^{T}W\,\Sigma^{y}\,W^{T}J\left(J^{T}WJ\right)^{-1}$$

When $W = (Σ^{y})^{−1}$, this simplifies to
 * $$\Sigma^{p}=\left(J^{T}WJ\right)^{-1}$$

In most cases the errors on the observations are uncorrelated, so that $Σ^{y}$ is diagonal. If so, each weight should be the reciprocal of the variance of the corresponding observation. For example, in a potentiometric titration, the weight at a titration point, $k$, can be given by
 * $$W_k=\frac{1}{\sigma_E^2+\left(\frac{\partial E}{\partial v}\right)_k^2\sigma_v^2}$$

where $σ_{E}$ is the error in electrode potential or pH, $(∂E⁄∂v)_{k}$ is the slope of the titration curve and $σ_{v}$ is the error on added volume.
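
A weighting scheme of this kind, in which the variance at each point is $σ_{E}^{2}$ plus the squared slope times $σ_{v}^{2}$, can be sketched as follows. The titration curve is invented, and the slope is estimated by finite differences with `numpy.gradient`, which is a choice made for this sketch.

```python
import numpy as np

def titration_weights(volume, emf, sigma_e, sigma_v):
    """W_k = 1 / (sigma_E^2 + (dE/dv)_k^2 sigma_v^2) at each titration point."""
    slope = np.gradient(emf, volume)     # finite-difference estimate of dE/dv
    return 1.0 / (sigma_e**2 + (slope * sigma_v) ** 2)

# Hypothetical titration curve with a steep region near the end-point.
volume = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])              # mL
emf = np.array([150.0, 145.0, 135.0, 20.0, -100.0, -110.0])    # mV
weights = titration_weights(volume, emf, sigma_e=0.1, sigma_v=0.005)
```

Points on the steep part of the curve receive the smallest weights, because a small volume error there produces a large error in the measured potential.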

When unit weights are used ($W = I$, $p = (J^{T}J)^{−1}J^{T}y$) it is implied that the experimental errors are uncorrelated and all equal: $Σ^{y} = σ^{2}I$, where $σ^{2}$ is known as the variance of an observation of unit weight, and $I$ is an identity matrix. In this case $σ^{2}$ is approximated by
 * $$\sigma^2=\frac{U}{n_d-n_p}$$
where $U$ is the minimum value of the objective function and $n_{d}$ and $n_{p}$ are the number of data and parameters, respectively.

In all cases, the variance of the parameter $p_{i}$ is given by $Σ^{p}_{ii}$ and the covariance between parameters $p_{i}$ and $p_{j}$ is given by $Σ^{p}_{ij}$. Standard deviation is the square root of variance. These error estimates reflect only random errors in the measurements. The true uncertainty in the parameters is larger due to the presence of systematic errors, which, by definition, cannot be quantified.

Note that even though the observations may be uncorrelated, the parameters are always correlated.
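
For the unit-weight case, the parameter variances, standard deviations and correlation coefficients can be computed from the Jacobian and the residuals at the minimum. The sketch below uses an invented two-parameter Jacobian and residual vector; the function name is hypothetical.

```python
import numpy as np

def parameter_statistics(jac, resid):
    """Unit weights: sigma^2 = U/(n_d - n_p) and Sigma_p = sigma^2 (J^T J)^-1."""
    n_d, n_p = jac.shape
    u = float(resid @ resid)
    sigma2 = u / (n_d - n_p)
    cov = sigma2 * np.linalg.inv(jac.T @ jac)
    sd = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sd, sd)        # correlation coefficients
    return cov, sd, corr

# Hypothetical Jacobian (4 data points, 2 parameters) and residuals at the minimum.
jac = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
resid = np.array([0.1, -0.1, -0.1, 0.1])
cov, sd, corr = parameter_statistics(jac, resid)
```

Even in this small example the off-diagonal correlation coefficient is far from zero, which illustrates the point made in the text.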

Derived constants
When cumulative constants have been refined it is often useful to derive stepwise constants from them. The general procedure is to write down the defining expressions for all the constants involved and then to equate concentrations. For example, suppose that one wishes to derive the pKa for removing one proton from a tribasic acid, LH3, such as citric acid. The stepwise association constant for formation of LH3 is given by
 * $$[\mathrm{LH_3}]=K[\mathrm{LH_2^-}][\mathrm{H^+}]$$

Substitute the expressions for the concentrations of LH3 and LH2− into this equation
 * $$\beta_{13}[\mathrm{L^{3-}}][\mathrm{H^+}]^3=K\beta_{12}[\mathrm{L^{3-}}][\mathrm{H^+}]^2$$

whence
 * $$\beta_{13}=K\beta_{12}; K=\frac{\beta_{13}}{\beta_{12}} \,$$

and since $\mathrm{p}K_{a} = −\log_{10}(1⁄K)$ its value is given by
 * $$\mathrm{p}K_{a1}=\log_{10}\beta_{13}-\log_{10}\beta_{12}$$

Note the reverse numbering for pK and log β. When calculating the error on the stepwise constant, the fact that the cumulative constants are correlated must be accounted for. By error propagation
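 * $$\left(\frac{\sigma_K}{K}\right)^2=\left(\frac{\sigma_{\beta_{12}}}{\beta_{12}}\right)^2+\left(\frac{\sigma_{\beta_{13}}}{\beta_{13}}\right)^2-2\,\frac{\sigma_{\beta_{12}}}{\beta_{12}}\,\frac{\sigma_{\beta_{13}}}{\beta_{13}}\,\rho_{12,13}$$

and
 * $$\sigma_{\log_{10} K}=\frac{\sigma_K}{K\ln 10}$$
where $ρ_{12,13}$ is the correlation coefficient between $β_{12}$ and $β_{13}$.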
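
Since refinement programs usually report $\log_{10} β$ values with their standard deviations, it is often convenient to propagate the errors directly on the logarithmic scale, where the stepwise constant is a simple difference. The following sketch does this; all numerical values, including the correlation coefficient, are hypothetical.

```python
import math

def stepwise_log_k(log_b12, log_b13, sd_log_b12, sd_log_b13, rho):
    """log10 K = log10 beta13 - log10 beta12, with correlated-error propagation."""
    log_k = log_b13 - log_b12
    variance = (sd_log_b12**2 + sd_log_b13**2
                - 2.0 * rho * sd_log_b12 * sd_log_b13)
    return log_k, math.sqrt(variance)

# Hypothetical refined values with a strong positive correlation.
pk, sd_pk = stepwise_log_k(10.0, 13.0, 0.02, 0.03, rho=0.9)

# For comparison: the (incorrect) estimate obtained by ignoring the correlation.
_, sd_uncorrelated = stepwise_log_k(10.0, 13.0, 0.02, 0.03, rho=0.0)
```

With positively correlated cumulative constants the error on the stepwise constant is smaller than the uncorrelated estimate would suggest.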

Model selection
Once a refinement has been completed the results should be checked to verify that the chosen model is acceptable. Generally speaking, a model is acceptable when the data are fitted within experimental error, but there is no single criterion to use to make the judgement. The following should be considered.

The objective function
When the weights have been correctly derived from estimates of experimental error, the expectation value of $U⁄(n_{d} − n_{p})$ is 1. It is therefore very useful to estimate experimental errors and derive some reasonable weights from them as this is an absolute indicator of the goodness of fit.

When unit weights are used, it is implied that all observations have the same variance. $U⁄(n_{d} − n_{p})$ is expected to be equal to that variance.

Parameter errors
One would want the errors on the stability constants to be roughly commensurate with experimental error. For example, with pH titration data, if pH is measured to 2 decimal places, the errors of $log_{10} β$ should not be much larger than 0.01. In exploratory work where the nature of the species present is not known in advance, several different chemical models may be tested and compared. There will be models where the uncertainties in the best estimate of an equilibrium constant may be somewhat or even significantly larger than $σ_{pH}$, especially with those constants governing the formation of comparatively minor species, but the decision as to how large is acceptable remains subjective. The decision process as to whether or not to include comparatively uncertain equilibria in a model, and for the comparison of competing models in general, can be made objective and has been outlined by Hamilton.

Distribution of residuals
At the minimum in $U$ the system can be approximated to a linear one. The residuals in the case of unit weights are related to the observations by
 * $$r=y^{obs}-J\left(J^{T}J\right)^{-1}J^{T}y^{obs}$$

The symmetric, idempotent matrix $J(J^{T}J)^{−1}J^{T}$ is known in the statistics literature as the hat matrix, $H$. Thus,
 * $$r=\left(I-H\right)y^{obs}$$

and
 * $$M^{r}=\left(I-H\right)M^{y}\left(I-H\right)$$

where $I$ is an identity matrix and $M^{r}$ and $M^{y}$ are the variance-covariance matrices of the residuals and observations, respectively. This shows that even though the observations may be uncorrelated, the residuals are always correlated.
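
The properties of the hat matrix can be checked numerically. The sketch below builds $H$ from an invented Jacobian, assumes uncorrelated observations of equal variance (the value of `sigma2` is hypothetical), and forms the variance-covariance matrix of the residuals.

```python
import numpy as np

# Hypothetical Jacobian: 4 data points, 2 parameters.
jac = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])

hat = jac @ np.linalg.inv(jac.T @ jac) @ jac.T
identity = np.eye(jac.shape[0])

sigma2 = 0.02                                   # assumed observation variance
m_y = sigma2 * identity                         # uncorrelated observations
m_r = (identity - hat) @ m_y @ (identity - hat)

is_symmetric = np.allclose(hat, hat.T)
is_idempotent = np.allclose(hat @ hat, hat)
```

Although `m_y` is diagonal, `m_r` has non-zero off-diagonal elements, so the residuals are correlated even when the observations are not.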

The diagram at the right shows the result of a refinement of the stability constants of Ni(Gly)+, Ni(Gly)2 and Ni(Gly)3− (where GlyH = glycine). The observed values are shown as blue diamonds and the species concentrations, as a percentage of the total nickel, are superimposed. The residuals are shown in the lower box. The residuals are not distributed as randomly as would be expected. This is due to the variation of liquid junction potentials and other effects at the glass/liquid interfaces. Those effects are very slow compared to the rate at which equilibrium is established.

Physical constraints
Some physical constraints are usually incorporated in the calculations. For example, all the concentrations of free reactants and species must have positive values and association constants must have positive values.

With spectrophotometric data the calculated molar absorptivity (or emissivity) values should all be positive. Most computer programs do not impose this constraint on the calculations.

Chemical constraints
When determining the stability constants of metal-ligand complexes, it is common practice to fix ligand protonation constants at values that have been determined using data obtained from metal-free solutions. Hydrolysis constants of metal ions are usually fixed at values which were obtained using ligand-free solutions. When determining the stability constants for ternary complexes, $M_{p}A_{q}B_{r}$, it is common practice to fix the values for the corresponding binary complexes, $M_{p′}A_{q′}$ and $M_{p′′}B_{q′′}$, at values which have been determined in separate experiments. Use of such constraints reduces the number of parameters to be determined, but may result in the calculated errors on refined stability constant values being under-estimated.

Other models
If the model is not acceptable, a variety of other models should be examined to find one that best fits the experimental data, within experimental error. The main difficulty is with the so-called minor species. These are species whose concentration is so low that the effect on the measured quantity is at or below the level of error in the experimental measurement. The constant for a minor species may prove impossible to determine if there is no means to increase the concentration of the species.

Implementations
Some simple systems are amenable to spreadsheet calculations.

A large number of general-purpose computer programs for equilibrium constant calculation have been published. See for a bibliography. The most frequently used programs are:
 * Potentiometric data: Hyperquad, BEST, PSEQUAD, ReactLab pH PRO
 * Spectrophotometric data: HypSpec, SQUAD, Specfit, ReactLab EQUILIBRIA
 * NMR data: HypNMR, EQNMR
 * Calorimetric data: HypΔH, Affinimeter

Commercial isothermal titration calorimeters are usually supplied with software with which an equilibrium constant and a standard enthalpy for the formation of a 1:1 adduct can be obtained. Some software for handling more complex equilibria may also be supplied.