Phi value analysis

Phi value analysis, $$ \phi $$ analysis, or $$\phi$$-value analysis is an experimental protein engineering technique for studying the structure of the folding transition state of small protein domains that fold in a two-state manner. The structure of the folding transition state is hard to determine using methods such as protein NMR or X-ray crystallography because folding transition states are mobile and partly unstructured by definition. In $$ \phi $$-value analysis, the folding kinetics and conformational folding stability of the wild-type protein are compared with those of point mutants to find phi values. These measure the mutated residue's energetic contribution to the folding transition state by accounting for the relative free energies of the unfolded state, the folded state, and the transition state for the wild-type and mutant proteins. This reveals the degree of native structure around the mutated residue in the transition state.

The protein's residues are mutated one by one to identify residue clusters that are well-ordered in the folding transition state. These residues' interactions can be checked by ''double-mutant-cycle $$ \phi $$ analysis'', in which the effects of the single-site mutants are compared with those of the double mutant. Most mutations are conservative and replace the original residue with a smaller one (cavity-creating mutations) such as alanine, though tyrosine-to-phenylalanine, isoleucine-to-valine and threonine-to-serine mutants can be used too. Chymotrypsin inhibitor, SH3 domains, WW domains, individual domains of proteins L and G, ubiquitin, and barnase have all been studied by $$ \phi $$ analysis.
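The double-mutant-cycle comparison can be sketched with a coupling energy: the destabilization caused by the double mutant minus the sum of the destabilizations caused by the two single mutants. The sketch below is only an illustration under that common definition; the function name `coupling_energy` and the numbers are hypothetical, and all energies are taken relative to the wild type in kJ/mol.

```python
def coupling_energy(ddg_a, ddg_b, ddg_ab):
    """Coupling energy of residues A and B in a double-mutant cycle.

    ddg_a, ddg_b: destabilization caused by each single mutant (kJ/mol).
    ddg_ab: destabilization caused by the double mutant (kJ/mol).
    A result of zero means the two mutations act additively, which is
    read here as the residues not interacting (an assumption of the sketch).
    """
    return ddg_ab - (ddg_a + ddg_b)


# Hypothetical values: the double mutant costs less than the two single
# mutants combined, so the residues are taken to be coupled.
print(coupling_energy(4.0, 3.0, 5.0))  # -2.0
```

Evaluating the same quantity for the transition state and for the folded state, and comparing the two, is what lets the cycle report on whether an interaction is already formed in the transition state.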

Mathematical approach
Phi is defined thus:

$$ \phi = \frac{(\Delta G^{TS \rightarrow D}_{W} - \Delta G^{TS \rightarrow D}_{M})}{(\Delta G^{N \rightarrow D}_{W} - \Delta G^{N \rightarrow D}_{M})} = \frac{\Delta\Delta G^{TS \rightarrow D}}{\Delta\Delta G^{N \rightarrow D}} $$

$$\Delta G^{TS \rightarrow D}_{W}$$ is the difference in energy between the wild-type protein's transition and denatured states, $$\Delta G^{TS \rightarrow D}_{M}$$ is the same energy difference for the mutant protein, and the $$\Delta G^{N \rightarrow D}$$ terms are the corresponding differences in energy between the native and denatured states. The phi value is interpreted as how much the mutation destabilizes the transition state relative to how much it destabilizes the folded state.
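As a minimal sketch, the definition can be applied directly to four free energies. The function names, the example numbers, and the sign convention (each $$\Delta G^{X \rightarrow D}$$ taken as the free energy of D minus that of X) are assumptions made for illustration. The second helper further assumes transition state theory with the same pre-exponential factor for wild type and mutant, so that the numerator can be estimated from folding rate constants.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def phi_value(dg_ts_d_wt, dg_ts_d_mut, dg_n_d_wt, dg_n_d_mut):
    """Apply the defining ratio; all inputs share one energy unit."""
    ddg_ts_d = dg_ts_d_wt - dg_ts_d_mut  # numerator of the definition
    ddg_n_d = dg_n_d_wt - dg_n_d_mut     # denominator of the definition
    if ddg_n_d == 0:
        raise ValueError("mutation does not change native-state stability")
    return ddg_ts_d / ddg_n_d


def ddg_ts_d_from_rates(kf_wt, kf_mut, temperature_k=298.15):
    """Estimate the numerator (kJ/mol) from folding rate constants.

    Assumption: transition state theory with an identical prefactor for
    wild type and mutant, so the prefactor cancels in the ratio of rates.
    """
    return R_KJ_PER_MOL_K * temperature_k * math.log(kf_wt / kf_mut)


# Hypothetical free energies in kJ/mol: the mutation destabilizes the
# transition state by 5 and the native state by 10, relative to D.
print(phi_value(-60.0, -65.0, 40.0, 30.0))  # 0.5
```

With these made-up numbers the transition state feels half of the destabilization felt by the folded state, which is the situation the ratio is designed to quantify.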

Though $$ \phi $$ may have been meant to range from zero to one, negative values can appear. A value of zero suggests the mutation doesn't affect the structure of the folding pathway's rate-limiting transition state, and a value of one suggests the mutation destabilizes the transition state as much as the folded state. Values near zero therefore suggest the area around the mutation is relatively unfolded or unstructured in the transition state, and values near one suggest the transition state's local structure near the mutation site is similar to the native state's. Conservative substitutions on the protein's surface often give phi values near one. When $$ \phi $$ lies well between zero and one, it is less informative because it cannot distinguish between two cases:
 * 1) The transition state itself is partly structured; or
 * 2) There are two protein populations of near-equal size, one mostly unfolded and the other mostly folded.
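The ambiguity of intermediate values can be illustrated with a toy model. The sketch below assumes, purely for illustration, that the observed $$ \phi $$ is the population-weighted average over transition-state populations; the tolerance `tol`, the labels, and the function names are hypothetical and not part of the method.

```python
def interpret_phi(phi, tol=0.2):
    """Map a phi value to a rough reading (tol is an arbitrary cutoff)."""
    if phi < -tol or phi > 1 + tol:
        return "outside the zero-one range"
    if abs(phi) <= tol:
        return "unstructured near the mutation site"
    if abs(phi - 1) <= tol:
        return "native-like near the mutation site"
    return "ambiguous"


def mixture_phi(fractions, phis):
    """Toy model: observed phi as a population-weighted average."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("population fractions must sum to one")
    return sum(f * p for f, p in zip(fractions, phis))


# Case 1: a single, partly structured transition state.
single = 0.5
# Case 2: two equal populations, one unstructured and one native-like.
mixed = mixture_phi([0.5, 0.5], [0.0, 1.0])

print(single == mixed)        # True
print(interpret_phi(mixed))   # ambiguous
```

Both cases give the same number, so the value alone cannot tell them apart.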

Key assumptions
 * 1) Phi value analysis assumes Hammond's postulate, which states that energy and chemical structure are correlated. Though the relationship between the folding intermediate and native state's structures may correlate with that between their energies when the energy landscape has a well-defined, deep global minimum, free energy destabilizations may not give useful structural information when the energy landscape is flatter or has many local minima.
 * 2) Phi value analysis assumes the folding pathway isn't significantly altered, though the folding energies may be. As nonconservative mutations may not bear this out, conservative substitutions are preferred, even though they may give smaller energetic destabilizations that are harder to detect.
 * 3) Restricting $$ \phi $$ to numbers greater than zero is the same as assuming the mutation increases the stability and lowers the energy of neither the native nor the transition state. Likewise, it is assumed that interactions that stabilize a folding transition state are like those of the native structure, though some protein folding studies found that stabilizing non-native interactions in a transition state facilitate folding.

Example: barnase
Alan Fersht pioneered phi value analysis in his study of the small bacterial protein barnase. Using molecular dynamics simulations, he found that the transition state between folding and unfolding looks like the native state and is the same no matter the reaction direction. Phi varied with the mutation location: some regions gave values near zero and others gave values near one. The distribution of $$ \phi $$ values throughout the protein's sequence agreed with the simulated transition state except for one helix, which folded semi-independently and made native-like contacts with the rest of the protein only once the transition state had formed fully. Such variation in the folding rate within one protein makes $$ \phi $$ values hard to interpret, because the transition state structure must otherwise be compared with folding-unfolding simulations, which are computationally expensive.

Variants
Other 'kinetic perturbation' techniques for studying the folding transition state have appeared recently. The best known is the psi ($$\psi$$) value, which is found by engineering two metal-binding amino acid residues such as histidine into a protein and then recording the folding kinetics as a function of metal ion concentration, though Fersht thought this approach difficult. A 'cross-linking' variant of the $$\phi$$-value was used to study segment association in a folding transition state by introducing covalent crosslinks such as disulfide bonds. $$\phi$$-T value analysis has been used as an extension of $$\phi$$-value analysis: it measures the response of mutants as a function of temperature to separate the enthalpic and entropic contributions to the transition state free energy.

Limitations
The error in equilibrium stability and aqueous (un)folding rate measurements may be large when values of $$\phi$$ measured in solutions with denaturants must be extrapolated to nearly pure aqueous solutions, or when the stability difference between the wild-type and mutant protein is 'low', that is, less than 7 kJ/mol. This may cause $$\phi$$ to fall outside the zero-one range. Calculated values of $$\phi$$ depend strongly on how many data points are available. A study of 78 mutants of WW domain with up to four mutations per residue has quantified what types of mutations avoid interference from native state flexibility, solvation, and other effects, and statistical analysis shows that reliable information about transition state perturbation can be obtained from large mutant screens.
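Why a small stability difference inflates the uncertainty can be seen from first-order error propagation through the ratio that defines $$\phi$$. The sketch below assumes independent measurement errors on the numerator and denominator, which may not hold in practice; the function name and all numbers are hypothetical.

```python
import math


def phi_with_error(ddg_ts_d, ddg_n_d, sigma_ts_d, sigma_n_d):
    """Return (phi, sigma_phi) by first-order error propagation.

    Assumption: the errors on the two free energy differences are
    independent. All inputs share one energy unit (for example kJ/mol).
    """
    phi = ddg_ts_d / ddg_n_d
    sigma_phi = math.sqrt(
        (sigma_ts_d / ddg_n_d) ** 2
        + (ddg_ts_d * sigma_n_d / ddg_n_d ** 2) ** 2
    )
    return phi, sigma_phi


# Same phi and same 1 kJ/mol measurement error, but different
# stability differences between wild type and mutant.
phi_large, sigma_large = phi_with_error(10.0, 20.0, 1.0, 1.0)
phi_small, sigma_small = phi_with_error(1.0, 2.0, 1.0, 1.0)

print(f"large stability difference: {phi_large:.2f} +/- {sigma_large:.2f}")
print(f"small stability difference: {phi_small:.2f} +/- {sigma_small:.2f}")
```

In this toy comparison the uncertainty scales inversely with the stability difference, so a mutant that barely changes stability can easily yield a value outside the zero-one range from measurement error alone.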