User:Kinkreet/Protein Science/Protein Ligand Interactions

Protein interactions are central to protein function. Protein-nucleotide interactions underlie nucleic acid binding, for example to stabilize mRNA and control translation. Protein-protein interactions are required for some oligomerization, such as in the capsid of foot-and-mouth disease virus (FMDV). Protein-lipid interactions are required for the correct folding of membrane proteins, but lipids can also bind to proteins as ligands, such as when non-esterified fatty acids bind to human serum albumin (HSA). HSA also binds hydrophobic drugs, which includes most drugs, as their hydrophobicity allows for better transport, delivery and integration.

In studying protein-ligand interactions, we must elucidate the binding partner, the mechanism of interaction (van der Waals, hydrogen bonding, ionic interactions, salt bridges, hydrophobic interactions, π-stacking interactions etc.) and the factors affecting binding (ligand/protein concentration, temperature, binding affinity, specificity, complementarity of binding surfaces, pH, salt concentration, and the state of the ligand/protein, such as post-translational modification).

Hydrogen bonds
The strength of an isolated hydrogen bond is ≈ 5-6 kcal/mol, whereas hydrogen bonds that are part of a protein in solution contribute only ≈ 0.5-1.5 kcal/mol.
 * Effective distance ≈ 2.6-3.6Å (typically 2.7-3.1Å)
 * 2 electronegative atoms competing for the electron density of one hydrogen atom
 * Hydrogens are often not seen in structure determinations because the electron density around a hydrogen atom is too low, so any signal is weak
 * Directional
 * The Gibbs free energy of the interaction depends on the solvent, which therefore affects how favourable the interaction is.
 * When dissolved in water, every hydrogen bond that is broken is 'reformed' using the water molecules, so the overall enthalpy change of solution is around zero.
 * The entropic factor is usually unfavourable, and is offset by the enthalpic gain

Ionic interactions (inc. Salt bridges)

 * Strength: ΔH ≈ 8-20 kJ/mol
 * Especially with the side chains of lysine, arginine, histidine, aspartic acid and glutamic acid; and the α-amino and α-carboxyl groups
 * The strength depends on the pH and the pKa(s) of the side chains
 * Stronger if away from solvent, which will interact with the charges and weaken the ionic interaction
 * Can associate with metal ions, salts, charged amino acids/groups etc.
 * Typical distance of interaction is about 2.4Å
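
The pH dependence noted in the list above can be illustrated with the Henderson-Hasselbalch relation, which gives the fraction of an ionizable group that is protonated at a given pH. This is a minimal sketch rather than part of the original notes: the function name `fraction_protonated` is hypothetical, and the pKa values are assumed, illustrative numbers (real side-chain pKa values shift with the local protein environment).

```python
def fraction_protonated(pH, pKa):
    """Fraction of an ionizable group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

# Assumed, illustrative pKa values: ~4 for a carboxylate-type side chain,
# ~10.5 for a lysine-type amine.
acid_charged = 1.0 - fraction_protonated(7.0, 4.0)  # deprotonated acid carries the negative charge
base_charged = fraction_protonated(7.0, 10.5)       # protonated amine carries the positive charge
print(acid_charged, base_charged)
```

With these assumed values both groups are almost fully charged at pH 7, so a salt bridge between them is possible; moving the pH towards either pKa reduces the charged fraction and would be expected to weaken the interaction.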

Van der Waals interactions

 * Strength: ΔH ≤ 8-20 kJ/mol
 * Between electrically-neutral atoms
 * Effective distance is very short
 * Depends on the fluctuation of electron density in the orbitals of one atom, which will affect adjacent atoms

π-stacking interactions

 * Strength: ΔH ≈ 4-8 kJ/mol
 * The π-electron density of an aromatic moiety lies above and below the plane of the ring, so the edges of the ring are partially positively charged while the faces above and below the ring are partially negatively charged. The interaction is favourable whenever the positive region of one ring is in close proximity to the negative region of another; this can happen when the rings interact face-to-face (but offset, so that the positive areas are superimposed on the negative, electron-dense areas) or edge-to-face.
 * Commonly found in nucleic acid-binding proteins
 * Generally not specific, as any aromatic ring can stack
 * Prominent in the RNA-binding domain of the human Puf protein, Pumilio1, where each of the 8 RNA bases interacts with three different types of aromatic amino acid side chains of the Pumilio1 domain. The binding domain is specific for a certain sequence of bases; the specificity arises not from the π-stacking interactions, but from specific hydrogen bonds.

Hydrophobic interactions
Hydrophobic interactions differ from the electrostatic interactions because they do not depend on charges between two atoms, but on the mass effect of the electrostatic interactions of many molecules, chiefly the solvent molecules.

Hydrophobic interactions are entropically driven, meaning that the entropic component dominates the enthalpic component. When a hydrophobic entity (one with roughly evenly distributed electron density) is placed in a polar solvent, the solvent molecules cannot interact strongly with it, because it has no polar regions for the charges of the solvent to associate with. Therefore, the solvent molecules interact attractively with each other (forming an ordered, cage-like structure), and this attraction pushes the hydrophobic entities away until they aggregate together. Overall, an exposed hydrophobic entity reduces the number of stabilizing hydrogen bonds that can be formed, and thus is enthalpically unfavourable; it also makes the water molecules more ordered, and so is entropically unfavourable as well. Because the solvent only forms ordered structures around hydrophobic surfaces, the most favourable situation is the one in which the exposed hydrophobic surface is minimal, which occurs when the hydrophobic surfaces aggregate together. The hydrophobic interaction is therefore not caused by high-affinity interactions between the hydrophobic molecules, but by the attraction of the polar solvent molecules for each other.

Elucidation of protein-ligand binding
In determining the nature of any protein-ligand binding interaction, we first begin with a model or hypothesis, often using the structure of a functionally similar protein (such as myoglobin for haemoglobin). Next, the binding affinity is measured quantitatively under different conditions to determine the nature of binding (single-site, or multiple-site, which may be independent or co-operative).

Single-site binding
$$M+L \rightleftharpoons ML$$

where $$[M]$$ is the concentration of free protein, $$[L]$$ is the concentration of free ligand, and $$[ML]$$ is the concentration of the protein-ligand complex. The equation can be broken down into two components (association and dissociation):

$$M+L \rightarrow ML$$ (Association)

$$M+L \leftarrow ML$$ (Dissociation)

Provided that the binding sites are not saturated, the rate of association depends largely on the concentrations of the ligand and protein; and provided the solution is not saturated with free protein and ligand, the rate of dissociation depends largely on the concentration of the complex. This can be formulated as:

$$\text{Rate of association} = k_{on}[M][L]$$

$$\text{Rate of dissociation} = k_{off}[ML]$$
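
At equilibrium the two rates are equal, which links the rate constants to the dissociation constant, $$K_D$$, used later in these notes. The following is a short worked step under that assumption; $$Y$$ here denotes the fraction of protein with ligand bound, a symbol chosen to match the Scatchard section.

$$k_{on}[M][L] = k_{off}[ML] \quad\Rightarrow\quad K_D = \frac{1}{K_A} = \frac{k_{off}}{k_{on}} = \frac{[M][L]}{[ML]}$$

$$Y = \frac{[ML]}{[M]+[ML]} = \frac{[L]}{K_D+[L]}$$

The second line follows by substituting $$[ML]=[M][L]/K_D$$ and cancelling $$[M]$$.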

Overview
Adsorption and desorption reach an equilibrium described by the Langmuir isotherm, which relates adsorption to gas pressure (for gases) or concentration (for solutions) at a fixed temperature. It assumes that all sites are equivalent, that adsorption forms a monolayer, and that each binding event is independent. However, because the Langmuir isotherm was originally developed for the adsorption of gases onto a solid surface, it shows a poorer fit when applied to solution systems. Some solutes have a higher affinity for surfaces onto which water molecules are already adsorbed. Sohn and Kim proposed a modification to the isotherm by introducing a concentration-dependent factor, which assumes that the concentration of the solute affects adsorption and desorption.

Derivation
At equilibrium, a protein, X, adsorbs onto and desorbs from the adsorbent, M, according to:

$$X + M \rightleftharpoons XM \qquad (1)$$

The association constant, $$K_A$$, is defined as:

$$K_A = \frac{[XM]}{[X][M]} \qquad (2)$$

$$\theta$$ can be defined as the ratio between the occupied and total number of adsorption sites. As the concentration of adsorbed protein is proportional to the number of occupied sites,

$$[XM] \propto \theta \qquad (3)$$

The fraction of free adsorption sites is equal to $$1-\theta$$:

$$[M] \propto 1-\theta \qquad (4)$$

Combining (2), (3) and (4), we get:

$$K_A \propto \frac{\theta}{(1-\theta)[X]} \qquad (5)$$

To make this into an equality, a proportionality constant, $$\tau$$, is introduced:

$$K_A = \frac{\theta}{(1-\theta)\tau[X]} \qquad (6)$$

(6) can be expanded and rearranged to give the general form of the Langmuir isotherm:

$$\theta = \frac{\tau[X]}{K_D + \tau[X]} \qquad (7)$$

where $$K_D=\frac{1}{K_A}$$.

Because the concentration of free protein, $$[X]$$, is difficult to measure, we can make two small assumptions in order to exchange $$[X]$$ for a known term. We also do not want to know the fraction of the surface bound, given by $$\theta$$, but the fraction of protein bound, $$\alpha$$.

$$\theta$$, by definition:

$$\theta = \frac{q}{\Gamma} \qquad (8)$$

where $$q$$ is the concentration of binding sites occupied, and may be different from $$[XM]$$ if there are multiple binding sites on the same protein. $$\Gamma$$ is the total concentration of binding sites.

The first assumption is that the proportionality constant, $$\tau$$, has a negligible effect, so that $$\tau [X] = [X]$$. $$[X]$$ can then be defined as $$[X]=P_T-q$$, where $$P_T$$ is the total concentration of protein (Free = Total - Bound).

Using this assumption and combining (7) and (8), we get:

$$\frac{q}{\Gamma} = \frac{P_T - q}{K_D + P_T - q} \qquad (9)$$

By definition:

$$\alpha = \frac{q}{P_T} \quad \text{or} \quad q = \alpha P_T \qquad (10)$$

Combining (9) and (10):

$$\frac{\alpha P_T}{\Gamma} = \frac{P_T - \alpha P_T}{K_D + P_T - \alpha P_T} \qquad (11)$$

After expansion and elimination of one $$P_T$$ term, we are left with:

$$\alpha \left( K_D + P_T - \alpha P_T \right) = \Gamma (1-\alpha) \qquad (12)$$

The second assumption is that the concentration of binding sites far exceeds the total concentration of protein, i.e. $$\Gamma \ggg P_T$$, so that there is a large excess of binding capacity. Terms with $$P_T$$ can then be omitted as insignificant, and (12) simplifies to:

$$\alpha = \frac{\Gamma}{K_D + \Gamma} \qquad (13)$$
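
The effect of the excess-binding-site assumption can be checked numerically. The sketch below is an illustration, not part of the original derivation: it assumes single-site equilibrium written as K_D = (P_T - q)(Γ - q)/q, solves the resulting quadratic for q, and compares the fraction of protein bound with the limiting form Γ/(K_D + Γ). The function names and the concentrations are hypothetical.

```python
import math

def fraction_bound_exact(p_total, gamma, k_d):
    """Fraction of protein bound, from K_D = (P_T - q)(Gamma - q) / q solved for q."""
    b = k_d + p_total + gamma
    disc = b * b - 4.0 * gamma * p_total
    # Numerically stable form of the smaller root of q^2 - b*q + Gamma*P_T = 0
    q = 2.0 * gamma * p_total / (b + math.sqrt(disc))
    return q / p_total

def fraction_bound_excess_sites(gamma, k_d):
    """Limiting form when binding sites are in large excess (Gamma >> P_T)."""
    return gamma / (k_d + gamma)

# Hypothetical concentrations in molar units
excess = fraction_bound_exact(p_total=1e-9, gamma=1e-6, k_d=1e-6)
comparable = fraction_bound_exact(p_total=1e-6, gamma=1e-6, k_d=1e-6)
limit = fraction_bound_excess_sites(gamma=1e-6, k_d=1e-6)
print(excess, comparable, limit)
```

With a thousand-fold excess of sites the two expressions agree closely, whereas with comparable protein and site concentrations the simplified form overestimates the fraction bound.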

Scatchard Plot
A plot of the Langmuir isotherm equation gives a hyperbolic curve, from which it is often hard to read off parameters. A Scatchard plot can be used instead to give a straight line.

First, the Langmuir isotherm, written for the fractional saturation $$Y$$ at free ligand concentration $$[L]$$ as $$Y=\frac{[L]}{K_D+[L]}$$, is rearranged to: $$\frac{Y}{[L]}=-\frac{Y}{K_D}+\frac{1}{K_D}$$

When Y/[L] is plotted against Y, we get a straight line. The gradient represents -1/KD, the y-intercept represents 1/KD and the x-intercept represents the number of binding sites (1 when the fractional saturation Y is plotted).
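
As a minimal sketch of how the Scatchard parameters might be extracted, the following fits a straight line to Y/[L] against Y by ordinary least squares. The data are synthetic and noise-free, generated from an assumed KD of 2.0 (arbitrary concentration units); the function name `scatchard_fit` is hypothetical.

```python
def scatchard_fit(ligand_conc, saturation):
    """Least-squares line through the points (Y, Y/[L]); returns (slope, intercept)."""
    xs = list(saturation)
    ys = [y / l for y, l in zip(saturation, ligand_conc)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic single-site data for an assumed K_D of 2.0
k_d_true = 2.0
ligand = [0.5, 1.0, 2.0, 4.0, 8.0]
y_values = [l / (k_d_true + l) for l in ligand]

slope, intercept = scatchard_fit(ligand, y_values)
k_d_est = -1.0 / slope            # gradient is -1/K_D
x_intercept = -intercept / slope  # number of binding sites
print(k_d_est, x_intercept)
```

Real data carry noise in both axes (Y appears on both), so in practice a direct non-linear fit of the isotherm is often preferred to the linearized plot.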

Multiple independent binding sites for the same ligand
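$$\nu=\frac {n [L]}{K_D + [L]}$$

where $$\nu$$ is the average number of ligands bound per protein and $$n$$ is the number of binding sites.

However, the Scatchard plot is not linear when the sites are not equivalent and independent, for example because binding at one site affects binding at the others, or because one binding site is stronger than the others. When binding at one site affects another, the effect can be positive (it enhances binding at the second site) or negative (it reduces binding at the second site).

Positive cooperativity appears on the Scatchard plot as a curve resembling a downward-opening parabola. However, for more quantitative analysis, a Hill plot is used.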

Hill plot
The Hill plot (named after Archibald Hill) makes the assumption that the sites are infinitely positively cooperative, so that once one ligand binds, all the sites are quickly filled. The reaction for a protein with multiple binding sites can therefore be simplified to M + nL → MLn

Thus the macroscopic equilibrium association constant, $$K_n$$, can be written as: $$K_n=\frac{[ML_n]}{[M][L]^n}$$, where $$K_n$$ is equivalent to the product of the individual $$K_A$$ values of all the binding steps. The average number of ligands bound per protein can then be represented by: $$\nu=\frac{nK_n[L]^n}{1+K_n[L]^n}$$

The assumption that Hill makes is obviously not realistic, so the equation can be modified to $$\nu=\frac{nK_{n_h}[L]^{n_h}}{1+K_{n_h}[L]^{n_h}}$$, where $$n_h$$ is known as the Hill coefficient. When $$n_h = 1$$ there is no cooperativity, $$n_h > 1$$ means positive cooperativity, and $$n_h < 1$$ means negative cooperativity. $$n_h = 0$$ denotes infinite negative cooperativity, and $$n_h = n$$ means infinite positive cooperativity, as originally assumed.
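
The Hill coefficient is usually read off as the slope of a plot of log(ν/(n - ν)) against log[L], since the modified Hill equation rearranges to ν/(n - ν) = K[L]^nh. The following sketch checks this on synthetic data; the parameter values (4 sites, Hill coefficient 2.8, K = 1) and the function names are assumptions made for illustration only.

```python
import math

def hill_binding(ligand, n_sites, k_assoc, n_hill):
    """Average number of ligands bound per protein from the modified Hill equation."""
    term = k_assoc * ligand ** n_hill
    return n_sites * term / (1.0 + term)

def hill_plot_point(ligand, nu, n_sites):
    """Return (log10[L], log10(nu / (n - nu))), one point on a Hill plot."""
    return math.log10(ligand), math.log10(nu / (n_sites - nu))

# Assumed illustrative parameters: 4 sites, Hill coefficient 2.8
n_sites, k_assoc, n_hill = 4, 1.0, 2.8
points = [hill_plot_point(l, hill_binding(l, n_sites, k_assoc, n_hill), n_sites)
          for l in (0.5, 2.0)]
(x1, y1), (x2, y2) = points
hill_slope = (y2 - y1) / (x2 - x1)  # slope of the Hill plot recovers n_h
print(hill_slope)
```

For data that follow the Hill equation exactly the slope is constant; for a real cooperative protein the slope is typically measured near half-saturation, where it is steepest.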

When [L] is low, even if binding is cooperative, there will not be enough ligand bound to show an effect, and when [L] is too high, ligands will bind anyway even if there is negative cooperativity.

Haemoglobin and phosphofructokinase-1 are two proteins which show cooperativity.

Oxygen is not very soluble in the bloodstream, and so it must be carried to tissues by red blood cells containing high levels of haemoglobin. Haemoglobin is an α2β2 tetramer; each subunit contains a haem prosthetic group with an iron atom, to which oxygen binds reversibly.

A homolog of haemoglobin is myoglobin, which is found in the muscles and has a higher affinity for oxygen than haemoglobin. This means that as oxygen is carried from the lungs by haemoglobin, it can be taken up by myoglobin in the tissues, thus passing oxygen from the lungs to the tissues.

Co-operativity is explained using the Perutz mechanism. O2 binding alters the structure of the protein. 2,3-bisphosphoglycerate. (Voet 323-355)

Yeast two-hybrid method
A protein with two or more domains (usually a gene activation protein) is separated at the genetic level into two parts; one is fused with a bait protein, while the other is fused with the prey protein. Here, we are testing the binding between bait and prey, using the gene activation protein as an indicator.

The fusion genes encoding the fusion peptides are inserted into plasmids and transformed into the cell. If the bait and the prey bind, then the gene activation protein is reconstituted and can function. We can detect transcription of the reporter gene by observing its gene product, which can confer colour, fluorescence, antibiotic resistance or some other sort of screenable/selectable marker.

A selection of preys can be prepared for one bait, and vice versa; thus the method is scalable if you want to test many potential binding partners against a single molecule.

This method is slow and complicated, and provides little quantification of affinity: all it tells us is whether the two proteins bound or not. It is also not very accurate, as weak interactions will show up as positive. Furthermore, binding does not necessarily mean the re-united protein will function; adding a large protein complex between the domains may alter the dimensions and structure of the protein and make it non-functional. Because of this, Y2H is only useful for small moieties.

Pull-down assays
In pull-down assays, the bait is genetically fused to a tag via a linker region, by recombination. The tag will bind to a column; a common tag is GST (glutathione S-transferase). A cocktail of proteins can then be fed into the column, and any proteins that bind to the bait will be retained in the column. The column can then be eluted and the eluates purified. SDS-PAGE can be carried out to test for purity and also to aid in identifying the protein.

This technique offers little quantification.

Gel shift assays
Gel-shift assays are used to measure protein-nucleic acid interactions; usually the nucleic acid is radiolabelled. Radiolabelling ensures the same chemical properties; an alternative to radiolabelling is end-labelling using kinases. The protein and nucleic acid of interest are run on a native gel of either acrylamide or agarose. The nucleic acids migrate through the gel due to their intrinsic negative charge; any proteins bound to nucleic acids will also migrate by the same principle, but will be slowed down due to the increased mass of the complex.

If the concentration of nucleic acid is kept constant while the protein concentration is increased, we can run each concentration in a different lane to estimate the stoichiometry of binding (given that we know the concentrations of nucleic acid and protein). After all the nucleic acid has been bound, adding extra protein will cause the proteins to aggregate and produce a large smear on the gel. We can therefore use this to estimate the stoichiometry, but the method is not rigorous and should only be used as a rough guide. Parameters such as affinity also cannot be measured absolutely, but the assay does give a good relative indication when compared with other gel shift assays.

Gel shift assays only work for strong interactions; in weaker interactions, the nucleic acid might dissociate and reassociate during the run, giving a smear instead of clear bands.

Equilibrium dialysis
A chamber is split into two compartments by a semi-permeable membrane, which allows small molecules to pass but not larger ones. Protein is added to one compartment; it is unable to diffuse across to the other compartment due to its size. Ligand is added to the other compartment; it is able to diffuse across because it is much smaller. An equilibrium is established between the two compartments. When equilibrium is reached, the ligand concentration in the compartment containing the protein, $$[L_1]$$, is equal to the concentration of free ligand in that compartment plus the concentration of bound ligand. In the other compartment, the ligand concentration, $$[L_2]$$, is equal to the concentration of free ligand only. Because the free ligand concentration should be the same in both compartments, the difference between the two gives the concentration of bound ligand, $$[L_b]$$:

$$[L_1]=\frac{[L_f]}{2} + [L_b]$$

$$[L_2] = \frac{[L_f]}{2}$$

$$[L_b]=[L_1] - [L_2]$$

If we assume that the protein binds only one ligand, then:

$$Y=\frac{[P_b]}{[P_0]}=\frac{[L_1]-[L_2]}{[P_0]}$$

This can be modified to: $$Y=\frac{[L_b]}{n[P_0]}=\frac{[L_1]-[L_2]}{n[P_0]}$$ for n binding sites per protein.

[P0] is known, and [L1] and [L2] can be measured, so the fraction of protein (or of binding sites) occupied can be determined.
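
The calculation above can be sketched as follows. The measurements are hypothetical, and the step from Y to KD additionally assumes independent, equivalent sites obeying Y = [L]/(KD + [L]), with the free ligand concentration taken as the concentration measured in the ligand-only compartment. The function names are illustrative.

```python
def dialysis_saturation(l1, l2, p0, n_sites=1):
    """Fractional saturation Y from the ligand concentration on the protein side (l1)
    and on the ligand-only side (l2), for total protein concentration p0."""
    bound = l1 - l2  # bound ligand is the excess on the protein side
    return bound / (n_sites * p0)

def kd_from_saturation(free_ligand, y):
    """K_D from Y = [L] / (K_D + [L]), assuming independent, equivalent sites."""
    return free_ligand * (1.0 - y) / y

# Hypothetical measurements, all in micromolar
y = dialysis_saturation(l1=15.0, l2=5.0, p0=20.0)
k_d = kd_from_saturation(free_ligand=5.0, y=y)
print(y, k_d)
```

A single measurement gives only one estimate; repeating the experiment over a range of ligand concentrations and fitting all the points would give a more reliable KD.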

Equilibrium dialysis is simple and long-established. It has an advantage over Y2H and gel shift assays because it is an equilibrium measurement and thus can determine KD. The downside is that it is slow, and radiolabelled ligands are often required for quantification; these are not always available and are expensive to make. It is not suitable for hydrophobic molecules, as these may stick to the membrane.

FRET
Fluorescence is the absorption of photons at one wavelength and the emission of photons at another, longer wavelength. Fluorescence may be an intrinsic property of the protein (e.g. if it contains Trp), or a fluorophore may be attached to the protein. The fluorescence of a fluorophore is sensitive to its environment; the fluorophore can dissipate the energy of the excited state after absorption by collision with solvent molecules, an effect known as quenching.

FRET, fluorescence resonance energy transfer, is used to detect whether two fluorophores are close together. The donor fluorophore absorbs photons at one wavelength and would normally emit photons at a longer wavelength; the acceptor fluorophore's absorption spectrum overlaps with the donor fluorophore's emission spectrum. Therefore, if the acceptor is in close proximity (10-60Å) to the donor, much of the donor's excitation energy is transferred to the acceptor (non-radiatively, rather than by emission and re-absorption of a photon), and the acceptor emits photons of even longer wavelength. However, if they are not in close proximity, the donor dissipates its energy by emitting a photon, which is of shorter wavelength than the one that would be emitted by the acceptor. By observing the intensity of light at each wavelength, we can quantify the level of interaction.

FRET is useful as it can examine interactions in living cells. However, false negatives can occur when the two labelled proteins interact but the fluorophores are too far apart to transfer the energy. If the fluorophore is added as a fusion protein, we must also assume that it does not interfere with the interaction of the parent protein.

Isothermal titration calorimetry
ITC is an equilibrium measurement, and so can give quantitative parameters. It uses native proteins and does not require any radioactive or fluorescent labels; the proteins can often be recovered afterwards. However, it requires that the binding event has a measurable enthalpy change (either positive or negative).

When two proteins bind, the enthalpy change means that heat is released or absorbed; this heat change is observed every time ligand is injected into a solution of its binding partner. The heat change is related to the stoichiometry of binding, as well as to the affinity of the interaction.

See handout.

Surface plasmon resonance
See handout