Recognition sequence

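A recognition sequence is a DNA sequence to which a structural motif of a DNA-binding domain exhibits binding specificity. Many recognition sequences, notably those of restriction endonucleases such as PstI, are palindromic, meaning that the sequence reads the same 5' to 3' on both strands; others, such as the Sp1 sequences below, are not.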

The transcription factor Sp1, for example, binds the sequences 5'-(G/T)GGGCGG(G/A)(G/A)(C/T)-3', where (G/T) indicates that the domain will bind either a guanine or a thymine at that position.
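The degenerate notation can be expanded mechanically into the concrete sequences it stands for. The following is a minimal sketch, not a standard tool: the names SP1_POSITIONS and expand are hypothetical, and the pattern is simply the one quoted above written as one entry per position.

```python
from itertools import product

# Sp1 pattern from the text, one entry per position. A multi-letter entry
# lists the bases accepted at that position, e.g. "GT" stands for (G/T).
SP1_POSITIONS = ["GT", "G", "G", "G", "C", "G", "G", "GA", "GA", "CT"]

def expand(positions):
    """Return every concrete sequence described by the degenerate pattern."""
    return ["".join(bases) for bases in product(*positions)]

sp1_sequences = expand(SP1_POSITIONS)
print(len(sp1_sequences))  # four two-way positions: 2 * 2 * 2 * 2 = 16
```

Each of the four ambiguous positions doubles the number of sequences, which is why a single pattern covers 16 distinct recognition sequences.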

The restriction endonuclease PstI recognizes, binds, and cleaves the sequence 5'-CTGCAG-3'.
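The PstI sequence is palindromic in the molecular-biology sense: it equals its own reverse complement, so both strands read 5'-CTGCAG-3'. A small sketch of that check, with the hypothetical helper names reverse_complement and is_palindromic, might look like this:

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def is_palindromic(seq):
    """True if the sequence reads the same 5' to 3' on both strands."""
    return seq == reverse_complement(seq)

print(reverse_complement("CTGCAG"))  # CTGCAG
print(is_palindromic("CTGCAG"))      # True
```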

A recognition sequence is distinct from a recognition site. A recognition sequence is a specific sequence of bases, usually very short (typically about 10 bases or fewer), and it can occur once, several times, or not at all on a given DNA fragment. A recognition site is one particular occurrence of such a sequence, specified by its position on the fragment.

Depending on its degree of specificity, a DNA-binding protein can bind more than one recognition sequence. PstI has a single sequence specificity, 5'-CTGCAG-3', so the recognition sequence is the same at every PstI site. Sp1 accepts 16 different sequences (two alternatives at each of four positions, as shown above), so the recognition sequence can differ from site to site.

For example, the following DNA sequence fragment contains two PstI recognition sites, starting at bases 9 and 31, both with the recognition sequence 5'-CTGCAG-3'. It also contains two Sp1 recognition sites, starting at bases 17 and 41, with the recognition sequences 5'-GGGGCGGAGC-3' and 5'-TGGGCGGAAC-3' respectively.

5'-AACGTTAGCTGCAGTCGGGGCGGAGCTAGGCTGCAGGAATTGGGCGGAACCT-3'
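
The distinction between sequence and site can be made concrete by scanning the fragment above for each pattern and reporting every match together with its position. This is a minimal sketch using regular expressions from the Python standard library; the names FRAGMENT, PATTERNS and find_sites are hypothetical, positions are counted from 1 at the 5' end, and only the strand shown is searched.

```python
import re

# Example fragment from the text, written 5' to 3'.
FRAGMENT = "AACGTTAGCTGCAGTCGGGGCGGAGCTAGGCTGCAGGAATTGGGCGGAACCT"

# Recognition patterns as regular expressions; [GT] corresponds to (G/T).
PATTERNS = {
    "PstI": "CTGCAG",
    "Sp1": "[GT]GGGCGG[GA][GA][CT]",
}

def find_sites(pattern, dna):
    """Return (1-based start position, matched sequence) for each site."""
    # Wrapping the pattern in a lookahead also reports overlapping matches.
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(f"(?=({pattern}))", dna)]

for name, pattern in PATTERNS.items():
    print(name, find_sites(pattern, FRAGMENT))
```

Each returned pair is one recognition site: the position identifies the site, and the matched string is the recognition sequence found there. For PstI the string is identical at every site, whereas for Sp1 it can vary.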