FAM222A

Family with sequence similarity 222 member A, also known as Aggregatin, is a protein that in humans is encoded by the gene FAM222A. Its cellular function is not well understood, but it has been implicated in Alzheimer's disease.

Gene
FAM222A is also called C12orf34. It is located on chromosome 12 at q24.11 and spans 56,672 bp. The mRNA is 3,685 bp long, of which the coding region accounts for 1,359 bp.

FAM222A is highly expressed in the brain and spinal cord. It is expressed to a lesser extent in the cerebellum, pituitary gland, adrenal gland and testis.

mRNA
FAM222A has three mRNA splice variants. The most common mRNA is 3,685 bp long, with a coding region of 1,359 bp. The mRNA consists of three exons and has two different isoforms in humans. The Kozak sequence is not well conserved in FAM222A, and the mRNA has a non-canonical polyadenylation site.

Protein
Aggregatin is a protein made of 452 amino acids. It contains a domain of unknown function called pfam15258 which is 200 amino acids long.
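The 452-amino-acid length is consistent with the 1,359 bp coding region given above. A minimal sketch of the arithmetic, assuming the stated coding-region length includes the stop codon (the function name is hypothetical and chosen only for illustration):

```python
def protein_length_from_cds(cds_length_bp: int) -> int:
    """Return the number of amino acids encoded by a coding region.

    Assumption: the length includes the terminal stop codon, which
    does not code for an amino acid.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")
    codons = cds_length_bp // 3   # each codon is three nucleotides
    return codons - 1             # drop the stop codon


print(protein_length_from_cds(1359))  # 453 codons, 452 amino acids
```

Under that assumption, 1,359 bp corresponds to 453 codons, one of which is the stop codon, leaving 452 amino acids.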

It has been found in the protein plaques that form in the brains of patients with Alzheimer's disease. FAM222A has an unusually high proline content, including a run of six consecutive prolines from amino acids 392 to 397.
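A run of consecutive identical residues like this one can be located by scanning the sequence. The sketch below is illustrative only: the function name is hypothetical and the input is a short toy sequence, not the actual FAM222A sequence.

```python
from itertools import groupby


def residue_runs(sequence: str, residue: str = "P", min_length: int = 6):
    """Return (start, end) positions, 1-based and inclusive, of every run
    of `residue` that is at least `min_length` long."""
    runs = []
    position = 1
    for symbol, group in groupby(sequence):
        length = len(list(group))
        if symbol == residue and length >= min_length:
            runs.append((position, position + length - 1))
        position += length
    return runs


# Toy sequence (not FAM222A): two residues, six prolines, three more residues.
toy_sequence = "MA" + "P" * 6 + "GSP"
print(residue_runs(toy_sequence))  # [(3, 8)]
```

Positions are reported using 1-based numbering, the same convention as "amino acids 392 to 397" above.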

Structurally, FAM222A has five domains connected by linker regions.

Analysis of the amino acid sequence suggests that FAM222A is localized in the nucleus.

Expression and regulation
FAM222A is highly expressed in the brain and to a lesser extent in the adrenal glands.

Alzheimer’s disease appears to increase FAM222A levels in the brain, whereas other neurodegenerative diseases such as Parkinson’s disease do not.

Interacting proteins
FAM222A has been found to interact mainly with transcription factors, chiefly pre-B-cell leukemia transcription factors and Homeobox Meis proteins.

Homologs
FAM222A has only one paralog in humans, FAM222B, which is also not well characterized. The two proteins share only about 20% identity.
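Percent identity is computed from a pairwise alignment of the two sequences. The sketch below uses one common convention, assumed here: identical columns divided by total alignment length, with gaps counted as mismatches. Other conventions give somewhat different values. The function name and the two aligned strings are hypothetical toy data, not the FAM222A or FAM222B sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same
    residue. Gap characters ('-') never count as a match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment: 5 identical columns out of 8.
print(percent_identity("MKT-APLS", "MRTQAP-S"))  # 62.5
```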

It has many orthologs, but these are restricted to jawed vertebrates, extending as far back as bony and cartilaginous fish. Overall the protein is well conserved, with a lowest identity of around 50%, and certain regions are very strictly conserved, such as the beginning of pfam15258 and the last 60-70 amino acids at the C-terminus.

The protein appears to be evolving very slowly, even across distantly related animals. Its rate of change is only slightly higher than that of cytochrome c, a highly conserved protein.