User:Shocked-Lemur983/sandbox

Coiled-coil domain containing 121 (CCDC121) is a protein encoded by the CCDC121 gene in humans. CCDC121 is located on the minus strand of chromosome 2 and encodes three protein isoforms. All isoforms of CCDC121 contain a domain of unknown function referred to as DUF4515 or pfam14988.

=Gene=

Aliases, locus and size
CCDC121 has known aliases of FLJ43364, FLJ13646, hCG_1988995, LOC79635, and coiled-coil domain containing 121.

CCDC121 is located on the minus strand of chromosome 2 at 2q23.3. It is 3,394 base pairs in length.

Isoforms and alternative splicing
CCDC121 produces four different mRNAs: three alternatively spliced variants and one unspliced form. The three alternatively spliced mRNAs give rise to three known protein isoforms. Transcripts for isoforms 1-3 are 2,880, 2,762 and 2,361 base pairs in length respectively. Each of the mRNA variants contains two exons separated by a gt-ag intron.

=Protein=

Primary sequence, molecular weight and pI
Molecular weight and pI were calculated using ExPASy.
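As an illustration of how a molecular weight can be estimated from a primary sequence, the following minimal sketch sums standard average residue masses and adds one water molecule for the free termini. It is not the ExPASy implementation, and the function name `molecular_weight` and the table `RESIDUE_MASS` are hypothetical names introduced here. The CCDC121 sequence itself is not reproduced; any one-letter amino acid string can be passed in.

```python
# Average residue masses in daltons (free amino acid minus one water).
# Assumption: unmodified residues, standard 20 amino acids only.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # mass of one water molecule for the N- and C-termini


def molecular_weight(sequence: str) -> float:
    """Return the average molecular weight (Da) of an unmodified peptide."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER
```

Post-translational modifications change the mass, so a value computed this way applies only to the unmodified polypeptide.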

Compositional analysis
Compositional analysis of all isoforms shows that they have below-average levels of aspartate (D) and valine (V) and above-average levels of glutamine (Q). In addition, they have above-average levels of lysine (K) and arginine (R) groupings. Isoform 3 also exhibits above-average lysine and glutamine levels and below-average proline and glycine levels. Chimp, dog, and ferret orthologs also exhibited above-average glutamine levels and lysine and arginine groupings.
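The first step of a compositional analysis, counting how often each residue occurs, can be sketched as follows. This is a minimal illustration and not the tool used for the analysis above; the function name `composition` is hypothetical. Deciding whether a level is above or below average additionally requires reference frequencies for a background protein set, which are not shown here.

```python
from collections import Counter


def composition(sequence: str) -> dict[str, float]:
    """Return the percentage of each residue in a one-letter sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    # An empty sequence yields an empty dict, so no division by zero occurs.
    return {aa: 100 * n / len(seq) for aa, n in counts.items()}
```

For a toy input such as `composition("AAKQ")`, alanine makes up half of the residues and lysine and glutamine a quarter each.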

Secondary and tertiary structure
The secondary structure prediction for CCDC121 was obtained using Ali2D. CCDC121 adopts a predominantly alpha-helical secondary structure due to the presence of the coiled-coil motifs.

The tertiary structure of CCDC121 is composed mostly of alpha helices and contains some random coil.

Domains, motifs and post-translational modifications
CCDC121 contains one domain of unknown function, DUF4515 or pfam14988. It also contains three predicted coiled-coil motifs from residues A165 to E192, L264 to E305 and N363 to E397.
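The lengths of the three predicted coiled-coil motifs follow directly from the residue positions given above. The short sketch below does the arithmetic, assuming the ranges are inclusive at both ends; the names `COILED_COILS` and `span_length` are introduced here for illustration only.

```python
# Residue ranges of the predicted coiled-coil motifs, as stated in the text.
# Assumption: both the start and end residues belong to the motif.
COILED_COILS = [(165, 192), (264, 305), (363, 397)]


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive range."""
    return end - start + 1


lengths = [span_length(start, end) for start, end in COILED_COILS]
total = sum(lengths)
print(lengths, total)  # [28, 42, 35] 105
```

Under this assumption the motifs span 28, 42 and 35 residues, or 105 residues in total.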

CCDC121 is predicted to have post-translational modification sites for acetylation, Protein Kinase C and Casein Kinase II phosphorylation, glycation, GalNAc O-glycosylation, SUMOylation, and O-β-GlcNAc attachment.

Subcellular localization
Current evidence suggests that CCDC121 is partially localized in the nucleus. CCDC121 has a predicted nuclear localization signal from amino acids R327 to L337. This sequence has a score of 7, which is consistent with a protein that is partially nuclear. In addition, PSORT II predicted a 56.5% chance that CCDC121 is found in the nucleus.

There is also evidence to suggest that CCDC121 is partially located in the cytosol. Cytochemistry studies using the anti-CCDC121 antibody from The Human Protein Atlas indicate that CCDC121 is expressed in the cytosol and actin filaments. These tentative results are promising, but further research with other anti-CCDC121 antibodies is needed. Additionally, TargetP did not find a mitochondrial transit peptide, which suggests that CCDC121 is likely not a mitochondrial protein.

=Expression and Function=

CCDC121 is expressed at the highest levels in the testes, ovaries, prostate, and thyroid. Its expression is 40% lower than that of the average gene, so it is considered to have a low level of expression.

The function of the CCDC121 protein is not yet well understood. There is no known phenotype associated with the CCDC121 gene.

=Homology=

Rate of molecular evolution
Cytochrome c is a highly conserved protein and fibrinogen is a rapidly evolving protein. CCDC121 has a faster rate of molecular evolution than both of these proteins, suggesting that CCDC121 evolves very rapidly.

Orthologs
There are 126 confirmed orthologs of CCDC121. CCDC121 orthologs are most abundant in mammals. 122 of the 126 orthologs are within the Eutheria, Marsupialia, and Monotremata clades. 119 of the 122 mammalian orthologs are found within eutherian mammals. The four remaining orthologs are found in the two-lined caecilian, the West Indian Ocean coelacanth, the electric eel, and the northern pike. These orthologs represent the Amphibia, Sarcopterygii, and Actinopterygii clades respectively. The CCDC121 gene likely appeared 433 million years ago in a common ancestor of Actinopterygii and Sarcopterygii.

Paralogs
CCDC166 is the only known paralog of CCDC121. The two proteins share 23% sequence identity. Both CCDC121 and CCDC166 include the Domain of Unknown Function 4515 (DUF4515), or pfam14988, as a highly conserved sequence.
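A sequence identity figure such as the one above is computed from a pairwise alignment. The sketch below shows one common convention, assumed here: identical columns divided by the number of columns in which neither sequence has a gap. Other tools use different denominators, so values are not always directly comparable. The function name `percent_identity` and the toy sequences in the usage note are hypothetical and are not CCDC121 or CCDC166 data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != gap and b != gap
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100 * matches / len(columns)
```

For the toy alignment `percent_identity("ACDE-F", "ACGEKF")`, five columns are ungapped and four of them match, giving 80.0.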

=Clinical significance=

Mutations in the CCDC121 gene have been found in patients with certain cancers, including endometrial, lung, bladder, gastric (stomach), head and neck, and prostate cancers, but no causal relationship has been determined. CCDC121 may also serve as a marker gene for inner ear development.

=References=