CCDC180

Coiled-coil domain-containing protein 180 (CCDC180) is a protein that in humans is encoded by the CCDC180 gene. The protein is predicted to localize to the nucleus and, like many proteins containing coiled-coil domains, is thought to be involved in the regulation of transcription. Because it is expressed most highly in the testes and is predicted to be regulated by SRY and SOX transcription factors, it could be involved in sex determination.

Locus
CCDC180 is located on chromosome 9 at the locus 9q22.33.

Common aliases
CCDC180 is also known by the aliases KIAA1529, BDAG1 (Behçet's Disease Associated Gene 1), and C9orf174.

Gene features
The CCDC180 gene is 71,221 bases long. It contains 37 exons and is oriented on the forward strand of the chromosome.

mRNA
There are no known isoforms or alternative splicing variants of the CCDC180 mRNA.

General features
CCDC180 contains 1,701 amino acids and has a molecular weight of 197.3 kDa. The isoelectric point (pI) is 5.74. The low pI is attributed to a glutamic acid content of 12.9%, which is relatively high compared to other human proteins. CCDC180 also has a glycine content of 3.5%, which is relatively low compared to the average human protein.
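The composition figures above are simple ratios, and a minimal Python sketch can make the arithmetic concrete. The toy peptide and the helper names (composition_percent, approx_residue_count) are hypothetical and introduced only for illustration; the toy peptide is not the CCDC180 sequence. Only the length and the two percentages are taken from the text.

```python
from collections import Counter
from typing import Dict


def composition_percent(sequence: str) -> Dict[str, float]:
    """Return each residue's share of the sequence as a percentage."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {residue: 100.0 * n / total for residue, n in counts.items()}


def approx_residue_count(length: int, percent: float) -> int:
    """Convert a reported percentage back to an approximate residue count."""
    return round(length * percent / 100.0)


# Hypothetical toy peptide, NOT the CCDC180 sequence.
toy = "MEEKLEEGAEELRK"
toy_composition = composition_percent(toy)

# Figures quoted in the text: 1,701 residues, 12.9% Glu, 3.5% Gly.
glu_count = approx_residue_count(1701, 12.9)
gly_count = approx_residue_count(1701, 3.5)

print(f"Toy peptide Glu content: {toy_composition['E']:.1f}%")
print(f"Approximate Glu residues in CCDC180: {glu_count}")
print(f"Approximate Gly residues in CCDC180: {gly_count}")
```

Under these figures, the reported percentages correspond to roughly 219 glutamic acid residues and roughly 60 glycine residues out of 1,701.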

Domains
CCDC180 contains two domains of unknown function (DUFs): DUF4455 and DUF4456. Two coiled-coil regions overlap with the DUFs, and a region of low complexity is very rich in glutamic acid.

Secondary and tertiary structure
The secondary structure of CCDC180 is predicted to be composed almost entirely of alpha helices, with only a few predicted beta sheets. The tertiary structure has not yet been fully characterized, but a model has been predicted by the I-TASSER server at the University of Michigan.

Post-translational modifications
CCDC180 is predicted to undergo a variety of post-translational modifications:
 * Phosphorylation on serine, threonine, and tyrosine residues
 * Tyrosine sulfation
 * Sumoylation
 * O-linked β-N-acetylglucosamine modification of a serine residue

Subcellular localization
CCDC180 is predicted to localize to the nucleus, and it contains four nuclear localization sequences.
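The source does not state which tool produced this prediction. As a rough illustration of how candidate nuclear localization sequences can be located, the sketch below scans a sequence for the commonly cited classical monopartite consensus K-(K/R)-X-(K/R). The function name and the toy sequence are hypothetical; the toy sequence embeds the well-known SV40 large T antigen signal PKKKRKV and is not the CCDC180 sequence. A motif match is only a candidate, not evidence of a functional signal.

```python
import re

# Classical monopartite NLS consensus K-(K/R)-X-(K/R).
# The lookahead lets overlapping candidates be reported.
NLS_PATTERN = re.compile(r"(?=(K[KR].[KR]))")


def find_nls_candidates(sequence):
    """Return (1-based position, matched motif) for every consensus hit."""
    return [
        (match.start() + 1, match.group(1))
        for match in NLS_PATTERN.finditer(sequence.upper())
    ]


# Hypothetical toy sequence, NOT the CCDC180 sequence.
toy_sequence = "MAPKKKRKVED"
for position, motif in find_nls_candidates(toy_sequence):
    print(position, motif)
```

Dedicated predictors typically combine such motifs with additional context, so a simple pattern scan like this one would be expected to over-report.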

Expression
CCDC180 is expressed ubiquitously at low levels throughout the body, with the highest expression consistently observed in the testes. Other tissues with reproducibly high expression include the trachea and eye.

Transcriptional regulation
Transcription of CCDC180 is predicted to be regulated by a 664 base pair promoter region (ID GXP_1829211). This prediction is supported by the transcripts GXT_23217882, GXT_24495001, GXT_24495002, and GXT_24495003. Transcription factors predicted to bind to this promoter region are listed below.
 * CCAAT-enhancer-binding protein
 * KRAB domain zinc finger protein 57
 * Krüppel-like C2H2 zinc finger factors
 * Octamer binding protein
 * SRY box 9
 * GLI zinc finger family
 * RXR heterodimers
 * SOX factors
 * E-box binding factors
 * Nerve growth factor-induced protein C
 * Myc-associated zinc finger
 * GC-binding factor 2
 * X-box binding protein 1
 * Histone nuclear factor P

Interacting proteins
The following proteins have been shown to interact with CCDC180 in yeast two-hybrid assays.

Clinical significance
A single-nucleotide polymorphism (SNP) in the gene that leads to a single amino acid change (S995C) has been shown in a genome-wide association study to be significantly associated with Behçet's disease, and this association led to the alias Behçet's Disease Associated Gene 1 (BDAG1). The role of CCDC180 in the disease phenotype is unknown.

Homology
There are no paralogs in humans for this gene, but there are orthologs in a wide variety of organisms, extending back to single-celled green algae. CCDC180 is not conserved in bacteria, archaea, plants, fungi, or protists. The following table includes a subset of species containing protein orthologs of CCDC180. It is not exhaustive, but it indicates the variety of species containing orthologs of CCDC180.

Evolutionary history
CCDC180 is a relatively quickly evolving gene compared to other well-known genes. There are no known family members, splice variants, or isoforms, and there is no evidence of gene duplications in the history of the gene.