User:Maeylo/sandbox

Single-pass membrane and coiled-coil domain-containing protein 3 is a protein that is encoded in humans by the SMCO3 gene.

Aliases
SMCO3 has two aliases: C12orf69 and LOC440087.

Location
SMCO3 is located on the negative strand of chromosome 12 (12p12.3) and spans 10,460 base pairs (chr12:14,803,723-14,814,182). It has 2 exons that flank a single intron.
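The quoted span can be checked against the coordinates directly. The short sketch below assumes the coordinates are 1-based and inclusive, as genome browsers usually display them; the function name is illustrative only.

```python
# Coordinates quoted in the text (assumed 1-based, inclusive).
START = 14_803_723
END = 14_814_182


def span_length(start: int, end: int) -> int:
    """Return the number of bases in a 1-based, inclusive interval."""
    return end - start + 1


gene_span = span_length(START, END)
print(gene_span)  # 10460
```

Under that assumption the result matches the 10,460 base pairs stated above. Strand does not affect the length, only the direction of transcription.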

Gene Neighborhood
SMCO3 is flanked by WW domain binding protein 11 (WBP11) and Ecto-ADP-ribosyltransferase 4 (ART4) on the minus strand and overlaps with C12orf60 on the plus strand. There is only a single isoform of this gene.

Expression
SMCO3 is expressed at very low levels in several human tissues, including cervix, connective tissue, eye, lung and prostate. The highest expression of SMCO3 is seen in the kidney, liver and spleen. SMCO3 is also expressed at higher levels in cancers, especially chondrosarcoma and clear-cell renal cell carcinoma. SMCO3 expression is seen only in the fetus and adult and not in embryoid bodies, blastocysts, or the infant and juvenile stages of development.

The expression of SMCO3 appears to depend upon the species, with the Mus musculus homolog of SMCO3 expressed at much higher levels in the eye compared to humans.

Promoter
The promoter region of SMCO3 is 1,100 base pairs long and begins 961 base pairs upstream of the 5' UTR with the end of the promoter completely overlapping the first exon.

Variants
There are 2,152 known nucleotide-level variants, of which 27 are coding synonymous single nucleotide polymorphisms (SNPs). The vast majority of SNPs occur within the intron, with only a quarter occurring in translated regions. No SMCO3 variants are known to be associated with any disorder.

Splice Variants
The mRNA transcript of SMCO3 is 2,104 base pairs long. There are no mRNA variants of SMCO3.

Regulation
The SMCO3 promoter has many transcription factor binding sites, including sites for cartilage homeoprotein 1, cAMP-responsive element binding proteins, the PAR/bZIP family and vertebrate TATA binding protein factor.

General Properties
SMCO3 is 225 amino acids long with a predicted molecular weight of 24.9 kDa. It is a slightly basic protein with a predicted isoelectric point of 8.3.
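As a rough sanity check on the predicted weight, protein mass can be approximated from length alone. The sketch below uses a common rule of thumb of about 110 Da per residue. That figure is an assumption for illustration and not the method used for the prediction above, which would depend on the actual sequence.

```python
# Rule-of-thumb average residue mass in daltons (an assumption, not sequence-specific).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


estimated_kda = estimate_mass_kda(225)
print(f"{estimated_kda:.2f} kDa")  # 24.75 kDa
```

The estimate of roughly 24.75 kDa is close to the predicted 24.9 kDa, as expected for a protein without an unusual amino acid composition.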

Composition
SMCO3 is comparatively enriched in lysine and comparatively poor in proline and phenylalanine relative to other human proteins. SMCO3 contains several long, uncharged segments but does not have any significantly charged segments. Despite being a transmembrane protein, it has no significantly hydrophobic or significantly hydrophilic regions.

Domains and Motifs
SMCO3 has a single domain, DUF4344 (aa15-221), which is currently uncharacterised. C12orf60 also contains this domain. SMCO3 contains a single transmembrane region (aa155-175) and two coiled-coil regions (aa62-92, aa183-207). The C-terminus of SMCO3 contains a KKXX-like motif, suggesting endoplasmic reticulum localisation.
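A C-terminal dilysine motif of this kind can be screened for with a simple pattern match. The sketch below checks for the canonical K-K-X-X ending and the commonly described K-X-K-X-X variant. The sequences are hypothetical toy peptides, not the SMCO3 sequence, and the function name is illustrative.

```python
import re

# Canonical dilysine motif: lysines at positions -4 and -3 from the C-terminus.
KKXX = re.compile(r"KK[A-Z]{2}$")
# Commonly described variant: lysines at positions -5 and -3.
KXKXX = re.compile(r"K[A-Z]K[A-Z]{2}$")


def has_dilysine_motif(sequence: str) -> bool:
    """Return True if the sequence ends in a KKXX or KXKXX motif."""
    seq = sequence.strip().upper()
    return bool(KKXX.search(seq) or KXKXX.search(seq))


# Hypothetical toy sequences for illustration only.
toy_sequences = {
    "canonical": "MAGLSTTEKKLE",  # ends in KKLE
    "variant": "MAGLSTTKEKLE",    # ends in KEKLE
    "neither": "MAGLSTTEKLEE",    # no lysines at the required positions
}

motif_hits = {name: has_dilysine_motif(seq) for name, seq in toy_sequences.items()}
print(motif_hits)
```

A match only indicates that the sequence is compatible with the motif. The motif is functional only when it sits at the cytosolic C-terminus of a membrane protein, which a pattern match alone cannot establish.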

Structure
The secondary structure of SMCO3 consists of several α-helices and coiled-coil regions and a single β-pleated sheet. Orthologs of SMCO3 similarly show secondary structure dominated by α-helices. The tertiary structure consists of three large helices, six smaller helices and one small pleated sheet.

Conservation
The amino acid sequence of SMCO3 is highly conserved compared to other human proteins. Its level of sequence divergence is dramatically lower than expected, even compared to proteins known to have low levels of sequence divergence with time.

Homology
SMCO3 is largely conserved in amniotes. Orthologs have been identified in many mammals, reptiles and birds. The closest ortholog is found in Pan troglodytes and has 99.7% sequence similarity. More distant homologs have also been identified in a select few bony fish, but orthologs are not seen in cartilaginous fish, insects or other invertebrates. No paralogs of SMCO3 have been identified in humans.

Post-Translational Modifications
The N-terminus of SMCO3 is cleaved: the first methionine residue is removed and the new N-terminus is acetylated, which improves stability. Additionally, there are several sites that are likely phosphorylated and a single N-linked glycosylation site, which is typical of ER integral membrane proteins. Unlike typical ER integral membrane proteins, SMCO3 has no signal sequence.
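Candidate N-linked glycosylation sites are conventionally located by searching for the sequon N-X-S/T, where X is any residue except proline. The sketch below shows that search on a hypothetical toy peptide. It is not the SMCO3 sequence, and it is not necessarily how the site mentioned above was predicted.

```python
import re

# Sequon N-X-[S/T] with X != P; the lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical toy peptide for illustration only.
toy_peptide = "MKNGTAANPSLLNNTS"
sequon_positions = find_sequons(toy_peptide)
print(sequon_positions)  # [3, 13, 14]
```

In the toy peptide the asparagine at position 8 is skipped because it is followed by proline. A sequon is necessary but not sufficient for glycosylation, so hits from a scan like this remain predictions.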

Sub-Cellular Localisation
SMCO3 contains a transmembrane domain (aa155-175). Additionally, the KKXX-like motif strongly suggests that it is an endoplasmic reticulum integral membrane protein.

Interacting Proteins
SMCO3 is known to interact with five proteins: FUS, MAPK9, OBFC1, PPP2CA and TRIM39. It is not known to take part in any pathway, although its structure indicates that it takes part in protein-protein interactions. PPP2CA, OBFC1, FUS and MAPK9 are all either implicated in cancer or have altered expression in cancer, which suggests that SMCO3 may be useful as an eQTL for certain cancers.

Mutations
Only 3.4% of SNPs were predicted to be deleterious, none of which had any clinical significance.

Disease Associations
Genome-wide association studies (GWAS) showed no significant association of SMCO3 with any disease or trait. SMCO3 is not known to be implicated in any disease. It is, however, expressed at higher levels in certain cancers, especially chondrosarcoma and clear-cell renal cell carcinoma.