Coiled-coil domain containing 42B

Coiled-coil domain-containing protein 42B, also known as CCDC42B, is a protein encoded by the CCDC42B gene.

Locus
The CCDC42B gene is located on the plus strand of chromosome 12, at position 24.13 on the long arm (12q24.13). The gene starts at base pair 113,587,663 and ends at base pair 113,597,081, spanning 9,419 bases. Part of CCDC42B overlaps the DDX54 gene (113,594,978-113,623,284). The CCDC42B mRNA is 1,514 bp long and maps to 113,587,663-113,597,081. The CCDC42B protein is 308 amino acids long, has a molecular weight of 35,914 Da, and maps to 113,587,663-113,595,484. The promoter region (GXP_642107) is 859 bp long and is predicted to lie at 113,586,906-113,587,764. The human CCDC42B gene has three neighboring genes: DDX54, RASAL1, and DTX1.
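The coordinate arithmetic above can be checked with a short sketch. It assumes the positions are 1-based and inclusive on a single chromosome 12 assembly (the text does not name the genome build); the `Interval` type and `overlap` helper are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    """A genomic interval with 1-based, inclusive coordinates (an assumption)."""
    name: str
    start: int
    end: int

    def length(self) -> int:
        # Inclusive coordinates: both endpoints count toward the length.
        return self.end - self.start + 1


def overlap(a: Interval, b: Interval) -> Optional[Tuple[int, int]]:
    """Return the shared (start, end) of two intervals, or None if disjoint."""
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    return (start, end) if start <= end else None


# Coordinates as quoted in the Locus section.
CCDC42B = Interval("CCDC42B", 113_587_663, 113_597_081)
DDX54 = Interval("DDX54", 113_594_978, 113_623_284)
PROMOTER = Interval("GXP_642107", 113_586_906, 113_587_764)

print(CCDC42B.length())   # 9419, the stated gene size
print(PROMOTER.length())  # 859, the stated promoter size

shared = overlap(CCDC42B, DDX54)
if shared is not None:
    # Size of the region shared by CCDC42B and DDX54.
    print(shared, shared[1] - shared[0] + 1)  # (113594978, 113597081) 2104
```

Under these assumptions the stated gene and promoter sizes are reproduced exactly, and the overlap with DDX54 comes out at 2,104 bases at the 3' end of CCDC42B.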

The DDX54 gene is a member of the DEAD-box protein family of putative RNA helicases. It encodes a DEAD-box protein, named for the conserved motif Asp-Glu-Ala-Asp (DEAD). The DEAD-box protein family is associated with cellular processes that involve alteration of RNA secondary structure, such as translation initiation, nuclear and mitochondrial RNA splicing, ribosome assembly, spermatogenesis, embryogenesis, and cell growth and division. The RASAL1 protein is a member of the GAP1 family, which suppresses Ras function by promoting the inactive GDP-bound form of Ras, allowing control of cellular proliferation and differentiation. DTX1 functions as a ubiquitin ligase that facilitates the ubiquitination and degradation of MEKK1. The ubiquitin ligase activity of DTX1 regulates the Notch pathway, a signaling pathway for cell-cell communication that regulates cell-fate determination.

Conservation
A protein-protein Basic Local Alignment Search Tool (BLAST) search with human CCDC42B, including Mammalia for closely related species and excluding Mammalia for distantly related species, returned several orthologs with reasonable E-values and with high, medium, or low coverage depending on their relatedness to human CCDC42B. Conservation is higher in the strict (mammalian) orthologs, which include rhesus monkey, whale, pig, cattle, and mouse and show 53%-95% identity. Conservation is lower in the distant (non-mammalian) homologs, which include Drosophila, reptiles, amphibians, and fish and show 23%-40% identity.

Paralogs
The CCDC42B gene has only one major paralog, CCDC42 (CCDC42A).

Orthologs
Orthologs of the human CCDC42B gene are found in about 58 species. CCDC42B is more highly conserved in mammalian orthologs than in non-mammalian ones. Conservation is higher in the strict orthologs (chimpanzee, rhesus monkey, dog, cow, mouse, rat, and chicken), with identities ranging from 69% to 95%. Conservation is lower in the distant homologs (birds, reptiles, amphibians, and fish), with identities ranging from 23% to 40%. The figure compares strict orthologs and distant homologs for conservation of CCDC42B (purple: matched amino acid residues; blue: conserved residues; pink: similar residues; white: different residues).



Phylogeny
A phylogenetic tree constructed with Biology Workbench shows the divergence of CCDC42B across species. The percent identity versus divergence time of each ortholog, compared with the human sequence, is shown below. The figure illustrates the evolutionary history of the CCDC42B gene in the species listed in the ortholog space. Closely related species have higher percent identity, which indicates higher amino acid conservation. Species distantly related to human show lower percent identity, consistent with fewer conserved amino acid residues. The figure highlights the amount of change that has occurred during CCDC42B evolution and the rate of mutation in the gene.

Protein
According to the SAPS tool, the human CCDC42B protein is composed of 308 amino acids encoded by 8 exons. The mature form of the CCDC42B protein has a molecular weight of 35.9 kDa (35,914 Da). The isoelectric point of human CCDC42B is 7.01, the pH at which the protein carries no net charge. The N-terminal residue of the protein sequence is Met (M). The grand average of hydropathicity (GRAVY) was predicted to be -0.694 for human CCDC42B and -0.398 for Drosophila melanogaster CG10750, a distantly related ortholog. The negative GRAVY values indicate that both proteins are hydrophilic and likely soluble. The theoretical instability index (II) is predicted to be 63.73 for CCDC42B and 45.20 for CG10750; because values above 40 predict instability, both proteins are expected to be unstable in a test tube. The half-life of both CCDC42B and CG10750 is predicted to be 30 hours in mammalian reticulocytes (in vitro), which corresponds to the half-life of enzymes responsible for controlling metabolic rate. These results suggest that CCDC42B and CG10750 share similarities in amino acid composition and protein characteristics. Thus, many characteristics of CCDC42B appear to have been conserved across closely and distantly related species.
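As an illustration of how a GRAVY value is obtained, the sketch below averages Kyte-Doolittle hydropathy values over a sequence. It is a minimal sketch, not the tool used for the figures above; the function name `gravy` and the short test peptides are hypothetical, and the CCDC42B sequence itself is not reproduced here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence is empty")
    unknown = set(residues) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


# Hypothetical toy peptides, chosen only to show the sign convention.
print(round(gravy("KKEE"), 3))  # -3.7 (charged residues, hydrophilic)
print(round(gravy("LLVV"), 3))  # 4.0 (aliphatic residues, hydrophobic)
```

A negative result, as reported for both CCDC42B and CG10750, means that hydrophilic residues dominate the average.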

Primary sequence & variants/isoforms
The human CCDC42B gene contains 9 introns and produces 8 different mRNA transcripts: 4 alternatively spliced variants and 4 unspliced variants. Of these transcripts, 2 are predicted to encode very good proteins, 3 to encode good proteins, and 3 to be non-coding.

Domains and motifs
CCDC42B, a protein of unknown function, contains a coiled-coil domain of unknown function (DUF4200) spanning amino acids 34-159. The DUF4200 domain is conserved in eukaryotes. A coiled-coil structure consists of two or more alpha helices wrapped around each other to form a twist. Its sequence follows a heptad repeat pattern, (abcdefg)n, in which positions a and d are hydrophobic and positions e and g are polar or charged.
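The heptad pattern can be made concrete with a small sketch that labels each residue with its heptad position and reports how often the core positions a and d are hydrophobic. The hydrophobic residue set, the function names, and the idealized test peptide are assumptions for illustration; this is not a coiled-coil predictor and it does not use the CCDC42B sequence.

```python
HEPTAD = "abcdefg"
# Assumed hydrophobic set for illustration; classifications vary by source.
HYDROPHOBIC = set("AILMFVW")


def heptad_register(sequence: str, first: str = "a"):
    """Pair each residue with its heptad position, starting at `first`."""
    offset = HEPTAD.index(first)
    return [(res, HEPTAD[(i + offset) % 7]) for i, res in enumerate(sequence)]


def core_hydrophobic_fraction(sequence: str, first: str = "a") -> float:
    """Fraction of residues at positions a and d that are hydrophobic."""
    core = [res for res, pos in heptad_register(sequence, first) if pos in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


# Idealized, hypothetical heptad: Leu at a and d, charged residues at e and g.
peptide = "LKELEEK" * 3
print(heptad_register(peptide)[:7])
print(core_hydrophobic_fraction(peptide, first="a"))  # 1.0 in the right register
print(core_hydrophobic_fraction(peptide, first="b"))  # 0.0 when shifted by one
```

Scanning the seven possible registers and keeping the one with the highest core fraction is one simple way to guess the phase of a candidate coiled-coil segment.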

Post-translational modifications
ExPASy proteomics tools were the primary means of analyzing post-translational modifications of the CCDC42B protein. The N-terminal acetylation site of human CCDC42B (A2) is matched in 5 of 6 orthologs; the exception is Drosophila, which has no Ala, Gly, Ser, or Thr at positions 1-3. N-terminal acetylation is therefore largely conserved. Human CCDC42B also has a conserved SUMOylation site: the lysine (K) at position 285 is conserved in 5 of 6 orthologs, mostly in the closely related organisms. Most of the predicted modifications of CCDC42B are phosphorylation events, which suggests involvement in signaling pathways. The tyrosine phosphorylation site at position 8 (Y8) is fully conserved in all 6 orthologs, and the same site corresponds to a predicted sulfation site. Other phosphorylation sites in the human protein are also conserved in the orthologs (illustrated in the multiple sequence alignment). Some residues in human CCDC42B are subject to competing phosphorylation and O-linked glycosylation. The glycosylation sites occur mostly at serine and threonine residues that can also be phosphorylated by serine/threonine kinases, so phosphorylation of these Ser/Thr residues would prevent O-GlcNAc addition. Human CCDC42B also has a GPI-modification site at alanine (A) 293 that is conserved in 4 of 6 orthologs.

Secondary structure
The secondary structure of the CCDC42B protein is based on alpha helices. The structure is predicted to contain several alpha helices along with random coils. Hairpin loop structures were detected in the 5' UTR and 3' UTR regions of the CCDC42B mRNA. A leucine zipper domain was also found overlapping the coiled-coil domain. The attached image compares human CCDC42B with 5 orthologs and supports the prediction that the secondary structure of human CCDC42B is composed primarily of alpha helices.

3° and 4° structure
Using CBLAST, the CCDC42B protein sequence was aligned with 2I1K_A (Chain A, Moesin From Spodoptera Frugiperda Reveals The Coiled-Coil Domain At 3.0 Angstrom Resolution), giving an E-value of 1.00e-03. The alignment covers residues 164-243 of CCDC42B and residues 302-381 of 2I1K_A, with 22% identity over 80 amino acid residues. The structure shows only the portion of CCDC42B aligned with 2I1K_A. Predicted structure (blue: dissimilar residues; red: conserved residues; gray: CCDC42B residues not aligned with 2I1K_A).
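The identity figure above is a simple ratio over the aligned columns. The sketch below shows one common way to compute it, assuming identity is counted as identical non-gap columns divided by the alignment length; other tools may use a different denominator. The function names and the toy alignment are hypothetical, and the actual CCDC42B/2I1K_A alignment is not reproduced here.

```python
def region_length(start: int, end: int) -> int:
    """Number of residues in an inclusive residue range."""
    return end - start + 1


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical non-gap columns as a percentage of the alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Both aligned regions quoted in the text cover 80 residues.
print(region_length(164, 243))  # 80 (CCDC42B)
print(region_length(302, 381))  # 80 (2I1K_A)

# Hypothetical toy alignment: 6 identical columns out of 8.
print(percent_identity("MKLEELQK", "MRLEDLQK"))  # 75.0
```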

Expression
The Human Protein Atlas reports CCDC42B expression in normal human tissue. Expression of the CCDC42B gene was detected at high to moderate levels in 17 of 78 normal tissues analyzed using the Expressed Sequence Tag (EST) technique. CCDC42B has a narrow tissue expression pattern: expression is high in respiratory epithelia and fallopian tube, moderate in intestine and liver, and low or absent in other normal tissues. Microarray and immunohistochemistry (IHC) data also detected low levels of CCDC42B expression in salivary gland, stomach, skin, bone marrow, and lung. CCDC42B has also been associated with some types of cancer; the gene is expressed at low to moderate levels in tumor cells.

Promoter and Transcription Binding Factors


According to Genomatix, the promoter region contains 859 base pairs and is located on the positive strand of chromosome 12 at 113,586,906-113,587,764, upstream of the CCDC42B gene. The promoter region is predicted to contain binding sites for transcription factors that regulate expression of CCDC42B. The attached image illustrates important transcription factor binding sites in the promoter region of human CCDC42B.

Function / Biochemistry
As of 2014, the function of the CCDC42B gene and protein in Homo sapiens is unknown. However, human CCDC42B is predicted to be involved in flagella assembly and motility.

Interacting Proteins
According to STRING, MINT, and IntAct, human CCDC42B has no known direct interactions with other proteins. A GeneMANIA search identified other associations based on co-expression, as seen in the figure. CCDC42B was found to be co-expressed with other coiled-coil domain-containing proteins (CCDC78 and CCDC153). Because human CCDC42B is expressed at low levels in testis, it is predicted to interact with SPATC1 (spermatogenesis and centriole associated 1).

Disease Association
Human CCDC42B is located on chromosome 12 (12q24.13), a region linked to skeletal deformities, hypochondrogenesis, achondrogenesis, and Kniest dysplasia. According to an OMIM search, chromosome 12 (12q24.1) is linked to Noonan syndrome 1, which is caused by heterozygous mutation in PTPN11, the gene encoding SH-PTP2, and primarily causes facial developmental defects and heart defects.

Mutations
Two SNPs affect residues (Y8 and Q280) that are highly conserved across many ortholog species. Changes at these residues could therefore alter protein function and possibly lead to disease, not only in humans.

Conceptual Translation
Major predicted domains, post-translational modification sites, and structural features are shown in the conceptual translation.