User:Student of biological sciences/sandbox



= Gene =

C4orf19 is located on 4p14, spans 142,692 base pairs, and contains 7 exons. It is also known by the alias FLJ11017. Neighboring genes include MIR4801, NWD2, RELL1, and ENSG00000248936.

= mRNA =

In Homo sapiens, two mRNA transcript variants, variant 1 (NM_001104629.2) and variant 2 (NM_018302.3), encode the same C4orf19 protein. NM_001104629.2 is 3,643 base pairs long with 4 predicted exons, and NM_018302.3 is 3,633 base pairs long with 3 predicted exons. Because variant 1 is the longer of the two, it is the primary transcript used in this analysis.

= Structural and Molecular Properties of protein =

A single isoform of the C4orf19 protein is present in Homo sapiens: uncharacterized protein C4orf19 isoform X1 (XP_011512015.1), which is 314 amino acids long. The protein has a theoretical molecular weight of approximately 34 kilodaltons and an isoelectric point of 4.35, which suggests that it is acidic.

Table 1. General properties of the C4orf19 protein in Homo sapiens and its orthologs in other species with regard to structure and composition.

The conserved DUF domain shown in Fig. 1 and Table 1, for Homo sapiens and its orthologs alike, is "Domain of Unknown Function 4699" (accession number pfam15770). Table 1 and Fig. 1 give a basic overview of the domains and motifs present in the C4orf19 protein.
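As a rough sanity check on the molecular weight quoted above, a protein's mass can be approximated from its length using an average residue mass of about 110 Da. That average is a common rule of thumb and an assumption here, not the method used to obtain the reported value, and the function name is illustrative.

```python
# Rule-of-thumb estimate of protein molecular weight from sequence length.
# AVERAGE_RESIDUE_MASS_DA is an assumed average (about 110 Da per residue);
# tools that work from the actual sequence give a more precise value.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mw_kda(n_residues: int) -> float:
    """Return an approximate molecular weight in kilodaltons."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# 314 residues is the protein length given in the text.
print(f"{estimate_mw_kda(314):.1f} kDa")
```

For 314 residues the estimate comes out near 34.5 kDa, which is in line with the approximately 34 kDa stated in the text.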

Fig. 2 Secondary structure prediction based on a multiple sequence alignment, generated with the Ali2D program, which is available on the MPI Bioinformatics Toolkit website under the "2ary Structure" tab. The intensity of the coloring indicates the relative confidence of each prediction. Red and blue indicate predicted alpha-helices and beta-sheets, respectively.

Fig. 3 Zoomed-in image of 2NBI_A. In the 2NBI_A protein, the cysteine at position 193 forms a disulfide bond with the cysteine at position 263. Position values are given with respect to the plus strand direction.

Fig. 4 Complete image of the overall 3-D structure of the 2NBI_A protein, as viewed in the iCn3D program. 2NBI_A is described as the protein most closely related to C4orf19 in structure.

= Regulation =

Promoter and Transcription Factor Binding Sites
Fig. 5 Transcription factor binding sites along the length of the promoter on the "+" and "-" strands. The start sites of transcription and translation are indicated.

Fig. 6 Annotated sites along the promoter, distinguished by the letter color and/or underline color used for each annotation.

Expression Pattern
Fig. 7 RPKM levels are elevated in the colon, stomach, small intestine, and other closely associated digestive tract areas relative to the other tissues studied. Moderate RPKM levels are also observed in the liver and urinary bladder.
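RPKM (reads per kilobase of transcript per million mapped reads) normalizes a raw read count by transcript length and sequencing depth. A minimal sketch of the calculation follows; the function name and the example numbers are hypothetical and are not taken from the data behind the figure.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total reads must be positive")
    kilobases = transcript_length_bp / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions_of_reads)


# Hypothetical numbers: 500 reads on a 2,000 bp transcript,
# in a library of 10 million mapped reads.
print(rpkm(500, 2_000, 10_000_000))
```

Because both length and depth are divided out, RPKM values for one gene can be compared across tissues sequenced to different depths, which is how the tissue comparisons in these figures are read.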



Fig. 8 Moderate RPKM levels are observed in digestive tract areas such as the small intestine and stomach, while the placenta and kidney fall within the highest RPKM range. It is unknown whether the colon was assessed.

Fig. 9 In Mus musculus, the colon, stomach, and other closely associated digestive tract areas have relatively high overall RNA levels, while a few regions outside the digestive tract, such as the kidneys and liver, have moderate RNA counts.

Fig. 10 In Homo sapiens, expression falls within the highest quartile in several tissues, for example the trachea, the colon and other digestive tract areas, the reproductive system (uterus and placenta), and filtering regions such as the liver and pancreas.



Fig. 11 Comparison of C4orf19 expression at different levels of H1F-alpha in Homo sapiens. Overall, C4orf19 expression (blue) does not differ by a full quartile between the control group trials and the three experimental group trials. No significant variation is seen among the four individual protocol groups either.

Fig. 12 Imaging of patient tissue, with whitish coloring indicating the antibody marker. For the C4orf19 protein, the marker is visibly (moderately) present in the cell nuclei, while the extracellular space surrounding the cells shows lower concentrations of the protein.

Protein based Control
Table 2. PSORTII Prediction Results for c4orf19 protein subcellular localization.

Fig. 13 Chart of the likelihood of a signal peptide region occurring within the Homo sapiens C4orf19 protein.

The overall likelihood for a signal peptide sequence being present on the protein is low.

Fig. 14 Protein localization probabilities for different subcellular regions, along with the relative ranges for membranous versus soluble localization.

Fig. 15 Tree of subcellular localization probabilities for the protein; the likelihood of each location is shown by the direction in which the red branches connect.

The conceptual translation for the Homo sapiens C4orf19 protein can be accessed at: Hsa_c4orf19 Conceptual Translation.

Current evidence supports localization of the protein to both the nucleoplasm and cell junctions.

= Evolutionary History and Protein Homology =

There may be a paralogous domain: part of the C4orf19 protein sequence in Homo sapiens, read in the plus strand direction, appears similar to a sequence from exon 7 of the RELL1 protein, read in the minus strand direction. However, a pairwise alignment produced with the EMBOSS Needle tool shows only about a 10% match between the two sequences, so the evidence for a shared paralogous domain is weak. Further research is needed to explore this possibility.
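The pairwise comparison described above is a global alignment: EMBOSS Needle implements the Needleman-Wunsch algorithm and reports identity over the full alignment length. The sketch below shows the idea with a deliberately simple scoring scheme (fixed match, mismatch, and linear gap scores, all assumptions here) rather than the substitution matrix and affine gap penalties that Needle uses, so its numbers are not comparable to Needle output. The function names and the toy sequences are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a, aln_b):
    """Identical, non-gap columns as a percentage of alignment length."""
    if not aln_a:
        return 0.0
    matches = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    return 100.0 * matches / len(aln_a)


# Toy sequences for illustration only, not C4orf19 or RELL1.
aln_a, aln_b = needleman_wunsch("ACGT", "ACT")
print(aln_a)
print(aln_b)
print(percent_identity(aln_a, aln_b))
```

An identity near 10%, as reported for the C4orf19 and RELL1 segments, means that only about one alignment column in ten holds the same residue in both sequences.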

Fig. 16 Graphical and numerical analysis of the relationship between the C4orf19 protein in Homo sapiens and its orthologs in other organisms.

= Functional Roles and Biochemical Information =

The exact function of the C4orf19 protein has not been determined, but it may act in association with other proteins, and these associations need further examination. More research is needed to understand the protein's biochemical properties and their functional significance.

= Interactions with other proteins =

Table 3. Because the evidence suggests that the protein is not strictly confined to the nucleus while it is active, it is reasonable to explore whether the C4orf19 protein interacts with proteins that carry out most or all of their activity outside the nucleus.

= Clinical Significance =

The Homo sapiens C4orf19 protein may play a role in multifocal/multicentric breast cancer (MMBC) and unifocal breast cancer (UBC), depending on the expression level of its gene. Reverse transcriptase PCR assessments indicate that, in cases of MMBC, expression of the C4orf19 gene is significantly downregulated in comparison to the C19orf53, C3orf52, and C15orf48 genes. The opposite trend is observed in cases of UBC, where C4orf19 is significantly upregulated in comparison to the others. This suggests a regulatory relationship between C4orf19 and the other genes in determining the course of breast cancer, which must be investigated further. In addition, real-time PCR measurements of C4orf19 expression suggest that a change in the gene's activity may promote irregular glutamate accumulation in the brains of multiple sclerosis patients.

The C4orf19 protein may be important for understanding the biochemical qualities that promote survival in colorectal cancer. Its gene, along with 8 others, has been classified as a differentially expressed gene for this type of cancer. Based on quantitative real-time PCR analysis, elevated levels of the protein may be linked to a 10% increase in cell survival from around 100 to 150 months in cell-line populations. The associated p-value is 0.0044, with a sample count of 181.

Methylation of the C4orf19 gene may increase early in infancy. The increase appears to occur in two regions: the promoter region within 200 nucleotides upstream of the transcription start site, and the 5' untranslated region. The expected time frame for this event is from roughly 6 weeks to about one year of age.

= References =