User:Larse801/sandbox

=GLT8D1 (Glycosyltransferase 8 Domain Containing 1)=
GLT8D1 (glycosyltransferase 8 domain containing 1) is a protein-coding gene belonging to the glycosyltransferase 8 family. It encodes a transmembrane protein that is generally localized to the Golgi apparatus inside the cell; as an integral membrane protein, it can also be found within the mitochondria and the extracellular space. GLT8D1 is found on chromosome 3 and spans 11,599 bases.

=Gene=
GLT8D1 is a protein that in humans is encoded by the GLT8D1 gene.

==Common aliases==
Common aliases for GLT8D1 are Glycosyltransferase, AD-017, GALA4A, and MSTP139. On chromosome 3, GLT8D1 is neighbored by SPC1, which begins at the intron of GLT8D1, sharing ___ # of bases. Downstream, starting at the exon of GLT8D1, is GNL3, which shares 8,452 bases with the final exon of GLT8D1.

==Transcripts==
GLT8D1 has 6 transcripts, the longest containing 10 exons.

==mRNA==
A single isoform of GLT8D1 was found, which is 769 base pairs and 205 amino acids in length. GLT8D1 is located on the negative strand and contains many glycosylation and metal-binding sites toward the 3' UTR end, just before the poly-A tail. An upstream stop sequence is located before the transmembrane domain (888-890).

=Protein=

==Molecular weight, pI, and amino acid composition==
The primary sequence of GLT8D1 shows variation at a single nucleotide position, between guanine and adenine, with guanine occurring at a higher frequency than adenine. The molecular mass of this protein was calculated to be 41,935 Da.

The theoretical isoelectric point (pI) of the GLT8D1 protein, calculated with ExPASy, is 9.37; this is the pH at which the protein carries no net charge. The high isoelectric point indicates that GLT8D1 is a very basic protein. The molecular weight of the unprocessed, unmodified primary protein sequence was found to be 41.9 kDa. The compositional analysis indicated that the protein is leucine-rich and showed no significant charge bias over the entire protein length, or for the N- and C-termini when analyzed separately. In the charge distribution analysis, there were no positive, negative, or mixed charge clusters, and no charge runs or patterns were found. The count run statistics are shown in Figure 1.

A single high-scoring hydrophobic segment, which is also a transmembrane segment, was found spanning amino acids 8 to 22, a total length of 15 amino acids. The cysteine (C) spacing, H2N-199-C-38-C-132-COOH, is conserved because the GT8-like-1 subfamily sequence is conserved throughout all orthologs, giving the same cysteine spacing. There were no clusters of amino acid multiplets.[20] The periodicity analysis revealed leucine recurring with a period of 5, producing 4 copies between amino acids 17 and 36. In the spacing analysis, a large maximal spacing of lysine (K) was found at positions 5-199.
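The spacing notation for C (cysteine in the one-letter amino acid code) can be derived mechanically from a sequence. The following is a minimal Python sketch of one way to do this; the function name `residue_spacing` and the placeholder sequence (alanine standing in for every non-cysteine residue) are illustrative assumptions, not the actual GLT8D1 sequence or the SAPS implementation.

```python
def residue_spacing(sequence: str, residue: str = "C") -> str:
    """Return spacing notation such as 'H2N-199-C-38-C-132-COOH'.

    Each number is the count of residues between consecutive occurrences
    of `residue`, or between a terminus and the nearest occurrence.
    """
    parts = ["H2N"]
    gap = 0  # residues seen since the last occurrence (or the N-terminus)
    for aa in sequence.upper():
        if aa == residue:
            parts.extend([str(gap), residue])
            gap = 0
        else:
            gap += 1
    parts.extend([str(gap), "COOH"])
    return "-".join(parts)


# Placeholder sequence (NOT the real GLT8D1 sequence): alanine filler with
# two cysteines placed to reproduce the spacing reported in the text.
placeholder = "A" * 199 + "C" + "A" * 38 + "C" + "A" * 132

print(residue_spacing(placeholder))  # H2N-199-C-38-C-132-COOH
print(len(placeholder))              # 371
```

Read this way, the reported spacing corresponds to a sequence of 371 residues (199 + 1 + 38 + 1 + 132).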

Statistical Analysis of Protein Sequences (SAPS) gives a compositional analysis of the GLT8D1 protein sequence. The charge of the amino acids can indicate whether they are linked to a motif or a phosphorylation site. The overall composition of the GLT8D1 protein gives no indication of an excess of positive or negative amino acids on average; this could be due to the length of the protein sequence.[21] The N-terminus and C-terminus were analyzed separately to check for amino acid indicators in these regions.

==Domains and motifs==
GLT8D1 is a single-pass type II transmembrane protein. The transmembrane region spans residues 8-28 and is followed by a region (66-351), GLT8D1_like_1, which represents a subfamily of GT8 and contains multiple putative ligand-binding sites (71..74, 76, 154, 169, 171..173, 198, 242..244, 285..286, 307..308, 328, 330..331, 334), metal-binding sites (171, 173, 328), and glycosylation sites (249, 257). The protein is localized to the Golgi apparatus with a confidence of 4, and to the extracellular space and mitochondria of the cell with a confidence of 2.

==Secondary structure==

ExPASy was used to make a schematic illustration of the GLT8D1 protein. Protein sequences of several orthologs (Rbi, Psi, Apo, Bbe) were also analyzed to annotate the human protein. Two glycosylation sites and three metal-binding sites were found on the lumenal side of the GLT8D1 protein. The 2ZIP server was used to check for leucine zipper domains; none were found for GLT8D1. The PRABI server identified a coiled-coil matrix of MTIDK.



Ali2D was used to predict the secondary structure of GLT8D1. Protein sequences of various GLT8D1 orthologs (Rbi, Psi, Apo, Bbe) were used to verify the alpha-helix and beta-strand stretches. NetWheels was then used to make a helical wheel of the GLT8D1 transmembrane region.



=Gene Level Regulation=
==Regulation of expression==
Regulatory elements for the GLT8D1 gene are GH03J052703 (promoter/enhancer), GH03J053165 (promoter/enhancer), GH03J053141 (promoter/enhancer), and GH03J053140. GLT8D1 contains one glycosylation site (1747-1649) and a single asparagine site (1671-1673).

==Promoter==
The promoter annotated by GeneHancer is GH03J052683, with a score of 640. It is located on chromosome 3 at positions 52717813-52722860 (band 3p21.1), and its genomic size was found to be 5058. GeneHancer annotates GH03J052683 with the CpG islands associated with this promoter, the DNase hypersensitive sites detected within regulatory elements, and the transcription factor binding sites (TFBS) in the promoter regulating GLT8D1. A CpG island on the GLT8D1 enhancer is located on chromosome 3 at positions 52719935-52720335 (band 3p21.1), with a genomic size of 401. This CpG island has a CpG count of 41, and its C count plus G count is 249.[2] The percentage CpG is 20.4%, and the percentage C or G is 62.1%.[2] The ratio of observed to expected CpG is 1.07.
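The CpG island figures above follow from simple arithmetic on the reported coordinates and counts. The sketch below reproduces them in Python, assuming the UCSC-style definitions: percentage CpG = 2 × CpG count / length, percentage C or G = (C count + G count) / length, and observed/expected = CpG count × length / (C count × G count). The function names are illustrative. The individual C and G counts are not given in the text, so the observed/expected ratio is provided only as a function and is not evaluated for this island.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def percent_cpg(cpg_count: int, length: int) -> float:
    """Percentage of bases that sit in a CpG dinucleotide (2 bases each)."""
    return 100.0 * 2 * cpg_count / length


def percent_c_or_g(c_plus_g: int, length: int) -> float:
    """Percentage of bases that are C or G."""
    return 100.0 * c_plus_g / length


def obs_exp_cpg(cpg_count: int, c_count: int, g_count: int, length: int) -> float:
    """Observed CpG count divided by the expected count (C * G / length)."""
    return cpg_count * length / (c_count * g_count)


# Values reported in the text for the CpG island.
length = span_length(52719935, 52720335)

print(length)                                 # 401
print(round(percent_cpg(41, length), 1))      # 20.4
print(round(percent_c_or_g(249, length), 1))  # 62.1
```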



The ENCODE track on the UCSC Genome Browser displays the histone tracks for GLT8D1 for 7 default cell types. The level of expression of GLT8D1 within these cell types is shown with peaks.

RBFOX2 and POLR2A are transcription factors that both show high peaks, with many tissue samples supporting their binding in this area. OREG highlights areas that have been studied to verify that they contain regulatory elements. OREG1614228, on the positive strand, is associated with GLT8D1 as a binding site for the transcription factor FOS.[1] The H3K27Ac enhancer mark peaks and is highly active in the GM12878 cell line and less active in the other cell lines shown.

==Transcription factor binding sites==
The literature-curated TFBSs overlapping the GLT8D1 promoter/enhancer region were analyzed using Genomatix for overlap with TF ChIP signals. Where a TFBS overlapped with a TF ChIP signal, the overlapping cell types were identified.



==Protein localization and abundance==
A graph of the NCBI GEO results for GDS1331 is shown in Figure 1.[1] The expression level of GLT8D1 in these normal tissues is markedly increased (above the 75th percentile at all data points) in the testis (samples GSM194507, GSM194508, GSM194509) and thyroid (sample GSM192528). In most tissues, GLT8D1 has expression levels in the 25th percentile, except in skeletal muscle, which appears to have lower expression of this gene.





GDS596 shows increased expression in the superior cervical ganglion, above the 70th percentile. The parietal lobe, trigeminal ganglion, and ciliary ganglion reach or exceed the 50th percentile of expression.

In GDS4164 (Canis lupus familiaris), GLT8D1 has its highest expression in the liver, ranking slightly above the 75th percentile. GDS4164 shows lower expression in the cerebrum, heart, and jejunum.

GDS55072, for high-grade prostate cancer, showed higher expression in all prostate cancer samples except GSM1095877 when compared to the control (GSM1095876).

RNA-seq data from NCBI for the GLT8D1 gene show the highest expression in the thyroid, testis, and adrenal gland, and the lowest expression in the bone marrow and pancreas.



The Allen Brain Atlas gives the expression levels and locations of the GLT8D1 gene within the Mus musculus brain.

A sagittal section of an adult male Mus musculus brain (specimen 05-0597) shows where the GLT8D1 gene is expressed. Regions of high expression are highlighted, including the hippocampal formation, cortical subplate, and isocortex.

The Human Protein Atlas uses immunohistochemical staining to identify which proteins are present in various tissues. Figure 8 gives GLT8D1 expression in various tissues. GLT8D1 mRNA is highly expressed in bone marrow and lymph node tissues. Protein expression is notably lower than mRNA expression in bone marrow and lymph node tissue, but only slightly lower in male tissues such as the testis. Four antibodies for GLT8D1 are available from Sigma-Aldrich.



==Conditional expression==
GDS3510 shows GLT8D1 expression as a function of CLDN1 level. When CLDN1 is overexpressed, the expression of GLT8D1 decreases; this may be due to a masking effect.

GDS5416 reflects GLT8D1 expression in SW620 colon cancer cells treated with varying concentrations of rosemary extract. The higher the concentration of rosemary extract, the lower the expression of GLT8D1.

In GDS3578, GLT8D1 shows higher expression with beta-catenin depletion in a myeloma cell line. Compared to the control, the beta-catenin depletion samples showed roughly a 50 percent increase in expression.

=Transcript Level Regulation=
A conceptual translation was made using the SixFrame tool. Labels were placed on amino acids that are conserved and potentially important for function.



A multiple sequence alignment of the 5’ UTR was made using Clustal Omega to identify conserved areas between Homo sapiens and its orthologs. The conserved regions, identified in red, can also be seen in the secondary stem-loop structure.

==Structure==
A secondary structure was predicted using mfold. Comparing this structure with the multiple sequence alignment of the UTR shows which untranslated sequences are still conserved.



=Protein Level Regulation=
The DictyOGlyc 1.1 Server predicts O-glycosylation at a site if the potential value at that residue is greater than the threshold. One site on GLT8D1 was predicted to be O-glycosylated: a serine at position 120 of the protein sequence, which was assigned a G as indication. The potential value was 0.8734 while the threshold value was 0.4906, giving a strong indication of this GlcNAc O-glycosylation as modeled on D. discoideum, a slime mold.

The ChloroP 1.1 Server was used to predict chloroplast transit peptides (cTP) and cleavage sites. No cTP was predicted for the GLT8D1 protein sequence.



The YinOYang analysis tool indicates O-GlcNAc modifications that may compete with phosphorylation sites for control of GLT8D1’s activation state. When the protein is phosphorylated it is active, and when dephosphorylated it is inactive. If the protein is O-GlcNAcylated, the added sugar blocks the phosphorylation site, making it harder for the protein to be phosphorylated and returned to its ‘on’ state.

SUMOplot analysis predicts sites for the addition of SUMO, a ubiquitin-like modification. Each motif is given a score; a value closer to 1 indicates a higher probability. SUMOylation can interfere with interactions between a target and its partner and could prevent binding; it can also provide a binding site for an interacting partner, change the conformation of the modified target, or facilitate or antagonize ubiquitination.

Big-PI predicts GPI (glycosylphosphatidylinositol) modification sites. No GPI modification sites were found in the GLT8D1 protein sequence.

=Homology=

==Paralogs==
The single paralog of GLT8D1 is GLT8D2, which is known for its function in stem cell growth and maintaining central corneal thickness.

==Orthologs==
Orthologs of GLT8D1 have been found from primates to as far back as prokaryotes such as Bacillus. Bacillus maintains 45% similarity and 25% identity with GLT8D1's nucleotide sequence despite having diverged from Homo sapiens 4290 MYA. The rate of evolution of GLT8D1 is very slow, with the highest corrected divergence value, 139, found with Bacillus.
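The corrected divergence of 139 quoted for Bacillus is consistent with a Poisson-type correction of the observed divergence, m = -100 × ln(1 - n/100), where n is the percentage of non-identical positions. The text does not state which correction was used, so this formula is an assumption; with the reported 25% identity it gives approximately 139. A minimal Python sketch, with an illustrative function name:

```python
import math


def corrected_divergence(percent_identity: float) -> float:
    """Corrected divergence m = -100 * ln(1 - n/100), n = 100 - % identity.

    ASSUMPTION: a Poisson-type correction for multiple substitutions;
    the source does not name the correction it used.
    """
    n = 100.0 - percent_identity  # observed percent divergence
    if not 0.0 <= n < 100.0:
        raise ValueError("percent identity must be in the range (0, 100]")
    return -100.0 * math.log(1.0 - n / 100.0)


# 25% identity, as reported for Bacillus in the text.
print(round(corrected_divergence(25.0)))  # 139
```

The correction grows faster than the raw divergence because distant sequences are increasingly likely to have changed more than once at the same position.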



==Rate of evolution==
Given the GLT8D1 orthologs, a relative rate of evolution was estimated by using three proteins shared between the orthologs Pan troglodytes, Zonotrichia, Rhinatrema bivittatum, Octopus vulgaris, and Enterococcus sp., and formulating a molecular clock based on their corrected divergence.



=Clinical Significance=

A recombinant human GLT8D1 protein (GLT8D1-047H) has been used for alternative splicing within laboratories.

=Suggested Reading=
* ITIH3 polymorphism may confer susceptibility to psychiatric disorders by altering the expression levels of GLT8D1. Sasayama D, et al. J Psychiatr Res, 2014 Mar. PMID 24373612.
* The association between hip osteoarthritis susceptibility loci and radiographic proximal femur shape. Lindner C, et al. Arthritis Rheumatol, 2015 May. PMID 25939412.
* Comprehensive integrative analyses identify GLT8D1 and CSNK2B as schizophrenia risk genes. Yang CP, et al. Nat Commun, 2018 Feb 26. PMID 29483533.