Fam78b

Family with Sequence Similarity 78-Member B (FAM78B) is a protein of unknown function in humans that is encoded by the FAM78B gene (1q24.1). It has orthologous genes and predicted proteins in vertebrates and several invertebrates, but not in arthropods. It has a nuclear localization signal in the protein sequence and a miRNA target region in the mRNA sequence.

Homology
FAM78B has one paralog, FAM78A, and is conserved across many species. Orthologs are found throughout the vertebrates, but none have been found in arthropods. FAM78B is also found in several invertebrates, including the Pacific oyster and liver fluke. Its paralog, FAM78A, is conserved in further invertebrates such as tunicates, worms, and leeches; these sequences make up the distant homologs of FAM78B. The table below contains a list of FAM78B orthologs with percent identity values and time since divergence values relative to the human FAM78B gene or protein.

Gene
The FAM78B gene is located on the sense (negative) strand of chromosome 1 at location 1q24.1 and spans the chromosomal locus 166039271-166135909, covering a total of 96,638 base pairs. In humans, the gene is split into two exons separated by 95,243 bp of intronic sequence, and its transcript is an mRNA of 1,481 bp. The gene is highly conserved in vertebrates and is also conserved in some invertebrates, such as the Pacific oyster and liver fluke, but not in arthropods.
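
The span quoted above follows directly from the two coordinates. The short Python sketch below reproduces that arithmetic and estimates how much of the locus is intronic; the function and constant names are illustrative only, and the figures are the ones given in the text rather than values re-checked against a genome browser.

```python
# Coordinates and intron length as quoted in the text (not independently verified).
GENE_START = 166_039_271
GENE_END = 166_135_909
INTRON_BP = 95_243


def span_bp(start: int, end: int) -> int:
    """Return the distance in base pairs between two chromosomal coordinates."""
    if end < start:
        raise ValueError("end coordinate must not precede start coordinate")
    return end - start


def intron_fraction(intron_bp: int, locus_bp: int) -> float:
    """Return the fraction of the locus that is intronic."""
    return intron_bp / locus_bp


locus = span_bp(GENE_START, GENE_END)
print(locus)                                       # 96638
print(f"{intron_fraction(INTRON_BP, locus):.1%}")  # 98.6%
```

Counting both end positions inclusively would add one base pair to the span; the text's figure corresponds to the simple difference.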

mRNA
One isoform has been identified in humans. It is composed of two exons that make up an mRNA of 1,481 bp.

Protein
The FAM78B protein has a calculated molecular weight of 30 kDa and a relatively high abundance of tryptophan (W). Its C-terminal region is the most strongly conserved part of the sequence, its secondary structure includes both alpha helices and beta strands, and it localizes to the nucleus of the cell after translation.

General properties
The protein FAM78B consists of 254 amino acids with a predicted molecular weight of 30 kDa. The protein has an isoelectric point of 9.6. FAM78B is histidine poor and has a highly conserved C terminus among its orthologs. The most highly conserved residues are ISDSDG from aa 104-110, WLVA from aa 171-175, VDP---L--R from aa 199-208, and the C terminus, especially NADQVLMW from aa 240-247.
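
Conserved motifs written in this notation, where a dash marks a variable position, can be located in a sequence with a regular expression. The Python sketch below is a minimal illustration: the helper names are hypothetical, and `TOY_SEQUENCE` is an invented test string, not the real FAM78B sequence, so the positions it reports are unrelated to the residue numbers quoted above.

```python
import re

# Motifs as written in the text; "-" stands for any single residue.
MOTIFS = ["ISDSDG", "WLVA", "VDP---L--R", "NADQVLMW"]

# Invented sequence for demonstration only (NOT the FAM78B protein).
TOY_SEQUENCE = "MKISDSDGAAWLVAGGVDPKKKLAARTTNADQVLMWRPKR"


def motif_to_regex(motif: str) -> str:
    """Convert dash notation into a regex where each dash matches one residue."""
    return motif.replace("-", ".")


def find_motifs(sequence: str, motifs: list[str]) -> dict[str, list[tuple[int, int]]]:
    """Return 1-based inclusive (start, end) positions of each motif match."""
    hits = {}
    for motif in motifs:
        pattern = re.compile(motif_to_regex(motif))
        hits[motif] = [(m.start() + 1, m.end()) for m in pattern.finditer(sequence)]
    return hits


for motif, positions in find_motifs(TOY_SEQUENCE, MOTIFS).items():
    print(motif, positions)
```

Running the same function on a sequence retrieved from a protein database would give positions that can be compared with the ranges listed in the text.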

Conservation
The amino acid sequence of FAM78B is highly conserved in mammals, with around 86% to 100% sequence similarity to the human protein. Birds, frogs, and lizards also have a high degree of similarity to the human FAM78B sequence, between 76% and 83%. Fish have between 56% and 66% sequence similarity. The C-terminal end is the most highly conserved region across ortholog-containing species, from mammals to the Pacific oyster. [4]

mRNA level
In Homo sapiens, the mRNA contains one miRNA binding site, targeted by miR-24 at the sequence CUGAGCCA. The site lies toward the 3' end of the mRNA, at positions 88-95 after the stop codon (bp 167,091,390-167,091,397 on chromosome 1). A stem loop at positions 155-172 of the 3' end of the mRNA matches the miRNA site.
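
A miRNA target site is normally the reverse complement of the 5' end of the mature miRNA. The Python sketch below checks that relationship for the site quoted above and shows how such a site could be located in a 3' UTR. Two things are assumptions not taken from the text: the first eight nucleotides of mature miR-24 (`UGGCUCAG`, which should be confirmed against a miRNA database), and `TOY_UTR`, an invented string that is not the FAM78B 3' UTR.

```python
# Site quoted in the text.
TARGET_SITE = "CUGAGCCA"

# ASSUMPTION: first 8 nt of mature miR-24; verify against a miRNA database.
MIR24_5P_END = "UGGCUCAG"

# Invented 3' UTR fragment for demonstration only (NOT the FAM78B UTR).
TOY_UTR = "AAACUGAGCCAUUU"

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(utr: str, site: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of every site occurrence."""
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append((start + 1, start + len(site)))
        start = utr.find(site, start + 1)
    return positions


print(reverse_complement_rna(MIR24_5P_END) == TARGET_SITE)  # True
print(find_sites(TOY_UTR, TARGET_SITE))                     # [(4, 11)]
```

Applied to the real 3' UTR, numbered from the first base after the stop codon, `find_sites` would be expected to report the 88-95 interval given in the text.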

Protein level
The protein contains a conserved nuclear localization signal (RPKR) from aa 248-252.

Expression
FAM78B is expressed in most tissues and is highly expressed in regions of the brain.

Clinical relevance
Three single nucleotide polymorphisms (SNPs) in FAM78B show a statistically significant association with chronic kidney disease: two located in the intron (rs2116519 and rs4074897) and one located in the 5’ UTR (rs987131).