- Source: PRP36
PRP36 (Proline Rich Protein 36) is a predicted extracellular protein in Homo sapiens encoded by the PRR36 (Proline Rich Region 36) gene. The protein contains a domain of unknown function, DUF4596, towards its C terminus. The function of PRP36 is unknown, but high gene expression has been observed in several regions of the brain, including the prefrontal cortex, cerebellum, and amygdala. PRP36 has one alias: Putative Uncharacterized Protein FLJ22184.
Gene
The human PRR36 gene consists of 7 exons and is 5723 base pairs long.
= Locus
=PRR36 is located on the short arm of human chromosome 19 at 19p13.2 (region 1, band 3, sub-band 2). The gene spans base pairs 7868719 to 7874441 on chromosome 19 and lies between two other genes: LYPLA2P2, a pseudogene, and EVI5L, which encodes a protein that regulates Rab GTPase activity.
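The gene length quoted above can be checked against the stated coordinates. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive at both ends (a common convention for genomic annotations, but an assumption here); the helper name `span_length` is hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length in base pairs of a closed interval [start, end].

    Assumes 1-based, inclusive coordinates (an assumption, not stated in the text).
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates of PRR36 on chromosome 19 as given in the text.
gene_length = span_length(7868719, 7874441)
print(gene_length)  # 5723
```

Under this convention the span matches the 5723 base pairs stated for the gene.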
= mRNA and splice variants
=Alternative splicing of the PRR36 gene results in two transcript variants. PRR36 (FLJ22184) Transcript Variant 1, shown in the image below, is 4518 base pairs long and consists of six exons, of which the last five are used in protein coding. The protein produced, PRP36, is made up of 1346 amino acids. PRR36 Transcript Variant 2 is 780 base pairs long, consists of five exons, and theoretically encodes a protein 260 amino acids in length. However, this variant is currently suspected never to be translated.
Only one polyadenylation site has been found in PRR36 Transcript Variant 1.
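The relationship between the transcript lengths and protein lengths given above can be illustrated with simple codon arithmetic. This is only a sketch: it assumes three nucleotides per amino acid and, optionally, one stop codon, and the function names are hypothetical. It says nothing about where the coding region actually sits within either transcript.

```python
CODON_LENGTH = 3  # nucleotides per codon


def coding_nucleotides(protein_length: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a protein, optionally counting a stop codon."""
    codons = protein_length + (1 if include_stop else 0)
    return codons * CODON_LENGTH


def noncoding_nucleotides(transcript_length: int, protein_length: int,
                          include_stop: bool = True) -> int:
    """Transcript nucleotides left over outside the coding sequence."""
    return transcript_length - coding_nucleotides(protein_length, include_stop)


# Transcript Variant 1: 4518 bp transcript, 1346 amino acid protein.
print(coding_nucleotides(1346))                               # 4041
print(noncoding_nucleotides(4518, 1346))                      # 477

# Transcript Variant 2: 780 bp transcript, theoretical 260 amino acid protein.
print(noncoding_nucleotides(780, 260, include_stop=False))    # 0
```

Under this simple model, Variant 1 leaves 477 nucleotides outside the coding sequence, whereas the theoretical 260-residue product of Variant 2 would use all 780 nucleotides of the transcript, with no room for a stop codon or untranslated regions.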
Protein
= Domain and motifs
=DUF4596 in human PRP36 is 47 amino acids long, has an isoelectric point of 3.77, and is almost completely conserved across mammals. Despite lacking a signal peptide, PRP36 is predicted to be secreted from the cell after it undergoes processing.
Several tandem repeats, separated repeats, and repeated sequences exist throughout PRP36. These repeats are observable in primate PRP36 orthologs but are absent in PRP36 orthologs from more distantly related species such as the opossum, suggesting that the PRP36 sequence has continued to change in relatively recent evolutionary history.
= Composition
=PRP36 is 1346 amino acids long and is proline rich, meaning that proline residues make up a greater proportion of the protein, including the DUF4596 domain, than they do in other human proteins. Proline rich proteins are often observed to be intrinsically unstructured and have been connected with protein-protein interactions in signaling pathways. However, it is not certain whether these traits hold true for PRP36. In PRP36 the amino acids isoleucine, tyrosine, and asparagine are present at a lower proportion than in a typical human protein. Two highly positive sequences exist towards the N terminus of PRP36, while a highly negative sequence exists within the DUF4596 domain towards the C terminus. As a whole, however, PRP36 appears to be a basic and overall positively charged protein, with a corresponding isoelectric point of 10.98. PRP36 is a polar and soluble protein.
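The kind of compositional analysis described above can be sketched in a few lines. The example below is illustrative only: the sequence is a short made-up peptide, not the PRP36 sequence, and the function names are hypothetical. The charge estimate is a crude residue count (lysine and arginine minus aspartate and glutamate) that ignores histidine, the termini, and pH, so it is not an isoelectric point calculation.

```python
from collections import Counter


def composition(sequence: str) -> dict[str, float]:
    """Fraction of each amino acid (one-letter code) in a protein sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence)
    return {residue: count / len(sequence) for residue, count in counts.items()}


def crude_net_charge(sequence: str) -> int:
    """Count of basic residues (K, R) minus acidic residues (D, E).

    A rough indicator only; ignores histidine, termini, and pH.
    """
    counts = Counter(sequence.upper())
    return (counts["K"] + counts["R"]) - (counts["D"] + counts["E"])


# Hypothetical toy peptide, NOT taken from PRP36.
toy_peptide = "MPPSPAPKRP"
print(composition(toy_peptide)["P"])   # 0.5
print(crude_net_charge(toy_peptide))   # 2
```

Applied to a full-length sequence, the proline fraction could then be compared with a reference proteome average to judge whether a protein is proline rich; no such reference value is assumed here.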
= Post-translational modifications
=Human PRP36 is predicted to contain 24 phosphorylation sites: 14 serine, 9 threonine, and 1 tyrosine. Additionally, there are 8 predicted N-acetylglucosamine attachment sites and 2 highly conserved predicted SUMOylation sites.
= Secondary structure
=The secondary structure of PRP36 has not been determined experimentally, but predictions derived from the PRR36 mRNA sequence suggest some possibilities. Alpha helices, beta sheets, and other structural features are not conserved across PRP36 orthologs, with the exception of an alpha-helix, alpha-helix, beta-strand, beta-strand motif that is highly conserved across mammals. This motif begins slightly before the DUF4596 region and carries into it, suggesting that this domain is highly important for PRP36 function.
= Interacting proteins
=PRP36 has medium scores for predicted interaction with two other proteins of unknown function, OVCH1 and FAM179A. These predictions, however, have not been experimentally verified, so confidence in these protein-protein interactions is not very high.
= Cellular location
=No signal peptide or other marker is predicted to exist within the PRP36 sequence. However, according to Phobius, PRP36 is predicted to be a non-cytoplasmic protein located in the extracellular space. If this prediction is correct, it might indicate that PRP36 undergoes unconventional protein secretion.
Expression
= Promoter
=Genomatix predicts a single promoter for the gene encoding PRP36. This promoter lies on the minus strand from position 7939226 to 7939826 and is 601 base pairs in length. The promoter region contains predicted binding sites for transcription factors of various types, including various zinc fingers, E2F factors, and CDF factors. Of particular note is the presence of an XGene Promoter Element on the minus strand, which mediates RNA polymerase II transcription from promoters lacking a TATA box, as is the case for this promoter. The following table gives 12 transcription factors predicted by the ElDorado tool from Genomatix to bind this promoter; all listed factors received a minimum Matrix Sim score of 0.877.
= Expression
=Unigene's EST cDNA Tissue Abundance display and Protein Atlas show PRP36 as having significant expression levels in the brain, embryonic tissue, eyes, intestines, kidneys, nerves, and ovaries. Additional evidence supports some of these findings: analysis of normal tissues revealed that over 50% of the cells in the cerebellum, fetal brain, prefrontal cortex, and superior cervical ganglion expressed PRP36. PRP36 appears to be over-expressed in cell samples taken from patients with ductal carcinomas of the mammary gland, suggesting that the disease state and PRP36 expression might be connected.
Homology
= Orthologs
=PRP36 has no known paralogs in humans, but a number of orthologs have been found in species across the mammals. PRP36 is highly conserved across primates, although a few short sequences unique to the human version of the gene do exist. The lack of conservation across all mammals suggests that PRP36 has evolved rapidly. However, the DUF4596 region is highly conserved across mammals, suggesting that the domain is critical to PRP36 function while the rest of the protein tolerates sequence change more readily without harmful effects. A list of orthologs for PRP36 can be found below.
= Evolution
=Multiple sequence alignment suggests that PRP36 evolved early in the mammalian lineage. Mammals very distantly related to human beings, such as the opossum, have a version of PRP36, suggesting that the protein arose prior to that evolutionary divergence. However, with the exception of the DUF4596 domain, very few areas within the PRP36 sequence are conserved.
Clinical significance
At this time, the function of the PRP36 protein is not known. However, some speculation about its function can be made. In 2009, a patient's phenotypes were traced to a genetic condition involving a 19p13.2 microdeletion, in which a very small piece of the chromosome was missing (the entire 19p13.2 region was not missing). Additional diagnoses have since been made, and a few patients have been found to have microdeletions that involve the region in which the PRR36 gene is found, meaning the PRP36 protein would not be produced from the affected chromosome in these individuals. However, this region also includes other genes whose functions are well known; for example, the obesity observed in the patients can be traced to the deletion of the insulin receptor gene. Other symptoms, such as learning disabilities and speech impediments, can be tied to similar gene deletions. It is nevertheless possible that the absence of PRP36 causes a minor disability that is masked by these other symptoms. It is also possible that PRP36 plays a secondary role alongside one or more of the other deleted genes. This second possibility is weakly supported by the observation that other proline rich proteins of known function, encoded both on human chromosome 19 and on other chromosomes, tend to be involved in protein-protein interactions more often than proteins in general.