- Source: ORF3c
ORF3c is a gene found in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It was first identified in the SARS-CoV-2 genome and encodes a 41-amino-acid non-structural protein of unknown function. It is also present in the SARS-CoV genome, but was not recognized there until the SARS-CoV-2 homolog had been identified.
Nomenclature
There has been significant confusion in the scientific literature over the nomenclature of the SARS-CoV-2 accessory proteins, especially for several genes that overlap ORF3a. The predicted protein product of the ORF3c gene has at least once been referred to as "3b protein", but it should not be confused with ORF3b, a non-homologous gene. It has also been described under the names ORF3h and ORF3a.iORF1. The recommended nomenclature for SARS-CoV-2 uses the term ORF3c for this gene.
Comparative genomics
ORF3c is an overlapping gene whose open reading frame overlaps both ORF3a and ORF3d in the SARS-CoV-2 genome. This potentially represents a rare example of all three possible reading frames of the same sequence region encoding functional proteins.
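To illustrate what it means for all three reading frames of one region to carry an open reading frame, the following minimal sketch scans the three forward frames of a short DNA string. The sequence `toy` is an invented example built for this illustration, not SARS-CoV-2 sequence, and the function names `find_orfs` and `shared_region` are hypothetical helpers rather than part of any library.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (frame, start, end) for each ATG...stop ORF in the three forward frames.

    `end` is exclusive and includes the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon seen in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


def shared_region(orfs):
    """Return the (start, end) interval covered by every ORF, or None."""
    if not orfs:
        return None
    start = max(o[1] for o in orfs)
    end = min(o[2] for o in orfs)
    return (start, end) if start < end else None


# Invented toy sequence with one ORF in each forward frame (assumption, not viral data).
toy = "ATGCATGCATGCTAACTAGCTGA"
orfs = find_orfs(toy)
print(orfs)                 # [(0, 0, 15), (1, 4, 19), (2, 8, 23)]
print(shared_region(orfs))  # (8, 15)
```

In this toy case positions 8 to 14 are read in all three frames at once, which is the situation described above for the region shared by ORF3a, ORF3c and ORF3d. Finding an open reading frame this way says nothing about whether it is expressed or functional.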
Bioinformatics analyses of Sarbecovirus sequences suggest that the sequence and length of ORF3c are well conserved, indicating that it is likely to encode a functional protein. It appears to be subject to purifying selection.
Properties
Ribosome profiling experiments confirm that the ORF3c gene expresses a protein product. The relatively short 41-residue protein is predicted to contain a transmembrane domain and has features suggestive of a viroporin.
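One common first-pass way to look for a candidate transmembrane segment in a short protein is a sliding-window hydropathy profile. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a cutoff of 1.6, which are conventional choices for this scale; it is not necessarily the method used in the analyses of ORF3c. The peptide `toy` is an invented 41-residue sequence, not the ORF3c protein, and `hydropathy_profile` is a hypothetical helper name.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Invented 41-residue peptide with a hydrophobic core (assumption, not ORF3c).
toy = "MKKSDE" + "LIVALLFVGAILLVAILFLV" + "RKDEEKSDNQRTPGS"
profile = hydropathy_profile(toy)

# Windows whose mean exceeds the cutoff are candidate membrane-spanning segments.
candidates = [i for i, value in enumerate(profile) if value > 1.6]
print(len(toy), len(profile), bool(candidates))
```

A profile like this only flags hydrophobic stretches. Dedicated transmembrane predictors and experimental evidence are needed before concluding that a segment spans a membrane or that a protein acts as a viroporin.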