- Source: HCONDELs
hCONDELs are regions deleted from the human genome that contain sequences highly conserved among closely related species. Almost all of these deletions fall within non-coding regions. They represent a new class of regulatory sequences and may have played an important role in the development of the specific traits and behaviors that distinguish closely related organisms from one another.
Nomenclature
The group of CONDELs of a specific organism is specified by prefixing the CONDELs with the first letter of the organism. For instance, hCONDELs refer to the group of CONDELs found in humans whereas mCONDELs and cCONDELs refer to mouse and chimpanzee CONDELs respectively.
Identification of CONDELs
The term hCONDEL was first used in a 2011 Nature article by McLean et al. describing a whole-genome comparison analysis. The analysis first identified a subset of 37,251 human deletions (hDELs) through pairwise comparisons of the chimpanzee and macaque genomes. Chimpanzee sequences highly conserved in other species were then identified by pairwise alignment of chimpanzee with macaque, mouse and chicken sequences using BLASTZ, followed by multiple alignment of the pairwise alignments using MULTIZ. The highly conserved chimpanzee sequences were searched against the human genome using BLAT to identify conserved regions not present in humans. This identified 583 deleted regions, which were then referred to as hCONDELs. Of these, 510 were validated computationally, and 39 of those were also validated by polymerase chain reaction (PCR).
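The final filtering step described above (keeping only deletions that remove conserved sequence) can be sketched as an interval-overlap check. This is a minimal illustration and not the pipeline used by McLean et al.: the function names, the `min_conserved_bp` threshold, and the toy coordinates are all hypothetical, and the sketch assumes that deletions and conserved blocks have already been mapped to chimpanzee coordinates and that conserved blocks do not overlap one another.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open, in chimpanzee coordinates (assumed)
Interval = Tuple[str, int, int]


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of base pairs shared by two intervals (0 if none)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def find_condels(
    deletions: List[Interval],
    conserved: List[Interval],
    min_conserved_bp: int = 1,  # hypothetical threshold
) -> List[Tuple[Interval, int]]:
    """Return deletions that remove at least min_conserved_bp of conserved
    sequence, together with the number of conserved bp removed.
    Assumes the conserved intervals do not overlap each other."""
    hits = []
    for d in deletions:
        removed = sum(overlap_bp(d, c) for c in conserved)
        if removed >= min_conserved_bp:
            hits.append((d, removed))
    return hits


# Toy data, invented for illustration only.
deletions = [("chr1", 100, 500), ("chr1", 1000, 1200), ("chr2", 50, 80)]
conserved = [("chr1", 450, 520), ("chr1", 2000, 2100), ("chr2", 60, 70)]

condels = find_condels(deletions, conserved)
for interval, removed in condels:
    print(interval, "removes", removed, "conserved bp")
```

A quadratic scan like this is only practical for small inputs; a genome-scale analysis would normally sort the intervals or use an interval tree.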
Characteristics
hCONDELs cover approximately 0.14% of the chimpanzee genome. The genome-wide comparison method has identified 583 candidate hCONDELs; however, validation of these predicted deletions confirms only 510 of them. The remainder are either false positives or non-existent genes. hCONDELs have been confirmed through PCR, and 88 percent of them have been shown to be lost from the draft Neanderthal genome as well. hCONDELs, on average, remove about 95 base pairs (bp) of highly conserved sequence from the human genome. The median size of the 510 validated hCONDELs is about 2,804 bp, so the conserved sequence typically makes up only a small part of each deletion. Another noticeable characteristic of hCONDELs (and of other identified groups of CONDELs, such as those from mouse and chimpanzee) is that they tend to be skewed towards GC-poor regions. Simulations show that hCONDELs are enriched near genes involved in hormone receptor signaling and neural function, and near genes encoding fibronectin type III or CD80-like immunoglobulin C2-set domains.
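To make "GC-poor" concrete, the GC content of a sequence is the fraction of its bases that are G or C. The helper below is a minimal sketch; the function name and the example sequences are invented for illustration and are not taken from the original study.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C.
    Ambiguous symbols such as N are ignored."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("G") + seq.count("C")) / counted


# Invented example sequences.
at_rich = "ATATTAAATGCATTA"
gc_rich = "GGCGCCATGCGCGGC"
print(round(gc_fraction(at_rich), 2))
print(round(gc_fraction(gc_rich), 2))
```

Comparing this value for the sequence around each deletion against a genome-wide background is one simple way to check for a skew of this kind.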
Impact in humans
Sialic acid loss
Of the 510 identified hCONDELs, only one has been shown to remove sequence from a protein-coding region: a 92 bp deletion in the human sequence. This deletion results in a frameshift mutation in the CMAH gene, which codes for the cytidine monophosphate-N-acetylneuraminic acid hydroxylase-like protein, an enzyme involved in the production of N-glycolylneuraminic acid, one type of sialic acid. Sialic acids are known to play a crucial part in cell signaling pathways and interaction processes. The loss of this gene is evident in the undetectable levels of N-glycolylneuraminic acid in human tissues, whereas it is abundant in mouse, pig, chimpanzee and other mammalian tissues, and it may provide more insight into the history of human evolution.
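A 92 bp deletion causes a frameshift because 92 is not a multiple of three, so every codon downstream of the deletion is read in a different frame. The short sketch below illustrates this with an invented toy sequence (not the CMAH sequence); the helper names are hypothetical.

```python
from typing import List


def causes_frameshift(deleted_bp: int) -> bool:
    """A deletion inside a coding sequence shifts the reading frame
    unless its length is a multiple of 3 (one codon = 3 bases)."""
    return deleted_bp % 3 != 0


def codons(seq: str) -> List[str]:
    """Split a sequence into complete codons, dropping trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete(seq: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at 0-based position `start`."""
    return seq[:start] + seq[start + length:]


print(causes_frameshift(92))  # 92 = 30 * 3 + 2, so the frame shifts

toy = "ATGGCCAAAGGGTTTTAA"          # invented toy coding sequence
print(codons(toy))                   # original reading frame
print(codons(delete(toy, 3, 2)))     # 2 bp deletion: downstream codons change
print(codons(delete(toy, 3, 3)))     # 3 bp deletion: one codon lost, frame kept
```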
The mechanisms and timing of hCONDEL formation are not entirely understood. However, given that conserved non-coding sequences play a major developmental role through the regulation of genes, their loss in hCONDELs is expected to have developmental consequences that can be observed in human-specific traits. In situ hybridization experiments by McLean et al., using mouse constructs fused to a basal promoter driving LacZ expression, were carried out for hCONDELs near the androgen receptor (AR) locus and the growth arrest and DNA-damage-inducible protein GADD45 gamma (GADD45G) locus. The results suggest a role for deletions that affect regulatory sequences in humans.
Loss of whiskers and penile spines
An hCONDEL located near the androgen receptor (AR) gene may be responsible for the loss of whiskers and penile spines in humans compared with their close relatives, including chimpanzees. The 60.7 kb hCONDEL near the AR locus removes a 5 kb sequence that contains an enhancer for the AR locus. A mouse construct with LacZ expression showed that this hCONDEL region (the AR enhancer) drives expression localized to the mesenchyme of vibrissae follicles and the mesoderm cells of penile organs.
Expansion of brain size
Many hCONDELs are located around genes expressed during cortical neurogenesis. A 3,181 bp hCONDEL located near the GADD45G gene removes a forebrain-specific p300 enhancer binding site. Removal of this region, which is known to function as a suppressor, specifically increases proliferation in the subventricular zone (SVZ) of the septum. The loss of this SVZ enhancer region in an hCONDEL may provide further insight into how DNA sequence changes contributed to the evolution of the human brain, and of humans more generally.