Introns, derived from the term "intragenic regions" and also called intervening sequences (IVS), are DNA regions in a gene that are not translated into proteins. These non-coding sections are present in precursor mRNA (pre-mRNA) and some other RNAs, and are removed by a process called splicing during processing to mature RNA. After intron splicing, the mRNA consists only of exons, which are translated into a protein.
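As a rough illustration of what splicing does to a transcript, the following Python sketch removes introns from a toy pre-mRNA and joins the remaining exons. The function name `splice`, the coordinate convention (0-based, end-exclusive) and the example sequence are assumptions made for illustration only; real splicing is carried out by cellular machinery, not by coordinate lookup.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns (0-based, end-exclusive intervals) and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        # Reject overlapping, inverted or out-of-range intron coordinates.
        if start < position or end < start or end > len(pre_mrna):
            raise ValueError("introns must be non-overlapping and within the transcript")
        exons.append(pre_mrna[position:start])  # exon preceding this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon after the last intron
    return "".join(exons)


# Toy transcript (hypothetical): exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuuuag" + "UUUGAA" + "guccccag" + "UAA"
introns = [(6, 17), (23, 31)]
mature = splice(pre_mrna, introns)
print(mature)  # AUGGCCUUUGAAUAA
```

In this toy example the mature transcript contains only the three upper-case exon segments, in their original order.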
The number and length of introns vary widely among species, and among genes within the same species. Some eukaryotes, e.g. sac fungi, have evolved genomes with few introns, while the genomes of many other eukaryote groups are rich in introns (several per gene).
Alternative splicing of introns within a gene may introduce greater variability in the protein sequences translated from a single gene. mRNA splicing is controlled by a wide variety of signaling molecules.
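How alternative splicing can yield several transcripts from one gene can be sketched with a deliberately simplified model. The Python example below assumes that the first and last exons are always retained and that each internal exon may be either included or skipped; the function name `exon_skipping_isoforms` and the exon sequences are hypothetical. In a real cell splicing is regulated, and only some of these combinations would actually be produced.

```python
from itertools import combinations


def exon_skipping_isoforms(exons: list[str]) -> list[str]:
    """Keep the first and last exon and include every subset of the
    internal exons, preserving their original order."""
    if len(exons) < 2:
        return ["".join(exons)]
    first, internal, last = exons[0], exons[1:-1], exons[-1]
    isoforms = []
    # Start with the full-length transcript, then skip more and more exons.
    for kept_count in range(len(internal), -1, -1):
        for kept in combinations(internal, kept_count):
            isoforms.append(first + "".join(kept) + last)
    return isoforms


# Hypothetical gene with three exons; the middle one may be skipped.
exons = ["AUGGCC", "UUUGAA", "CCGUAA"]
isoforms = exon_skipping_isoforms(exons)
for transcript in isoforms:
    print(transcript)
```

With n internal exons this model yields up to 2^n transcripts, which gives a sense of how a single gene could encode many protein variants.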
Introns may also contain "old code", or sections of a gene that were once translated into a protein but have since become inactive. It was long assumed that the sequence of any given intron is junk DNA with no biological function. More recently, however, this assumption has been disputed.
Introns contain several short sequences that are required for efficient and accurate splicing by the spliceosome: the donor and acceptor sites at either end of the intron, and a branch point site.
"The notion of the cistron [...] must be replaced by that of a transcription unit containing regions which will be lost from the mature messenger - which I suggest we call introns (for intragenic regions) - alternating with regions which will be expressed - exons." (Gilbert 1978)
Four classes of introns are known to exist: nuclear (spliceosomal) introns, group I introns, group II introns and group III introns.
Some introns, such as the group I and group II introns, possess ribozyme activity once transcribed, enabling them to catalyze their own splicing out of a primary RNA transcript. These self-splicing introns are relatively rare compared to spliceosomal introns. Self-splicing activity was discovered by Thomas Cech, who shared the 1989 Nobel Prize in Chemistry with Sidney Altman for the discovery of the catalytic properties of RNA.
Nuclear or spliceosomal introns are spliced by the spliceosome, which includes a series of snRNAs (small nuclear RNAs). Certain splice signals (or consensus sequences) help the spliceosome identify and splice these introns.
Group II and group III introns are similar and have a conserved secondary structure. Their splicing uses a so-called lariat pathway. They perform functions similar to those of the spliceosome and may be evolutionarily related to it. Group I introns are the only class of introns whose splicing requires a free guanosine (guanine nucleoside). They possess a secondary structure different from that of group II and group III introns. Many self-splicing introns code for maturases that help with the splicing process; a maturase generally assists only the splicing of the intron that encodes it.
Two competing theories offer alternative scenarios for the origin and early evolution of spliceosomal introns. They are popularly called the Introns-Early (IE) and Introns-Late (IL) views. The origins of other classes of introns, such as self-splicing and tRNA introns, are subject to much less debate.
The IE model, championed by Walter Gilbert, proposes that introns are extremely old and were abundant in the earliest ancestors of prokaryotes and eukaryotes (the progenote). In this model, introns were subsequently lost from prokaryotic organisms, allowing them to attain growth efficiency. A central prediction of this theory is that the early introns were mediators that facilitated the recombination of exons representing protein domains. This model cannot account for some of the observed positional variation of introns shared among related genes.
The IL model proposes that introns were inserted more recently into originally intron-less, contiguous genes, after the divergence of eukaryotes and prokaryotes. In this model, introns probably originated from transposable elements. The model is based on the observation that spliceosomal introns are restricted to eukaryotes. However, there is considerable debate over the presence of introns in the early prokaryote-eukaryote ancestors and over the subsequent loss and gain of introns during eukaryotic evolution. The evolution of introns and of the intron-exon structure may be largely independent of the evolution of coding sequences.
Nearly all eukaryotic nuclear introns begin with the nucleotide sequence GU, and end with AG (the GU-AG rule). These, along with a larger consensus sequence, help direct the splicing machinery to the proper intronic donor and acceptor sites.
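The GU-AG rule is simple enough to check mechanically. The Python sketch below tests whether a candidate intron sequence begins with GU (GT in the DNA alphabet) and ends with AG. The function name `follows_gu_ag_rule` and the example sequences are hypothetical, and the check deliberately ignores the larger consensus sequence, so passing it does not show that a sequence is a functional intron.

```python
def follows_gu_ag_rule(intron: str) -> bool:
    """True if the sequence starts with GU (or GT) and ends with AG."""
    # Normalise case and convert a DNA sequence to the RNA alphabet.
    rna = intron.upper().replace("T", "U")
    # At least four nucleotides are needed to hold both dinucleotides.
    return len(rna) >= 4 and rna.startswith("GU") and rna.endswith("AG")


# Hypothetical candidates: an RNA intron, a DNA intron, and a non-conforming sequence.
candidates = ["GUAAGUUUUAG", "gtccccag", "AUAAGUUUUAC"]
results = [follows_gu_ag_rule(sequence) for sequence in candidates]
print(results)  # [True, True, False]
```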