Most reported inteins also contain an endonuclease domain that plays a role in intein propagation. In fact, many genes have unrelated intein-coding segments inserted at different positions. For these and other reasons, inteins (or more properly, the gene segments coding for inteins) are sometimes called selfish genetic elements but it may be more accurate to call them parasitic. The difference is that "selfish genes" are "selfish" only insofar as to compete with other genes or alleles, but still fulfill a beneficial function for the organism as a whole, whereas "parasitic genes" are functionless.
Intein-mediated protein splicing occurs after mRNA has been translated into a protein. This precursor protein contains three segments - an N-extein followed by the intein followed by a C-extein. After splicing has taken place, the result is also called an extein.
The first intein was discovered in 1987. Since then, inteins have been found in all three domains of life (eukaryotes, bacteria, and archaea). Knowledge regarding the evolutionary situation of inteins and related elements is reviewed in Gogarten & Hilario (2006). The mechanism for the splicing effect is a naturally occurring analogy to the technique for chemically generating medium-sized proteins called native chemical ligation, which was developed at the same time as inteins were discovered.
Pharmaceutical inhibition of intein excision may be a useful tool for drug development, the protein that contains the intein will not carry out its normal function if the intein does not excise since its structure will be disrupted.
It has been suggested that inteins could prove useful for achieving allotopic expression of certain highly hydrophobic proteins normally encoded by the mitochondrial genome, for example in gene therapy (de Grey 2000). The hydrophobicity of these proteins is an obstacle to their import into mitochondria. Therefore, the insertion of a non-hydrophobic intein may allow this import to proceed. Excision of the intein after import would then restore the protein to wild-type.
Normally, as in this example, just three letters suffice to specify the organism, but there are variations. For example, additional letters may be added to indicate a strain. If more than one intein is encoded in the corresponding gene, the inteins are given a numerical suffix starting from 5' to 3' or in order of their identification. For example, "Msm dnaB-1".
The segment of the gene that encodes the intein is usually given the same name as the intein, but to avoid confusion, the name of the intein proper is usually capitalized (e.g. Pfu RIR1-1), whereas the name of the corresponding gene segment is italicized.