A single nucleotide polymorphism (SNP, pronounced snip) is a DNA sequence variation occurring when a single nucleotide (A, T, C, or G) in the genome (or other shared sequence) differs between members of a species (or between paired chromosomes in an individual). For example, two sequenced DNA fragments from different individuals, AAGCCTA and AAGCTTA, contain a difference in a single nucleotide. In this case we say that there are two alleles: C and T. Almost all common SNPs have only two alleles.
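The comparison in the example above can be sketched in a few lines of Python. This is a minimal illustration only, assuming the two fragments are already aligned and of equal length; the function name `find_single_nucleotide_differences` is hypothetical and not taken from any library.

```python
def find_single_nucleotide_differences(seq_a, seq_b):
    """Return (position, base_a, base_b) for every mismatch between two
    aligned, equal-length sequences. Positions are 0-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# The two fragments from the example in the text.
differences = find_single_nucleotide_differences("AAGCCTA", "AAGCTTA")
print(differences)  # [(4, 'C', 'T')]
```

The single mismatch at 0-based position 4 corresponds to the two alleles, C and T.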
Within a population, SNPs can be assigned a minor allele frequency: the lowest allele frequency at a locus that is observed in a particular population.
For a SNP with two alleles, this is simply the lesser of the two allele frequencies.
Allele frequencies vary between human populations, so a SNP allele that is common in one geographical or ethnic group may be much rarer in another.
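As a rough sketch of the calculation, the minor allele frequency of a two-allele SNP can be computed from a list of observed alleles. The function name and the two sample populations below are hypothetical, invented purely for illustration, and the sketch assumes each observation is one allele at the same locus.

```python
from collections import Counter


def minor_allele_frequency(observed_alleles):
    """Frequency of the less common allele at a biallelic site."""
    counts = Counter(observed_alleles)
    if len(counts) > 2:
        raise ValueError("this sketch only handles sites with two alleles")
    if len(counts) < 2:
        return 0.0  # only one allele (or nothing) observed in this sample
    total = sum(counts.values())
    return min(counts.values()) / total


# Hypothetical allele observations in two populations (not real data).
population_a = ["C"] * 90 + ["T"] * 10
population_b = ["C"] * 55 + ["T"] * 45

print(minor_allele_frequency(population_a))
print(minor_allele_frequency(population_b))
```

In this made-up data the T allele is rare in the first population and nearly as common as C in the second, which is the kind of between-population difference described above.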
In the past, only variants with a minor allele frequency of greater than or equal to 1% (or 0.5%, etc.) were given the title "SNP", an unwieldy definition.
Some used "mutation" to refer to variations with low allele frequency.
With the advent of modern bioinformatics and a better understanding of evolution, this definition is no longer necessary; for example, a database such as dbSNP includes "SNPs" that have an allele frequency lower than one percent.
Types of SNPs
- Non-coding region
- Coding region
Single nucleotide polymorphisms may fall within coding sequences of genes, non-coding regions of genes, or in the intergenic regions between genes.
SNPs within a coding sequence will not necessarily change the amino acid sequence of the protein that is produced, due to degeneracy of the genetic code. A SNP in which both forms lead to the same polypeptide sequence is termed synonymous (sometimes called a silent mutation); if a different polypeptide sequence is produced, the SNP is nonsynonymous. A nonsynonymous change may be either missense or nonsense: a missense change results in a different amino acid, while a nonsense change results in a premature stop codon.
SNPs that are not in protein-coding regions may still have consequences for gene splicing, transcription factor binding, or the sequence of non-coding RNA.
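The synonymous, missense, and nonsense categories for coding-region SNPs can be illustrated with a short Python sketch. It assumes the standard genetic code, a reference codon that is not itself a stop codon, and codons written as DNA (T rather than U); the function name `classify_coding_snp` is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-base change between two codons.

    Assumes the reference codon is not a stop codon; '*' marks a stop.
    """
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three bases long")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"  # same amino acid, thanks to degeneracy
    if alt_aa == "*":
        return "nonsense"  # premature stop codon
    return "missense"  # different amino acid


print(classify_coding_snp("CTT", "CTC"))  # both codons encode leucine
print(classify_coding_snp("GAG", "GTG"))  # glutamate to valine
print(classify_coding_snp("TGG", "TGA"))  # tryptophan to stop
```

Working out the codon that contains a SNP also requires the gene's reading frame, which this sketch takes as given.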
Use and importance of SNPs
Variations in the DNA sequences of humans can affect how humans develop diseases and respond to pathogens and other agents. SNPs are also thought to be key enablers in realizing the concept of personalized medicine. However, their greatest importance in biomedical research is for comparing regions of the genome between cohorts (such as matched cohorts with and without a disease).
The study of single nucleotide polymorphisms is also important in crop and livestock breeding programs (see genotyping). See SNP genotyping for details on the various methods used to identify SNPs.
Example SNPs include rs6311 in the HTR2A gene.
A SNP in the F5 gene causes a hypercoagulability disorder with the variant Factor V Leiden.
An example of a triallelic SNP is rs3091244.
As with genes, there are bioinformatics databases for SNPs.
dbSNP is a SNP database from the National Center for Biotechnology Information.
Another is a wiki-style database from a private company.
A further database describes the association between polymorphisms and, e.g., diseases.
The nomenclature for SNPs can be confusing: several variations can exist for an individual SNP and consensus has not yet been achieved. One approach is to write SNPs with a prefix, period and greater than sign showing the wild-type and altered nucleotide or amino acid; for example, c.76A>T.
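A notation such as c.76A>T can be split into its parts with a regular expression. The following is a minimal sketch that assumes a single lowercase prefix letter and a nucleotide substitution only; it does not handle the amino acid form or any other variant of the nomenclature, and the function name `parse_substitution` is hypothetical.

```python
import re

# prefix letter, period, position, wild-type base, '>', altered base
SUBSTITUTION_PATTERN = re.compile(
    r"^(?P<prefix>[a-z])\.(?P<position>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_substitution(notation):
    """Split a notation like 'c.76A>T' into its components."""
    match = SUBSTITUTION_PATTERN.match(notation)
    if match is None:
        raise ValueError(f"unrecognised substitution notation: {notation!r}")
    return {
        "prefix": match.group("prefix"),
        "position": int(match.group("position")),
        "ref": match.group("ref"),  # wild-type nucleotide
        "alt": match.group("alt"),  # altered nucleotide
    }


print(parse_substitution("c.76A>T"))
```

For c.76A>T this yields the prefix c, position 76, wild-type nucleotide A, and altered nucleotide T.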