Gene synthesis was first demonstrated by Har Gobind Khorana in 1970 for a short artificial gene. Nowadays, commercial gene synthesis services are available from hundreds of companies worldwide, often at a price below $1 per base pair. Some have expressed concern that such services could be used to create new strains of existing viruses or bacteria, or to resurrect extinct hazardous organisms.
Gene synthesis is a method in molecular biology comprising the complete de novo production of structural genes by a combination of organic chemistry and molecular biology procedures, without the need for a biological template. It has become an important tool in many fields of recombinant DNA technology, including heterologous gene expression, vaccine development, gene therapy and molecular engineering. Synthesizing a nucleic acid sequence is frequently more economical than classical cloning and mutagenesis procedures.
While the ability to make increasingly long stretches of DNA efficiently and at lower prices is a technological driver of this field, attention is increasingly being focused on improving the design of genes for specific purposes. Early in the genome sequencing era, gene synthesis was used as an (expensive) source of cDNAs that were predicted from genomic or partial cDNA information but were difficult to clone. As higher-quality sources of sequence-verified cloned cDNA have become available, this practice has become less urgent.
However, producing large amounts of protein from gene sequences found in nature (or at least from their protein-coding regions, the open reading frames) can sometimes prove difficult. Many of the most interesting proteins sought by molecular biologists are normally regulated to be expressed in very low amounts in wild-type cells. Redesigning these genes offers a means to improve gene expression in many cases. Rewriting the open reading frame is possible because of the redundancy of the genetic code: up to about a third of the nucleotides in an open reading frame can be changed while still producing the same protein. The number of alternative designs available for a given protein is astronomical. For a typical protein sequence of 300 amino acids there are over 10^150 codon combinations that encode an identical protein. Optimization methods such as replacing rarely used codons with more common ones can have dramatic effects. Further optimizations, such as removing RNA secondary structures, can also be included. Computer programs written to perform these and other optimizations simultaneously are used to handle the enormous complexity of the task. A well-optimized gene can improve protein expression 2- to 10-fold, and in some cases improvements of more than 100-fold have been reported.
Because of the large numbers of nucleotide changes made to the original DNA sequence, the only practical way to create the newly designed genes is to use gene synthesis.
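The size of this design space can be estimated by multiplying, for every residue, the number of codons that encode it. The following Python sketch does this for the standard genetic code; the function names are illustrative and not taken from any particular gene design program, and the result depends on the amino acid composition of the protein examined.

```python
import math

# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; stop codons are not counted).
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2,
    "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def synonymous_designs(protein: str) -> int:
    """Exact number of coding sequences that encode the given protein."""
    return math.prod(CODON_DEGENERACY[aa] for aa in protein.upper())


def log10_synonymous_designs(protein: str) -> float:
    """Base-10 logarithm of the same count, convenient for long proteins."""
    return sum(math.log10(CODON_DEGENERACY[aa]) for aa in protein.upper())


if __name__ == "__main__":
    peptide = "MKLSR"  # hypothetical example peptide
    print(peptide, synonymous_designs(peptide))
```

Because the count is a product over all residues, it grows exponentially with protein length, which is why exhaustive enumeration of designs is not an option and optimization programs are used instead.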
Oligonucleotides are chemically synthesized from building blocks called phosphoramidites: nucleotide derivatives carrying protecting groups that prevent their amine, hydroxyl and phosphate groups from reacting incorrectly. One phosphoramidite is added at a time: the 5' hydroxyl group of the growing chain is deprotected, the next base is coupled to it, and so on, so that the chain grows in the 3' to 5' direction (backwards relative to biosynthesis). At the end, all protecting groups are removed. Nevertheless, because this is a chemical process, incorrect reactions occur and lead to some defective products. The longer the oligonucleotide being synthesized, the more defects accumulate, so the process is only practical for producing short sequences. HPLC can be used to isolate products with the proper sequence. By now, large numbers of oligos can also be synthesized in parallel on gene chips. For optimal performance in subsequent gene synthesis procedures, however, they should be prepared individually and at larger scale.
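The length limit can be illustrated with a simple model in which every coupling step succeeds with the same probability, so that the fraction of full-length chains is that probability raised to the number of couplings. The sketch below is in Python; the default coupling efficiency of 99% is an assumed value for illustration only and does not come from the text above.

```python
def full_length_fraction(length_nt: int, coupling_efficiency: float = 0.99) -> float:
    """Fraction of chains that reach full length.

    Assumes a constant per-step coupling efficiency (a simplification).
    An oligonucleotide of n nucleotides needs n - 1 coupling steps,
    because the first nucleotide is already attached to the support.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must lie between 0 and 1")
    return coupling_efficiency ** (length_nt - 1)


if __name__ == "__main__":
    for n in (20, 50, 100, 200):
        print(n, round(full_length_fraction(n), 3))
```

Under this model the full-length fraction falls geometrically with length, which is consistent with the observation that only short oligonucleotides can be made practically.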
Usually, a set of individually designed oligonucleotides is made on automated solid-phase synthesizers, purified, and then connected by specific annealing and standard ligation or polymerase reactions. To improve the specificity of oligonucleotide annealing, the synthesis step relies on thermostable DNA ligase and polymerase enzymes. To date, several methods for gene synthesis have been described, such as the ligation of phosphorylated overlapping oligonucleotides (1,2), the Fok I method (3) and a modified form of the ligase chain reaction. Additionally, several PCR assembly approaches have been described (4). They usually employ overlapping oligonucleotides 40-50 nt long. These oligonucleotides are designed to cover most of the sequence of both strands, and the full-length molecule is generated progressively by overlap extension (OE) PCR (4), thermodynamically balanced inside-out (TBIO) PCR (5) or combined approaches (6).
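How a target sequence might be divided into overlapping oligonucleotides on alternating strands can be sketched in Python as follows. This is a naive fixed-length tiling for illustration only: the function names and the default lengths (40 nt oligonucleotides with 20 nt overlaps) are assumptions, and the published methods cited above use their own design rules.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def tile_oligos(gene: str, oligo_len: int = 40, overlap: int = 20) -> list:
    """Split a gene into overlapping oligos on alternating strands.

    Even-numbered oligos are taken from the top strand, odd-numbered
    oligos are the reverse complement (bottom strand). Neighbouring
    oligos share `overlap` nucleotides of complementary sequence.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    for i, start in enumerate(range(0, len(gene), step)):
        fragment = gene[start:start + oligo_len]
        oligos.append(reverse_complement(fragment) if i % 2 else fragment)
        if start + oligo_len >= len(gene):
            break  # the end of the gene is covered
    return oligos


if __name__ == "__main__":
    # Hypothetical 100 nt target sequence, used only for demonstration.
    target = "ATGGCTAGCAAAGGAGAAGA" * 5
    for oligo in tile_oligos(target):
        print(len(oligo), oligo)
```

A fixed tiling like this ignores the melting temperature of each overlap and any secondary structure, which is one reason dedicated design programs are preferred in practice.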
Moreover, because the assembly of the full-length gene product relies on the efficient and specific alignment of long single-stranded oligonucleotides, critical parameters for synthesis success include extended sequence regions that form secondary structures caused by inverted repeats, extraordinarily high or low GC content, or repetitive structures. Usually such segments of a gene can only be synthesized by splitting the procedure into several consecutive steps and a final assembly of shorter sub-sequences, which in turn significantly increases the time and labor needed for production.
The result of a gene synthesis experiment depends strongly on the quality of the oligonucleotides used. For annealing-based gene synthesis protocols, the quality of the product depends directly and exponentially on the correctness of the oligonucleotides employed. If oligos of lower quality are used instead, more effort must be spent on downstream quality assurance during clone analysis, which is usually done by time-consuming standard cloning and sequencing procedures. Another problem associated with all current gene synthesis methods is the high frequency of sequence errors that results from the use of chemically synthesized oligonucleotides. The error frequency increases with oligonucleotide length, and as a consequence the percentage of correct product decreases dramatically as more oligonucleotides are used. The mutation problem could in principle be solved by assembling the gene from shorter oligonucleotides. However, all annealing-based assembly methods require the primers to be mixed together in one tube. In this case, shorter overlaps do not always allow precise and specific annealing of complementary primers, which inhibits the formation of full-length product. Manual design of oligonucleotides is a laborious procedure and does not guarantee the successful synthesis of the desired gene.
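The exponential dependence of product quality on oligonucleotide correctness can be made concrete with a simple model in which every base carries the same independent probability of being wrong. The Python sketch below estimates the fraction of error-free clones and the number of clones that would have to be sequenced to find a correct one; the model and the function names are assumptions chosen for illustration, not a description of any specific protocol.

```python
import math


def error_free_fraction(gene_length_bp: int, per_base_error_rate: float) -> float:
    """Probability that an assembled gene contains no error at all,
    assuming independent errors with the same rate at every base."""
    return (1.0 - per_base_error_rate) ** gene_length_bp


def clones_to_screen(gene_length_bp: int, per_base_error_rate: float,
                     confidence: float = 0.95) -> int:
    """Smallest number of clones giving at least `confidence` probability
    that one or more of them is error-free."""
    p = error_free_fraction(gene_length_bp, per_base_error_rate)
    if p >= 1.0:
        return 1
    if p <= 0.0:
        raise ValueError("no error-free clone can be expected")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    # Hypothetical error rates, chosen only to show the trend.
    for rate in (0.0005, 0.001, 0.005):
        print(rate, round(error_free_fraction(1000, rate), 4),
              clones_to_screen(1000, rate))
```

Even a modest change in the per-base error rate shifts the screening effort considerably for longer genes, which is why oligonucleotide quality dominates the cost of clone analysis.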
For optimal performance of almost all annealing-based methods, the melting temperatures of the overlapping regions should be similar for all oligonucleotides. The necessary primer optimization should be performed with specialized oligonucleotide design programs. Several solutions for automated primer design for gene synthesis have been presented so far (7, 8).
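A quick way to check whether a set of overlap regions has similar melting temperatures is to apply a rule-of-thumb formula based on length and GC content. The Python sketch below uses two widely quoted approximations (the Wallace rule for very short sequences and a basic GC-content formula for longer ones). These are rough, salt-independent estimates chosen here for illustration; dedicated design programs may use more detailed thermodynamic models.

```python
def basic_tm(oligo: str) -> float:
    """Rough melting temperature in degrees Celsius.

    Shorter than 14 nt: Wallace rule, 2*(A+T) + 4*(G+C).
    14 nt or longer:    64.9 + 41*(G+C - 16.4) / N.
    """
    seq = oligo.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    if n < 14:
        return 2.0 * (n - gc) + 4.0 * gc
    return 64.9 + 41.0 * (gc - 16.4) / n


def tm_spread(overlaps) -> float:
    """Difference between the highest and lowest estimated Tm."""
    tms = [basic_tm(o) for o in overlaps]
    return max(tms) - min(tms)


if __name__ == "__main__":
    # Hypothetical 20 nt overlap regions.
    regions = ["ATGGCTAGCAAAGGAGAAGA", "GCGCGGCCGCTTGCAGGCCG", "ATTATAAATTTAACATTATA"]
    for r in regions:
        print(r, round(basic_tm(r), 1))
    print("spread:", round(tm_spread(regions), 1))
```

A large spread suggests that overlap boundaries should be moved or overlap lengths varied until the estimates converge.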
To overcome problems associated with oligonucleotide quality, several elaborate strategies have been developed, employing either separately prepared fishing oligonucleotides (9), mismatch-binding enzymes of the MutS family (10) or specific endonucleases from bacteria or phages (11). Nevertheless, all these strategies increase the time and cost of gene synthesis based on the annealing of chemically synthesized oligonucleotides.
In contrast to conventional gene synthesis methods, Slonomics™ follows a different technological concept that completely eliminates the need for individually synthesised single-stranded oligonucleotides. Regardless of the gene being produced, the Slonomics™ process is based exclusively on a universal set of highly standardised raw materials (building blocks) and a series of recursive and standardised reaction steps (pipetting, mixing, incubation, washing), which can be processed in parallel for many gene constructs at a time. This enables the complete transfer of every working step to an industrial-style robotic platform where gene synthesis is performed in 96-well microtiter plates.
The Slonomics™ technology utilises an indexed library of universal double-stranded building blocks (called ‘anchors’ and ‘splinkers’). These pre-assembled oligonucleotides, which form a specific hairpin-like secondary structure, are processed with DNA-modifying enzymes such as restriction endonucleases or ligases. The building blocks are first used to generate small sub-fragments, which are then assembled into the full-length gene construct of choice. ‘Anchor’ molecules can be bound to a streptavidin-coated surface through a biotin residue, which is attached to the loop region of the molecule via a flexible spacer. The double-stranded stem region is divided into two sections. The constant sequence region assures the stability of the whole molecule and contains a recognition site for a type IIS outside cutter enzyme. It also serves as a carrier structure for the variable sequence part that is essential for building up new gene molecules. To allow for the synthesis of any desired gene sequence, the required number of different anchors is determined by the 4^6 possible sequence combinations of a stretch of six nucleotides. Consequently, 4096 individual anchor molecules are stored in the building block library.
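The size of the anchor library follows directly from enumerating every six-nucleotide sequence. The short Python sketch below builds such an enumeration; the numbering it assigns to each anchor is a hypothetical indexing scheme for illustration and is not the one used by the actual library.

```python
from itertools import product


def build_anchor_index(variable_length: int = 6) -> dict:
    """Map every variable sequence of the given length to an integer index.

    The index order (alphabetical over A, C, G, T) is an assumption made
    for this example.
    """
    return {
        "".join(bases): index
        for index, bases in enumerate(product("ACGT", repeat=variable_length))
    }


if __name__ == "__main__":
    anchors = build_anchor_index()
    print(len(anchors))        # number of distinct anchors
    print(anchors["GATTAC"])   # position of one hexamer in this example index
```

With four possible bases at each of six positions, the enumeration contains 4^6 = 4096 entries, matching the number of anchors stated above.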
‘Splinkers’ are the soluble counterparts onto which the anchors eventually transfer their variable sequences. They also contain a recognition site for a type IIS outside cutter, which differs from that of the anchors. Providing only a variable three-base single-stranded overhang, one of the 64 possible splinker molecules always serves as the nucleus for a growing DNA chain when the Slonomics™ process is initiated.
The synthesis process can be divided into two phases. During the initial ‘elongation’ phase, short sub-sequences of the target molecule are produced in parallel through repetitive reaction cycles of ligation and restriction. In this process, the splinkers are elongated in steps of three base pairs. After five reaction cycles, an ‘elongation block’ (E-block) with 15 independently definable base pairs is obtained. Since many of these reactions can be performed in parallel, the elongation phase already yields the entire target sequence in the form of a series of short sub-fragments. In the subsequent ‘transposition step’ the individual elongation blocks are assembled to form the desired gene construct, again through repeated reaction cycles of ligation and restriction. By cutting one E-block on the anchor side and another on the splinker side, the two truncated fragments can be combined and linked to each other on the solid phase. Since the resulting molecule still contains the constant anchor and splinker regions at its respective ends, the reaction cycle can easily be repeated for several rounds, each time yielding DNA molecules that have doubled in length. At different transposition levels, the resulting constructs can be harvested from the automated production platform and fed into a standardised downstream process for the final assembly and quality control of the synthetic gene construct. The Slonomics™ process offers several advantages over other methods for generating customised gene constructs.
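The arithmetic of the two phases can be sketched in Python. The example below counts how many 15 bp elongation blocks a target of a given length needs and how many doubling rounds would join them. It makes the simplifying assumption that the whole construct is built by pairwise doubling alone, whereas the text notes that constructs can be harvested at different transposition levels for final assembly; the function and key names are illustrative.

```python
BP_PER_ELONGATION_CYCLE = 3   # splinkers grow by three base pairs per cycle
CYCLES_PER_BLOCK = 5          # five cycles give one elongation block
BLOCK_BP = BP_PER_ELONGATION_CYCLE * CYCLES_PER_BLOCK  # 15 bp per block


def synthesis_plan(gene_length_bp: int) -> dict:
    """Idealised count of elongation blocks and transposition rounds."""
    if gene_length_bp < 1:
        raise ValueError("gene_length_bp must be at least 1")
    # Ceiling division: a partially filled last block still needs a block.
    blocks = -(-gene_length_bp // BLOCK_BP)
    # Each transposition round doubles the length, so the number of rounds
    # is the ceiling of log2(blocks), computed here with integer arithmetic.
    rounds = (blocks - 1).bit_length()
    return {
        "elongation_blocks": blocks,
        "elongation_cycles_per_block": CYCLES_PER_BLOCK,
        "transposition_rounds": rounds,
    }


if __name__ == "__main__":
    for length in (15, 120, 960):
        print(length, synthesis_plan(length))
```

Because each transposition round doubles the fragment length, the number of rounds grows only logarithmically with gene length, while the number of elongation blocks grows linearly and is handled in parallel.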
In comparison to the often fluctuating quality of long single-stranded oligonucleotides, all universal building blocks in the Slonomics™ process are of consistently high quality. This is mainly due to their short length of 31 nucleotides and their overall identical structure, to which the applied phosphoramidite synthesis procedure can be optimally adjusted. The library oligonucleotides are produced all at once at large scale and, after passing a thorough quality test, are stored in a freezer. Since each oligo is used up completely in synthesis reactions, no precious raw material is wasted. This approach not only offers significant potential for cost reduction, it also avoids the common problems caused by single-stranded oligos, which are prone to forming alternative secondary structures. The concept of building genes entirely from double-stranded materials significantly facilitates the introduction of previously unmanageable gene sequences, such as GC-rich or highly repetitive motifs, into experimental schemes.
(2) Fuhrmann M, Oertel W, Hegemann P. A synthetic gene coding for the green fluorescent protein (GFP) is a versatile reporter in Chlamydomonas reinhardtii. Plant J. 1999 Aug;19(3):353-61.
(3) Mandecki W, Bolling TJ. FokI method of gene synthesis. Gene. 1988;68:101-107.
(4) Stemmer WP, Crameri A, Ha KD, Brennan TM, Heyneker HL. Single-step assembly of a gene and entire plasmid from large numbers of oligodeoxyribonucleotides. Gene. 1995;164:49-53.
(5) Gao X, Yo P, Keith A, Ragan TJ, Harris TK. Thermodynamically balanced inside-out (TBIO) PCR-based gene synthesis: a novel method of primer design for high-fidelity assembly of longer gene sequences. Nucleic Acids Res. 2003 Nov 15;31(22):e143.
(6) Young L, Dong Q. Two-step total gene synthesis method. Nucleic Acids Res. 2004 Apr 15;32(7):e59.
(7) Hoover DM, Lubkowski J. DNAWorks: an automated method for designing oligonucleotides for PCR-based gene synthesis. Nucleic Acids Res. 2002;30:e43.
(8) Villalobos A, Ness JE, Gustafsson C, Minshull J, Govindarajan S. Gene Designer: a synthetic biology tool for constructing artificial DNA segments. BMC Bioinformatics. 2006 Jun 6;7:285.
(9) Tian J, Gong H, Sheng N, Zhou X, Gulari E, Gao X, Church G. Accurate multiplex gene synthesis from programmable DNA microchips. Nature. 2004 Dec 23;432(7020):1050-4.
(10) Carr PA, Park JS, Lee YJ, Yu T, Zhang S, Jacobson JM. Protein-mediated error correction for de novo DNA synthesis. Nucleic Acids Res. 2004 Nov 23;32(20):e162.
(11) Fuhrmann M, Oertel W, Berthold P, Hegemann P. Removal of mismatched bases from synthetic genes by enzymatic mismatch cleavage. Nucleic Acids Res. 2005 Mar 30;33(6):e58.