Nucleic Acid Sequencing


The radical changes now under way in virtually every field of modern microbiology are driven largely by the practical implementation of new technologies of molecular genetic analysis, above all nucleic acid sequencing.

Nucleic acid sequencing provides definitive identification of microbial nucleic acids and thereby reveals the causative agents present in clinical samples.

Sequencing methods determine the exact order of nucleotides in nucleic acid chains. This clarifies the organization of genes within the microbial genome and makes it possible to deduce the structure of the corresponding gene products.

Currently available technologies of nucleic acid sequencing have made tremendous progress in efficiency.

The group of so-called “first generation methods” comprises two classical techniques.

Maxam-Gilbert DNA sequencing is based on treating the studied DNA with several chemicals that cleave the DNA molecule at the positions of certain nucleotides (C, T+C, G, and G+A). This produces four sets of DNA fragments of various lengths, each ending at a specific nucleotide.

The Sanger (or dideoxy chain termination) method uses four types of fluorescently labeled dideoxynucleotides (ddNTPs) that terminate DNA synthesis by DNA polymerase at the position of a particular nucleotide (A, T, G, or C).

When the complementary DNA strand is synthesized, incorporation of a ddNTP stops DNA elongation at the position of the corresponding nucleotide. As a result, four sets of DNA fragments of various lengths are created, each with a specific ddNTP at its end (similar to the Maxam-Gilbert method).

After that, in both methods the four sets of newly produced DNA fragments, labeled with fluorescent or radioactive tags, undergo gel electrophoresis. The four mixtures of DNA fragments, each bearing a specific terminal nucleotide (A, T, G, or C), are run in four parallel lanes and separated according to fragment length.

Finally, comparing the positions of the labeled DNA fragments across the four parallel gel lanes makes it possible to assemble the primary sequence of the investigated DNA.
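Reading the gel amounts to sorting the fragments by length and noting which lane each length came from. A minimal Python sketch of this idea follows. It assumes an idealized gel with one lane per terminal base and exactly one fragment of each length; the function name read_gel and the example lengths are hypothetical. The Maxam-Gilbert lanes (C, T+C, G, G+A) would need an extra step to tell T from C and A from G.

```python
def read_gel(lanes):
    """Reconstruct a sequence from the fragment lengths seen in four lanes.

    lanes maps each terminal base to the list of fragment lengths
    observed in the lane for that base (idealized, error-free gel).
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            if length in base_at_length:
                raise ValueError(f"length {length} appears in two lanes")
            base_at_length[length] = base
    # Shorter fragments migrate farther, so reading from the shortest
    # fragment to the longest gives the bases in order of synthesis.
    return "".join(base_at_length[n] for n in sorted(base_at_length))


# Hypothetical lane contents: fragment lengths grouped by terminal base.
lanes = {"A": [1, 5], "C": [2], "G": [3, 6], "T": [4]}
sequence = read_gel(lanes)
print(sequence)
```

In the Sanger method the sequence read this way is that of the newly synthesized strand, which is complementary to the template.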

The more convenient Sanger sequencing method was widely used in practical genetics for a long time. Its capillary version was applied for the first sequencing of the full human genome in 2001.

However, the expanding efforts in whole-genome sequencing of a vast number of prokaryotic and eukaryotic genomes required the development of new high-throughput methods of massively parallel DNA sequencing. These were eventually termed next-generation sequencing (NGS) or second generation sequencing methods.

There is an impressive variety of NGS methods, which differ greatly in their chemistry and miniaturized technical platforms. They are implemented as automated DNA sequencers. Most of them use fluorescent labels and register fluorescent signals during the sequencing process.

NGS methods comprise the following steps.

First, a large genetic library containing multiple copies of fragments of the investigated DNA is created by PCR on a solid-phase support or within the droplets of a water-in-oil emulsion.

After the generated DNA copies are dissociated into single-stranded molecules, new double-stranded DNA molecules are synthesized (e.g., by PCR or ligase chain reaction).

Synthesis of the complementary DNA strand proceeds through the consecutive attachment and incorporation of a complementary nucleotide or probe into the growing DNA strand. Each attachment event generates a fluorescent signal specific for the label of the nucleotide type involved (A, T, G, or C). These signals are registered by sensitive fluorescence detectors, and their order corresponds to the primary sequence of the investigated nucleic acid.
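The correspondence between the order of signals and the sequence can be illustrated with a small Python simulation. This is only a sketch under stated assumptions: the dye colors assigned to each base are hypothetical, every incorporation is assumed to produce exactly one clean signal, and the names emit_signals and call_bases are invented for the example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hypothetical dye assignment, chosen only for this illustration.
DYE_OF_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_OF_DYE = {dye: base for base, dye in DYE_OF_BASE.items()}


def emit_signals(template_5to3):
    """Return the ordered signals produced while copying a template."""
    signals = []
    # The polymerase reads the template from its 3' end to its 5' end,
    # so the template (written 5'->3') is traversed in reverse.
    for template_base in reversed(template_5to3):
        incorporated = COMPLEMENT[template_base]
        signals.append(DYE_OF_BASE[incorporated])
    return signals


def call_bases(signals):
    """Translate the recorded signals back into the synthesized strand."""
    return "".join(BASE_OF_DYE[signal] for signal in signals)


signals = emit_signals("ATGC")
new_strand = call_bases(signals)
print(signals, new_strand)
```

The decoded strand is the reverse complement of the template, which is why the order of signals is enough to recover the template sequence.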

NGS methods can analyze up to several billion overlapping fragments of the sequenced DNA (known as reads) per run.

The reads can be of various lengths, from 50 base pairs (bp) to 400-700 bp and even more. Every region of the sequenced DNA is read repeatedly, from 8-10 times (known as deep sequencing) to more than 100 times per run (the ultra-deep sequencing mode offered by the most advanced sequencing methods).
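The number of times a position is sequenced is commonly summarized as the mean coverage depth: the total number of sequenced bases divided by the length of the genome. A brief Python sketch of this arithmetic follows; the function name mean_coverage and the example figures (one million 150-bp reads, a 5-million-bp genome) are illustrative assumptions, not values from any particular platform.

```python
def mean_coverage(num_reads, read_length, genome_length):
    """Average number of times each genome position is covered by reads."""
    return num_reads * read_length / genome_length


# Hypothetical run: 1,000,000 reads of 150 bp over a 5,000,000-bp genome.
depth = mean_coverage(1_000_000, 150, 5_000_000)
print(f"mean coverage: {depth:.0f}x")
```

This is an average only; the depth at individual positions varies around it.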

All deep sequencing methods generate enormous amounts of primary data. These data are further analyzed by bioinformatics methods using highly sophisticated algorithms. Computer analysis aligns the read sequences and constructs the most probable sequence of the investigated nucleic acid.
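Once reads are aligned, the most probable base at each position can be chosen by a simple majority vote across the reads that cover it. The Python sketch below illustrates only this last step. It assumes the alignment offsets are already known and ignores insertions, deletions, and base quality scores, all of which real pipelines take into account; the function name consensus and the example reads are hypothetical.

```python
from collections import Counter


def consensus(aligned_reads, length):
    """Majority-vote consensus from reads placed at known offsets.

    aligned_reads is a list of (offset, read) pairs; positions that no
    read covers are reported as "N".
    """
    columns = [Counter() for _ in range(length)]
    for offset, read in aligned_reads:
        for i, base in enumerate(read):
            columns[offset + i][base] += 1
    return "".join(
        column.most_common(1)[0][0] if column else "N" for column in columns
    )


# The last read carries one sequencing error (T instead of A at position 4).
reads = [(0, "ACGTAC"), (2, "GTACGG"), (4, "ACGGTT"), (2, "GTTCGG")]
result = consensus(reads, 10)
print(result)
```

Because position 4 is covered by three correct reads and one erroneous read, the error is outvoted, which is the practical benefit of deep coverage.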

As a result, full-genome analysis of microbial DNA isolated from a clinical sample takes from several hours to a few days.

A great number of second generation sequencing methods are now in active use, making genomic analysis fast and low-cost (pyrosequencing, Illumina and SOLiD platforms, and many others).

Currently emerging NGS technologies rely on the sequencing of single DNA molecules. They are sometimes termed “third generation” methods.

For instance, single molecule real time (SMRT) sequencing uses a microchip with thousands of nanoholes (waveguides), each with a volume of about 20 zeptoliters (2 × 10⁻²⁰ liters). Every such well contains one molecule of the single-stranded DNA under analysis, one molecule of DNA polymerase, and all 4 types of fluorescently tagged nucleotides. Here every act of DNA strand elongation produces a new fluorescent signal that is registered by the detector, accumulating the information needed to reconstruct the primary DNA sequence.

SMRT technology made it possible to analyze considerably longer DNA reads (10,000-15,000 bp).

Likewise, if it is necessary to determine the sequence of long genomes, the procedure known as the shotgun technique can be applied. In this method the DNA of interest is broken into random, smaller, overlapping fragments, producing a random fragment library. The fragments are then processed by automated DNA sequencers, which decipher their nucleotide sequences. These overlapping fragments are subsequently placed in the correct order by computational methods of bioinformatics, yielding the whole primary DNA sequence.
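The ordering of overlapping fragments can be illustrated with a toy greedy procedure that repeatedly merges the two fragments sharing the longest suffix-prefix overlap. The Python sketch below is a simplified illustration, not the algorithm of any specific assembler: it assumes error-free fragments from a single strand with no repeats, whereas real assemblers rely on overlap or de Bruijn graphs and must handle sequencing errors, both strands, and repeated regions. The names overlap and greedy_assemble and the example fragments are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments pairwise, always taking the longest overlap first."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [
            c for k, c in enumerate(contigs) if k not in (best_i, best_j)
        ] + [merged]
    return contigs


# Three overlapping fragments of a hypothetical 16-base sequence.
fragments = ["ATGCGTAC", "GTACCTGA", "CTGATTAG"]
contigs = greedy_assemble(fragments)
print(contigs)
```

When the fragments do not overlap sufficiently, the function returns several separate contigs instead of one sequence, which mirrors the gaps left by incomplete coverage.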

The “next-gen” sequencing technologies have revolutionized the practice of modern microbiological laboratories. They have created opportunities for microbial transcriptome analysis and for investigating individual variation in large microbial communities (microbiomes), epigenetic regulation of microbial genomes, and evolutionary relationships between microbial taxa.