Skip to main content
Figure 1 | Microbial Informatics and Experimentation

Figure 1

From: Deep sequencing of evolving pathogen populations: applications, errors, and bioinformatic solutions

Figure 1

Flowchart detailing pipeline steps required for deep sequencing projects. After extracting genomic material, PCR amplification may be required prior to library preparation. For sequencing of a target region (‘amplicon sequencing’), multiple, ‘nested’ PCR rounds may be performed. Sequencing adapters and primers may be included in the primer for the final round, or may be annealed to the ends of fragments after amplification. For whole genome sequencing, multiple, overlapping PCR products are randomly sheared before annealing of sequencing adapters and primers. Alternatively, if sufficient genomic material is available, shearing and annealing may be performed directly without PCR amplification. If sequencing RNA, RT must be performed before library preparation. For amplicon sequencing, this may take the form of an initial RT-PCR. Choice of sequencing technology is dependent on the project’s aims: for instance, the longer reads of Roche-454 may be more appropriate for reconstructing haplotypes, while the high data volume afforded by Illumina is more suitable for detecting very low frequency SNVs. After sequencing, reads must be aligned, either via multiple sequence alignment or to a reference. Choice of reference is critical; if available, a published reference or references may be used; alternatively, a consensus sequence may be used, generated through de novo assembly, or by alignment to a published reference followed by replacement of fixed variants, or by Sanger sequencing the same sample as submitted for deep sequencing. Following alignment, a number of bioinformatic tools are available for SNV calling, haplotype reconstruction, and downstream analysis.

Back to article page