Paired-End Sequencing: Principle, Steps, Uses

Paired-end sequencing is a next-generation sequencing method that sequences a DNA fragment from both ends. This method produces two reads for each fragment, which improves the accuracy of read alignment and downstream analysis.

There are two types of sequencing reads: single-read and paired-end read. While single-read sequencing reads from only one end of a DNA fragment, paired-end sequencing reads from both ends. Single-read sequencing is cost-effective and fast; however, it cannot reliably resolve longer DNA fragments and repetitive sequences.

Paired-end sequencing solves these problems by reading both ends of a DNA sequence. It provides higher sequence coverage and better accuracy when aligning the reads to the genome, so it can be used to accurately detect structural variants in the genome. It also gives better results when studying complex and repetitive genomic regions. Paired-end sequencing is most commonly performed on Illumina sequencing platforms.

Principle of Paired-End Sequencing

The principle of paired-end sequencing involves sequencing DNA fragments from both ends in opposite directions. This generates paired-end reads that offer more context than single-read sequencing. Paired-end sequencing uses the principle of sequencing by synthesis (SBS), in which each nucleotide is detected through its fluorescent tag as it is incorporated.

Paired-end sequencing starts by preparing a DNA library with different binding sites for sequencing primers at both ends. Unique barcodes or index sequences are also added to both ends of the fragments. After library preparation, identical clusters of DNA fragments are generated on the sequencing flow cell using the bridge amplification process. 

Sequencing is performed in both directions using the sequencing by synthesis method, in which fluorescently labeled nucleotides are incorporated and detected base by base. Sequencing begins by reading the forward strand of the DNA, which produces the forward reads. The index sequences are then read to identify which sample or library each read belongs to. After the forward strands are sequenced, the process is repeated for the opposite end: the template is flipped by repeating the bridge amplification step, and the reverse strand is sequenced in the same way. Finally, the forward and reverse reads are processed and aligned to a reference genome for data analysis.
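
The relationship between the two reads of a pair can be sketched in a few lines of Python. This is an illustration only, not a model of the sequencing chemistry: the fragment sequence, the read length, and the function names are made up for the example. It assumes the forward read covers the start of the fragment and the reverse read is the reverse complement of the fragment's other end.

```python
from typing import Tuple

# Complement table for the four bases plus N (unknown base).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_paired_reads(fragment: str, read_length: int) -> Tuple[str, str]:
    """Derive an idealized (error-free) forward and reverse read from one fragment."""
    if read_length > len(fragment):
        raise ValueError("read_length cannot exceed the fragment length")
    # Forward read: the first bases of the fragment as written.
    forward_read = fragment[:read_length]
    # Reverse read: read from the opposite end, on the complementary strand.
    reverse_read = reverse_complement(fragment[-read_length:])
    return forward_read, reverse_read


# Hypothetical 20-base fragment, far shorter than a real library fragment.
fragment = "ATGCGTACCTGAAGCTTGCA"
forward_read, reverse_read = simulate_paired_reads(fragment, read_length=6)
print(forward_read)   # ATGCGT
print(reverse_read)   # TGCAAG
```

The unsequenced middle of the fragment is what gives a read pair its extra context: the two reads are known to lie a roughly known distance apart and to point toward each other.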

[Figure: Paired-End Sequencing]

Types of Paired-End Libraries

Different types of paired-end libraries are used for different sequencing needs. Three main types of paired-end libraries are:

  1. Simple Paired-End Library: This is the simplest type of paired-end sequencing library which involves fragmenting genomic DNA into smaller pieces and ligating adapter sequences to both ends of the fragments. These fragments are amplified and prepared for sequencing. It usually consists of shorter DNA fragments.
  2. Mate-Pair Library: These libraries are prepared for sequencing the ends of longer DNA fragments, which is particularly useful for studying large genomic regions. In this process, the ends of long DNA fragments are biotinylated and the fragments are circularized to create a loop of DNA. These circularized DNA fragments are then broken into smaller pieces and enriched to select only biotin-tagged fragments. The resulting fragments are sequenced to generate mate pairs, which are useful for large-scale genomic projects.
  3. Paired-End Tag (PET) Library: This involves using specific restriction enzymes to generate short DNA tags corresponding to each end of the DNA fragment. These libraries are useful for studying specific genomic regions or for applications such as chromatin immunoprecipitation sequencing (ChIP-Seq) and other targeted sequencing studies. 

Process/Steps of Paired-End Sequencing

1. DNA/RNA Extraction and Library Preparation

  • This step involves preparing the DNA or RNA of interest for sequencing. 
  • Library preparation for paired-end sequencing starts by fragmenting purified genomic DNA or RNA into small fragments and ligating different adapter sequences to both ends of the fragments. 
  • These adapters help the fragments bind to the flow cell in the sequencing device and also have primer binding sites. 
  • Index sequences (barcodes) are also added to these fragments so that reads from several pooled samples can be assigned to their sample of origin during data analysis. 
  • The adapter-ligated fragments are purified and size-selected using gel electrophoresis. 
  • Then these fragments are denatured to single-stranded DNA and loaded onto the flow cell in the sequencing platform.
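
Index sequences are typically used to pool several libraries in one run and separate the reads afterwards, a step often called demultiplexing. The sketch below shows the idea under simplifying assumptions: the index sequences, sample names, and function names are hypothetical, each read carries a single index, and one mismatch is tolerated. Production demultiplexing software handles dual indexes and quality filtering, which are not shown here.

```python
from collections import defaultdict

# Hypothetical sample sheet; real index sequences come from the library prep kit.
SAMPLE_SHEET = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}


def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, sample_sheet, max_mismatches=1):
    """Return the sample whose index matches, or 'undetermined' if none or several do."""
    matches = [
        sample
        for index, sample in sample_sheet.items()
        if len(index) == len(index_read)
        and hamming(index, index_read) <= max_mismatches
    ]
    return matches[0] if len(matches) == 1 else "undetermined"


def demultiplex(reads, sample_sheet):
    """Group read names by the sample their index read points to."""
    bins = defaultdict(list)
    for name, index_read in reads:
        bins[assign_sample(index_read, sample_sheet)].append(name)
    return dict(bins)


# (read name, index read) tuples; read3 has one sequencing error in its index.
reads = [
    ("read1", "ACGTAC"),
    ("read2", "TGCATG"),
    ("read3", "ACGTAA"),
    ("read4", "GGGGGG"),
]
bins = demultiplex(reads, SAMPLE_SHEET)
for sample, names in bins.items():
    print(sample, names)
```

Tolerating a mismatch only works when the indexes in a pool differ from each other at enough positions, which is why an ambiguous match is reported as undetermined rather than guessed.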

2. Cluster Generation

  • The sequencing flow cell contains lanes with two types of oligonucleotide sequences used for hybridization.
  • First, the single-stranded DNA fragments are hybridized to the first type of oligonucleotides on the flow cell. Then, complementary strands are synthesized by polymerase and the original template strand is removed.
  • The remaining strands are amplified clonally using bridge amplification. During bridge amplification, the DNA fragments bend and hybridize to the second type of oligos on the flow cell. This forms a bridge. 
  • These strands are amplified to create double-stranded bridges which are then denatured to produce two single-stranded DNA molecules that are attached to the flow cell. This cycle is repeated to generate millions of identical DNA clusters. 
  • These clusters contain a mixture of forward and reverse strands. Reverse strands are removed from the flow cell and only forward strands are kept for sequencing.

3. Sequencing

  • The sequencing process begins by adding the primer to the forward strands. Then, fluorescently labeled nucleotide bases are added one at a time and the addition is detected with the emission of fluorescent signals. 
  • After sequencing the forward strands, the index sequences are also sequenced to identify the origin of each read. 
  • Then the forward strand on the flow cell is used to regenerate the reverse strand. The same templates are flipped into the opposite orientation by repeating the bridge amplification process. In this round, the forward strand is removed and the reverse strand is left for sequencing. Primer for the reverse strand is added and reverse strands are sequenced like the forward strand. 
  • After sequencing both ends, paired-end reads are generated. 

4. Data Analysis

  • Then, the sequencing data is processed and analyzed. Paired-end reads contain sequence information for both ends of each DNA fragment. 
  • Two FASTQ files are generated from the data. The first contains the forward reads and the second contains the reverse reads, each with per-base quality scores. The two files list the reads in the same order so that the mates of each pair can be matched. 
  • These paired-end reads are aligned as pairs to a reference genome, or assembled de novo into contiguous sequences (contigs) for organisms without a reference. 
  • The data is analyzed to find genetic variations and extract detailed biological information.
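
As a rough illustration of the first processing step, the following Python sketch reads a forward and a reverse FASTQ stream in lockstep and checks that each record pair belongs to the same fragment. It assumes plain four-line FASTQ records and read names that either match exactly or differ only by a trailing `/1` and `/2`; the function names and the two tiny in-memory files are invented for the example. Established parsers should be preferred for real data.

```python
import io
from itertools import zip_longest


def parse_fastq(handle):
    """Yield (name, sequence, quality) tuples from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip()
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        # Keep only the read identifier (text before the first space).
        yield header[1:].split()[0], sequence, quality


def base_name(name):
    """Strip a trailing /1 or /2 mate suffix, if present."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def iter_pairs(r1_handle, r2_handle):
    """Yield (forward_record, reverse_record) for reads stored in the same order."""
    records = zip_longest(parse_fastq(r1_handle), parse_fastq(r2_handle))
    for rec1, rec2 in records:
        if rec1 is None or rec2 is None:
            raise ValueError("the two files contain different numbers of reads")
        if base_name(rec1[0]) != base_name(rec2[0]):
            raise ValueError(f"read names do not match: {rec1[0]} vs {rec2[0]}")
        yield rec1, rec2


# Two tiny in-memory files standing in for the forward and reverse FASTQ files.
r1_text = "@frag1/1\nATGCGT\n+\nIIIIII\n@frag2/1\nGGCATT\n+\nIIIIHH\n"
r2_text = "@frag1/2\nTGCAAG\n+\nIIIIII\n@frag2/2\nCCTAGA\n+\nHHIIII\n"

pairs = list(iter_pairs(io.StringIO(r1_text), io.StringIO(r2_text)))
for forward, reverse in pairs:
    print(base_name(forward[0]), forward[1], reverse[1])
```

Checking names while iterating catches the common failure where one file has been filtered or sorted independently of the other, which would silently pair unrelated reads.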

Advantages of Paired-End Sequencing

  • Paired-end sequencing provides more accurate results because it yields more sequence coverage from each fragment.
  • It provides better alignment, especially in complex and repetitive regions. 
  • It helps detect structural variations and changes in the DNA like insertions, deletions, mutations, or rearrangements.
  • Paired-end sequencing produces reads from both ends which improves the quality of alignment.
  • It generates more data by providing two reads for every DNA fragment.
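
The effect on coverage can be estimated with the usual approximation, mean coverage = number of reads × read length / genome size. The short Python sketch below applies it to single-read and paired-end runs. The fragment count, read length, and genome size are illustrative numbers, and the calculation assumes that the two reads of a pair do not overlap; on short fragments the mates overlap and the gain is smaller.

```python
def mean_coverage(num_fragments, read_length, genome_size, paired=True):
    """Approximate mean depth of coverage as total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    # A paired-end run yields two reads per fragment, a single-read run yields one.
    reads_per_fragment = 2 if paired else 1
    total_bases = num_fragments * reads_per_fragment * read_length
    return total_bases / genome_size


# Illustrative values: one million fragments, 150-base reads, 5 Mb genome.
single = mean_coverage(1_000_000, 150, 5_000_000, paired=False)
paired = mean_coverage(1_000_000, 150, 5_000_000, paired=True)
print(f"single-read: {single:.0f}x, paired-end: {paired:.0f}x")
# single-read: 30x, paired-end: 60x
```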

Limitations of Paired-End Sequencing

  • Paired-end sequencing is expensive because it involves additional library preparation steps and computational resources for data analysis.
  • Data analysis is complex and needs advanced bioinformatics tools. This also requires more storage capacity to handle the increased data volume.
  • A paired-end run takes longer than a single-read run because the second read requires additional sequencing cycles.
  • The library preparation step in paired-end sequencing is complex and requires several steps.
  • Long-insert paired-end libraries can have issues like chimeric reads which can make data analysis more difficult.

Applications of Paired-End Sequencing

  • Paired-end sequencing is useful for studying large or repetitive DNA regions by linking paired reads. 
  • It provides better alignment and fills gaps in genome assembly which is also useful for de novo sequencing.
  • It is useful for detecting structural variations that are hard to detect with other sequencing methods. 
  • It can be used to detect gene fusions which occur when parts of two different genes are joined together. This is common in cancer and several other diseases. Paired-end reads can precisely identify breakpoint regions and fusion junctions.
  • Paired-end sequencing has applications in RNA sequencing. It helps map RNA sequences and identify alternative splicing events. This is useful for understanding gene expression.
  • It helps in detecting epigenetic modifications by mapping reads to specific genomic regions. It can be used to study histone modifications and DNA methylation which are important for understanding different diseases.
  • It is also used in metagenomics studies to study microbial communities in complex samples.
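
Structural variant detection with paired reads often relies on pairs that align in an unexpected way. The Python sketch below illustrates one common heuristic under stated assumptions: the library has a known mean fragment length and standard deviation, a pair is flagged when its span on the reference falls outside three standard deviations, and the class, field names, and thresholds are all hypothetical. Real variant callers combine many such signals with mapping quality and split-read evidence.

```python
from dataclasses import dataclass


@dataclass
class AlignedPair:
    name: str
    forward_start: int   # leftmost reference position of the forward-strand mate
    reverse_end: int     # rightmost reference position of the reverse-strand mate
    proper_orientation: bool = True  # mates face each other on the reference


def classify_pair(pair, expected_mean, expected_sd, n_sd=3.0):
    """Label a pair as concordant or as a hint of a structural variant."""
    if not pair.proper_orientation:
        return "orientation"   # may indicate an inversion
    span = pair.reverse_end - pair.forward_start
    if span > expected_mean + n_sd * expected_sd:
        return "too_far"       # may indicate a deletion in the sample
    if span < expected_mean - n_sd * expected_sd:
        return "too_close"     # may indicate an insertion in the sample
    return "concordant"


# Hypothetical alignments for a library with 400 +/- 30 base fragments.
pairs = [
    AlignedPair("p1", 1000, 1410),
    AlignedPair("p2", 5000, 6500),
    AlignedPair("p3", 9000, 9150),
    AlignedPair("p4", 12000, 12400, proper_orientation=False),
]
labels = {p.name: classify_pair(p, expected_mean=400, expected_sd=30) for p in pairs}
print(labels)
```

A single discordant pair proves little, since mapping errors and chimeric fragments produce them too. In practice, evidence is gathered from many pairs that point to the same breakpoint.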

Paired-End Sequencing vs. Single-Read Sequencing

| Characteristics | Paired-End Sequencing | Single-Read Sequencing |
| --- | --- | --- |
| Sequencing Method | It sequences a DNA fragment from both ends. | It sequences DNA fragments from one end only, usually from 5’ to 3’ direction. |
| Read Output | It produces two separate reads for each fragment. | It generates a single read. |
| Cost | It is more expensive due to additional sequencing and library preparation steps. | It is cost-effective as it requires fewer reagents and involves fewer steps. |
| Complexity | It is more complex but provides higher-quality data. | It is a simpler and less complex method. |
| Data Quality | It generates more accurate and high-quality data, especially in repetitive or complex regions. | It provides good quality for basic studies but has lower accuracy in repetitive regions. |
| Bioinformatics | It requires more advanced bioinformatics tools and more computational resources. | It is easier to process with simpler computational requirements. |
| Applications | It is suitable for detecting structural variants, gene fusions, and complex regions. | It is best for specific applications like small RNA-Seq and ChIP-Seq or basic transcriptomics. |

