Short-Read Sequencing: Principle, Process, Examples, Uses

Short-read sequencing is a widely used next-generation sequencing (NGS) method that reads DNA as short fragments, usually between 50 and 300 base pairs each.

It is a cost-effective and rapid method of sequencing that was developed after Sanger sequencing. Short-read sequencing methods are also known as second-generation sequencing methods. They provide high efficiency and accuracy because shorter fragments are easier to generate, amplify, and sequence.

Based on read length, there are two types of sequencing methods: short-read and long-read sequencing. Read length is the number of base pairs in a sequencing read, usually reported as the average length of the reads produced. Long-read sequencing platforms can sequence longer DNA strands and are well suited to whole genome sequencing and to analyzing complex genomic regions. However, they tend to be more costly, more error-prone, and more time-consuming. Short-read sequencing platforms, on the other hand, are fast and cost-effective, which makes them suitable for targeted sequencing.

Principle of Short-Read Sequencing

Short-read sequencing works by reading many short DNA fragments. The DNA is fragmented into small pieces, adapters are attached, the templates are amplified, and a short-read sequencing platform determines the nucleotide order of each fragment. The resulting data is then analyzed to extract useful biological information.
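
As a rough illustration of the fragmentation step, the following Python sketch cuts a sequence into overlapping fixed-length reads. The function name `fragment_into_reads` and its parameters are assumptions made for this example. Real fragmentation is random and yields a range of fragment sizes rather than a regular sliding window.

```python
def fragment_into_reads(sequence: str, read_length: int, step: int) -> list[str]:
    """Slide a fixed-size window over the sequence to mimic overlapping short reads.

    Hypothetical helper: a regular window is a simplification of random shearing.
    """
    if read_length <= 0 or step <= 0:
        raise ValueError("read_length and step must be positive")
    last_start = len(sequence) - read_length
    # Each window start yields one read; windows overlap when step < read_length.
    return [sequence[start:start + read_length] for start in range(0, last_start + 1, step)]


reads = fragment_into_reads("ACGTACGTAC", read_length=4, step=3)
print(reads)
```

Because the step is smaller than the read length, neighboring reads overlap, and that overlap is what later allows reads to be aligned or assembled.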

The three main sequencing principles used by short-read sequencing platforms are: sequencing by synthesis, sequencing by ligation, and sequencing by binding. While many new sequencing technologies have been developed, Illumina’s sequencing by synthesis is still the most commonly used short-read sequencing method.

  1. In the sequencing by synthesis (SBS) method, nucleotides are added to the growing DNA strand and each addition is detected based on fluorescence, pH changes, or other sensor-based methods. Illumina sequencing and Ion Torrent sequencing use this method.
  2. In sequencing by ligation (SBL), ligase enzymes are used instead of polymerase to identify sequences. Short fluorescently tagged oligonucleotides are introduced and the ligase joins the one that matches the template strand. The fluorescent signal is used to detect the nucleotide base. SOLiD sequencing is an example of this type of short-read sequencing.
  3. Sequencing by binding (SBB) is a newer method of short-read sequencing. This method replicates DNA in a two-step process. First, fluorescently tagged nucleotides bind to the template strand but are not incorporated because of a reversible blocker. The signal is recorded and the tagged nucleotide is washed away. Second, the blocker is removed and an unlabeled nucleotide is added to extend the DNA. Because binding and incorporation are separate steps, the fluorescently tagged nucleotides identify the base without ever becoming part of the DNA strand, which prevents errors caused by molecular scarring. PacBio's Onso system is an example of this method.
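
The cycle-by-cycle logic of sequencing by synthesis can be sketched in Python as follows. This is a simplified, error-free model that assumes exactly one complementary base is incorporated and detected per cycle. The names `sequence_by_synthesis` and `COMPLEMENT` are illustrative and not part of any platform's software.

```python
# Watson-Crick pairing used to pick the nucleotide incorporated in each cycle.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sequence_by_synthesis(template_3_to_5: str, cycles: int) -> str:
    """Return the bases 'detected' over the given number of cycles.

    Assumption: the template is written 3'->5', so the synthesized read
    comes out 5'->3'. One base is incorporated and read per cycle.
    """
    detected = []
    for position in range(min(cycles, len(template_3_to_5))):
        incorporated = COMPLEMENT[template_3_to_5[position]]
        detected.append(incorporated)  # stands in for the recorded signal
    return "".join(detected)


print(sequence_by_synthesis("TACG", cycles=3))
```

The number of cycles sets the read length, which is one reason reads from this kind of chemistry are short.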

Process of Short-Read Sequencing

1. Library Preparation

  • The process begins with extracting DNA or RNA of interest and preparing them into libraries that are compatible with the sequencing device. 
  • When the starting material is RNA, an extra step is needed: the RNA is converted into complementary DNA (cDNA) by reverse transcription.
  • Then, fragmentation is done using different physical, enzymatic, or chemical methods to break the extracted nucleic acids into smaller pieces. 
  • The fragmented DNA or cDNA is repaired to create blunt ends and adapters are added to these fragments. These adapters are used to help the sequencing platform recognize the fragment. Adapters may also include barcodes to allow the sequencing of multiple samples together. 
  • Then, the fragments are size-selected using bead-based or electrophoresis-based methods to remove fragments outside the target size range and other contaminants, which improves the accuracy of sequencing.
  • Finally, the size-selected library is amplified using PCR and prepared for sequencing.
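
To show how barcodes let several samples share one run, here is a minimal Python sketch of demultiplexing. It assumes, purely for illustration, that each read starts with an inline barcode of fixed length and that barcodes must match exactly. The function `demultiplex`, the sample names, and the barcode sequences are all hypothetical.

```python
def demultiplex(reads: list[str], barcode_to_sample: dict[str, str]) -> dict[str, list[str]]:
    """Group reads by their leading barcode and strip the barcode from each read.

    Assumptions: all barcodes have the same length and matching is exact.
    Reads with an unknown barcode go to an 'undetermined' bin.
    """
    if not barcode_to_sample:
        raise ValueError("at least one barcode is required")
    barcode_length = len(next(iter(barcode_to_sample)))
    assigned: dict[str, list[str]] = {sample: [] for sample in barcode_to_sample.values()}
    assigned["undetermined"] = []
    for read in reads:
        sample = barcode_to_sample.get(read[:barcode_length], "undetermined")
        assigned[sample].append(read[barcode_length:])
    return assigned


barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled_reads = ["ACGTGGGA", "TGCATTTC", "NNNNAAAA"]
print(demultiplex(pooled_reads, barcodes))
```

Production tools usually tolerate a small number of barcode mismatches; exact matching keeps the sketch short.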

2. Sequencing

  • The two most commonly used platforms for short-read sequencing are Illumina and Ion Torrent. Both use sequencing by synthesis.
  • Before sequencing, the DNA fragments are amplified. In Illumina platforms, clonal amplification is done using bridge amplification, which creates millions of clusters of identical DNA fragments.
  • In Ion Torrent, amplification is done using emulsion PCR, where DNA fragments are attached to beads and amplified in tiny droplets of water within an oil emulsion. 
  • After amplification is complete, sequencing begins. In the SBS method, nucleotides are sequentially added to the growing DNA strands and detected using sensors that measure fluorescence or pH change.
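
For the pH-based detection used by Ion Torrent, a toy model may help. Nucleotides are flowed one type at a time, and the signal in each flow scales with the number of bases incorporated. The Python sketch below assumes a repeating T, A, C, G flow order and an ideal, noise-free signal; `simulate_flows` and `decode_flows` are hypothetical names for this example.

```python
def simulate_flows(read: str, flow_order: str = "TACG", n_flows: int = 8) -> list[tuple[str, int]]:
    """Return (nucleotide, signal) per flow for the strand being synthesized.

    The signal is the number of consecutive bases incorporated in that flow,
    so a homopolymer such as 'CCC' gives a single signal of 3.
    """
    signals = []
    position = 0
    for flow in range(n_flows):
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        # Keep extending while the next base matches the flowed nucleotide.
        while position < len(read) and read[position] == nucleotide:
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))
    return signals


def decode_flows(signals: list[tuple[str, int]]) -> str:
    """Rebuild the read by repeating each nucleotide by its (ideal) signal."""
    return "".join(nucleotide * count for nucleotide, count in signals)


flows = simulate_flows("TTACCCG", n_flows=4)
print(flows, decode_flows(flows))
```

In a real instrument the signal is noisy, so telling a run of, say, seven identical bases from a run of eight is harder than telling one from two.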

3. Data analysis

  • The data analysis step has three stages: primary, secondary, and tertiary analysis. 
  • Primary analysis includes base calling and quality control, where the raw sequencing data is initially processed and stored in FASTQ format.
  • In secondary analysis, the reads are aligned to a reference genome. Variant calling is also done to detect sequence variations. 
  • Finally, tertiary analysis focuses on the annotation and interpretation of variants to understand their biological importance. 
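
The primary-analysis output can be made concrete with a small Python sketch that parses FASTQ text and converts quality characters to Phred scores. It assumes the common Phred+33 encoding and strict four-line records; the function names are illustrative and this is not a substitute for an established parser.

```python
def phred_scores(quality_line: str) -> list[int]:
    """Convert a FASTQ quality string to Phred scores (assumes Phred+33)."""
    return [ord(char) - 33 for char in quality_line]


def error_probability(score: int) -> float:
    """Phred definition: Q = -10 * log10(p), so p = 10 ** (-Q / 10)."""
    return 10 ** (-score / 10)


def parse_fastq(text: str) -> list[dict]:
    """Parse FASTQ text with exactly four lines per record."""
    lines = text.strip().splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ records must have exactly four lines each")
    records = []
    for start in range(0, len(lines), 4):
        header, sequence, _separator, quality = lines[start:start + 4]
        records.append({
            "name": header[1:],  # drop the leading '@'
            "sequence": sequence,
            "quality": phred_scores(quality),
        })
    return records


def mean_quality(record: dict) -> float:
    """Average Phred score of one read, a simple quality-control metric."""
    return sum(record["quality"]) / len(record["quality"])


example = "@read1\nACGT\n+\nII5!\n"
for record in parse_fastq(example):
    print(record["name"], record["sequence"], record["quality"], mean_quality(record))
```

A Phred score of 20 corresponds to a 1 in 100 chance that the base call is wrong, and a score of 30 to 1 in 1,000.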

Short-Read Sequencing Examples

Examples of short-read sequencing platforms include:

  1. Illumina Sequencing is the most widely used short-read sequencing platform. It uses sequencing by synthesis: the DNA fragments are first amplified on a flow cell using bridge amplification. Then, fluorescently labeled bases are added one at a time to the sequencing template. Each nucleotide is detected by its fluorescent signal. These platforms produce high-throughput data with high accuracy.
  2. Ion Torrent Sequencing uses semiconductor technology for DNA sequencing. This method is also based on sequencing by synthesis. DNA fragments are amplified on beads using emulsion PCR, and the beads are placed in microwells on a semiconductor chip. When nucleotides are incorporated during sequencing, protons are released, which changes the pH. The semiconductor chip detects this change and uses it to identify the nucleotide added. Unlike Illumina, Ion Torrent does not use fluorescence, which makes it faster and less expensive, but it is less accurate in homopolymer regions.
  3. SOLiD Sequencing uses a ligation-based method. At first, DNA libraries are prepared and amplified on beads using emulsion PCR. Then, sequencing occurs through ligation where fluorescently labeled di-base probes are ligated to the DNA fragments. This method uses a unique color space system to encode nucleotides. Each base is identified using the color code corresponding to the ligation. It provides high accuracy and high throughput but it is less commonly used compared to Illumina and Ion Torrent platforms.
  4. Onso Sequencing is a short-read sequencing platform developed by PacBio that uses sequencing by binding (SBB) technology. It provides very high accuracy and can detect rare genetic variants missed by other short-read sequencing methods. In this method, the base interrogation and incorporation steps are separate. Sequencing begins with a 3’ reversibly blocked nucleotide. In each cycle, fluorescently labeled nucleotides bind to the DNA and their fluorescence is detected. Then, the reversible blocker is removed and native, unlabeled nucleotides are added for chain extension. This cycle is repeated for each base.
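
The color space idea used by SOLiD can be illustrated with a short Python sketch. It assumes the commonly described two-base encoding, in which each color depends on a pair of adjacent bases and can be computed as the XOR of 2-bit base codes (A=0, C=1, G=2, T=3). The function names are made up for this example.

```python
# 2-bit codes chosen so that XOR of adjacent bases gives the di-base color (0-3).
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def encode_color_space(sequence: str) -> list[int]:
    """Encode each adjacent base pair as one color; n bases give n - 1 colors."""
    codes = [BASE_TO_BITS[base] for base in sequence]
    return [left ^ right for left, right in zip(codes, codes[1:])]


def decode_color_space(first_base: str, colors: list[int]) -> str:
    """Decode colors back to bases; the first base must be known in advance."""
    current = BASE_TO_BITS[first_base]
    bases = [first_base]
    for color in colors:
        current ^= color  # the next base is determined by the previous base and the color
        bases.append(BITS_TO_BASE[current])
    return "".join(bases)


colors = encode_color_space("ATGGC")
print(colors, decode_color_space("A", colors))
```

Since every interior base contributes to two colors, a genuine single-base substitution in the middle of a read changes two adjacent colors, whereas an isolated color change is more likely a measurement error.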

Advantages of Short-Read Sequencing

  • Short-read sequencing produces highly accurate sequencing results. This is ideal for detecting small changes in DNA. It can accurately sequence even low-quality DNA samples.
  • This method is fast and allows rapid sequencing of both DNA and RNA. This saves time and costs in projects that require quick results.
  • It is more affordable than traditional sequencing and long-read sequencing methods. The lower cost per base is suitable for large-scale projects.
  • Short-read sequencing is supported by widely used bioinformatics tools and pipelines. This simplifies data analysis and interpretation.
  • Short-read sequencing can produce large amounts of genomic data in a short time, which is effective for research and clinical applications.

Limitations of Short-Read Sequencing

  • Short-read sequencing cannot sequence long fragments of DNA. Large DNA sequences must be broken into fragments, amplified, and computationally assembled, which is challenging, especially in highly repetitive regions.
  • The amplification process can introduce errors or sequence biases.
  • Short-read sequencing struggles with sequencing complex genomic regions like repetitive sequences and regions rich in GC content. 
  • Large structural variations are also difficult to detect using short-read sequencing. 
  • Short-read sequencing can suffer from uneven coverage, which can lead to inconsistent data and inaccurate results, especially in regions with low coverage.
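
Coverage can be estimated with simple arithmetic: mean coverage is the number of reads multiplied by the read length and divided by the genome size. The Python sketch below applies that formula; the function names and the example numbers are assumptions chosen for illustration, and the result is an average that says nothing about how evenly reads are distributed.

```python
import math


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical example: 1 million reads of 150 bp on a 5 Mb bacterial genome.
print(mean_coverage(1_000_000, 150, 5_000_000))
print(reads_needed(30, 150, 5_000_000))
```

Individual regions can fall well below the average, which is why low-coverage regions need separate checks.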

Applications of Short-Read Sequencing

  • Short-read sequencing is used in whole-genome sequencing to sequence entire genomes. This helps in studying genome-wide variations. 
  • It is also useful in whole-exome sequencing. This helps to identify changes or variations in the genes associated with protein-coding sequences.
  • Short-read sequencing is also used in microbiome analysis to study the DNA sequences of microbial communities present in environmental or clinical samples. This helps to understand the diversity of microbes and their roles in different diseases.
  • It also has applications in RNA sequencing to study gene expression and to understand disease mechanisms.
  • The high accuracy, low cost, and speed of short-read sequencing are useful in clinical diagnostics. It can be used in disease diagnosis, such as in cancer studies.
  • It is used in targeted sequencing and gene panel sequencing where specific regions of interest are sequenced. This is particularly useful in personalized medicine.

Combining Short-read and Long-read Sequencing Methods

  • Both short-read and long-read sequencing have their limitations. Combining these methods can help address the limitations of each method. 
  • Long-read sequencing is good at analyzing repetitive and complex regions of DNA, but it has higher error rates.
  • On the other hand, short-read sequencing produces highly accurate and cost-effective data. 
  • When both methods are combined, short-read data can be used to correct the errors in long-read data, improving overall accuracy.
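
The idea of correcting long reads with short reads can be sketched as a per-position majority vote. The Python example below is a deliberately simplified model: it assumes the short reads are already aligned to the long read at known offsets and that errors are substitutions only, with no insertions or deletions. The name `polish_long_read` is hypothetical.

```python
from collections import Counter


def polish_long_read(long_read: str, aligned_short_reads: list[tuple[int, str]]) -> str:
    """Replace each long-read base with the majority base from covering short reads.

    aligned_short_reads holds (offset, sequence) pairs, where offset is the
    position in the long read at which the short read starts. Positions with
    no short-read coverage keep the original long-read base.
    """
    votes = [Counter() for _ in long_read]
    for offset, sequence in aligned_short_reads:
        for index, base in enumerate(sequence):
            position = offset + index
            if 0 <= position < len(long_read):
                votes[position][base] += 1
    polished = []
    for original, counter in zip(long_read, votes):
        polished.append(counter.most_common(1)[0][0] if counter else original)
    return "".join(polished)


noisy_long_read = "ACGTTCGA"  # position 4 carries a substitution error
short_reads = [(0, "ACGTA"), (2, "GTACG"), (3, "TACGA")]
print(polish_long_read(noisy_long_read, short_reads))
```

Dedicated polishing tools also handle insertions, deletions, and alignment uncertainty, which this sketch ignores.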

