Whole exome sequencing captures only the exons in a genome, while whole genome sequencing captures nucleotides across the entire genome. Given that exons account for only about 1.5% of the genome, whole genome sequencing captures far more sequence data, not just coding regions. Additionally, mutations in noncoding regions may be associated with diseases such as cancer, which has drawn growing research interest in noncoding regions.
Sequencing depth is an important factor in high-throughput sequencing. Kyung Kim et al. (2015) showed that the sequencing depth of whole exome sequencing can affect the discovery rate of variants. In summary, the number of deleterious SNPs and InDels detected in the coding regions increased only weakly at depths above 120×. In other words, a sequencing depth of 120× can be considered reasonable when using exome capture sequencing to identify significant variants in diagnostic studies.
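To put the two approaches side by side, the following is a rough sketch of the raw sequence yield each would need. It assumes the 1.5% exon fraction and 120× exome depth mentioned above, together with a 3 × 10^9 bp human genome sequenced at 30x. The constant and function names are illustrative only, and the calculation ignores off-target reads, duplicates, and uneven capture, all of which raise the real requirement for exome sequencing.

```python
# Rough yield comparison: whole exome at 120x vs whole genome at 30x.
# Assumptions (illustrative): 3e9 bp human genome, exons = 1.5% of the genome,
# perfect capture efficiency and perfectly uniform coverage.

GENOME_SIZE_BP = 3_000_000_000
EXON_FRACTION = 0.015


def bases_required(target_size_bp: int, depth: int) -> int:
    """Total sequenced bases needed to cover a target at a given mean depth."""
    return target_size_bp * depth


# Round to the nearest base to avoid floating-point noise in the product.
exome_size_bp = round(GENOME_SIZE_BP * EXON_FRACTION)

wes_bases = bases_required(exome_size_bp, 120)
wgs_bases = bases_required(GENOME_SIZE_BP, 30)

print(f"Exome target size: {exome_size_bp / 1e6:.0f} Mb")
print(f"WES at 120x: {wes_bases / 1e9:.1f} Gb")
print(f"WGS at 30x:  {wgs_bases / 1e9:.1f} Gb")
```

Under these simplified assumptions the exome run needs far less raw data than the genome run, even at four times the depth.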
Below are some generalized recommendations for short-read whole genome sequencing using Illumina platforms:
The answer is highly dependent on many factors, including the goals of your experiment, the organism of interest, and the amount of material available.
Most resequencing applications are well suited to Illumina platforms. Illumina sequencing produces large numbers of high-quality short reads (<300 bp), allowing for robust variant detection. Illumina datasets are widely applied to genomic studies of viruses, bacteria, mammals, and plants, with a wide range of input materials including low-input DNA, FFPE samples, and circulating cell-free DNA.
However, certain analyses are improved by very long sequencing reads (several kilobases) generated by the PacBio platform. These long reads are able to span large repeat regions, enabling near complete de novo assembly of small to medium sized genomes, improved assembly of large complex genomes (e.g. plants and animals), and more sensitive detection of large structural variants. Long-read sequencing is ideal for complex genomes that could be highly repetitive in nature and/or have GC extremes.
Resequencing is typically performed when a reference genome sequence is available. Sequencing reads are aligned back to the reference to determine the genomic location that each read matches best. Resequencing is often applied to variant detection (single nucleotide polymorphisms, small insertions/deletions, structural variants, copy number variation) and to analyses built on it, such as tumor vs normal comparison, population genetic analysis, Mendelian disease analysis, and trio sequencing.
De novo sequencing and assembly is typically applied to organisms for which no reference genome is available or the available reference is of poor quality. Genomes that have not been sequenced before must be assembled via a de novo approach following sequencing. This assembly can then be used for additional analyses and as the basis for future resequencing projects.
Coverage is the average number of times each base in the genome is sequenced, expressed as a multiple of the total genome size (see below). For humans, 30x coverage can be achieved with 600 million reads of 150 bp (or 300 million read pairs in paired-end mode).
Coverage = (read length) x (total number of reads) / (genome size)
Example: 30x = (150 bp/read) x (600 x 10^6 reads) / (3 x 10^9 bp)
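The coverage formula can also be rearranged to estimate how many reads a project needs. The following is a minimal sketch; the function names are illustrative, and it assumes every read maps uniquely and is usable, which overstates real-world coverage.

```python
import math


def coverage(read_length_bp: int, num_reads: int, genome_size_bp: int) -> float:
    """Average coverage = (read length) x (number of reads) / (genome size)."""
    return read_length_bp * num_reads / genome_size_bp


def reads_needed(target_coverage: float, read_length_bp: int, genome_size_bp: int) -> int:
    """Rearranged formula: number of reads required to reach a target coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


HUMAN_GENOME_BP = 3_000_000_000  # approximate size used in the example above

# Reproduce the worked example: 600 million reads of 150 bp on a human genome.
human_coverage = coverage(150, 600_000_000, HUMAN_GENOME_BP)
print(f"{human_coverage:.0f}x")

# Reverse direction: reads required for 30x, and the equivalent read pairs.
total_reads = reads_needed(30, 150, HUMAN_GENOME_BP)
print(f"{total_reads:,} reads, or {total_reads // 2:,} read pairs")
```

For a different organism, replace the genome size with the value for that species; the arithmetic is otherwise unchanged.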
View our Sample Submission Guidelines for instructions on preparing and sending your samples for NGS. Ship samples directly to our facility.
Navigate to the Contact Us page. The ONEOMICS NGS team is composed of Ph.D. scientists who can help you optimize your project design and provide consultation.