Rapid Genome Assembly
Cost-effective, high-quality genome assembly methods
Status: Published in G3: Genes, Genomes, Genetics (2018)
First Authors: Edwin A. Solares & Mahul Chakraborty (co-first authors)
Collaborators: Danny E. Miller, Shannon Kalsow, Kate Hall, Anoja G. Perera, J.J. Emerson, R. Scott Hawley
Overview
We developed and validated a cost-effective approach for assembling high-quality reference genomes using low-coverage, long-read sequencing. The approach puts reference-quality genome assembly within reach of individual labs with limited sequencing budgets.
The Problem
Traditional genome assembly approaches faced critical limitations:
- High cost: $10,000-50,000 per genome
- Complexity: Required extensive computational expertise
- Time: Months of bioinformatics work per genome
- Inaccessibility: Only well-funded labs could assemble multiple genomes
Our Approach
We demonstrated that combining three elements can produce reference-quality genomes at <10% of traditional costs:
- Low-coverage long-read sequencing (Oxford Nanopore MinION, ~20-30X coverage)
- Strategic use of reference genomes from closely related species
- Optimized assembly algorithms
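To make the ~20-30X coverage figure concrete, here is a minimal sketch of how mean sequencing depth is usually computed: total sequenced bases divided by genome size. The function names and the toy numbers are illustrative assumptions, not values or code from the paper.

```python
def coverage_depth(read_lengths, genome_size_bp):
    """Mean sequencing depth: total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return sum(read_lengths) / genome_size_bp


def bases_needed(target_depth, genome_size_bp):
    """Number of sequenced bases required to reach a target mean depth."""
    return target_depth * genome_size_bp


# Hypothetical toy input: a 1 Mb genome and 2,500 reads of 10 kb each.
toy_reads = [10_000] * 2_500
print(coverage_depth(toy_reads, 1_000_000))  # 25.0
```

In practice the read lengths would come from the sequencing run's FASTQ files, and the genome size would be an estimate for the species being assembled.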
Key Findings
Assembly Quality
- Achieved reference-quality contiguity (N50 >1 Mb)
- >98% gene completeness (BUSCO assessment)
- Accurate representation of structural variants
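For readers unfamiliar with the contiguity metric above, the following sketch shows the standard N50 definition: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembled bases. The contig lengths are made-up toy values, not results from the paper.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical toy assembly with 10 Mb of total sequence.
toy_contigs = [4_000_000, 3_000_000, 1_500_000, 1_000_000, 500_000]
print(n50(toy_contigs))  # 3000000
```

Here the two longest contigs (4 Mb + 3 Mb) are the first to cover half of the 10 Mb total, so the N50 is 3 Mb.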
Cost-Benefit Analysis
- 90% cost reduction compared to traditional methods
- 10X faster assembly pipeline
- Total cost: <$2,000 per genome (vs. $20,000+ traditional)
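The 90% figure follows directly from the two per-genome costs listed above. A small sketch of the arithmetic, using the stated bounds ($2,000 and $20,000) as inputs; the function name is an illustrative assumption.

```python
def percent_reduction(new_cost, old_cost):
    """Percentage cost reduction when moving from old_cost to new_cost."""
    if old_cost <= 0:
        raise ValueError("old_cost must be positive")
    return 100.0 * (old_cost - new_cost) / old_cost


# Bounds quoted in the list above: <$2,000 vs. $20,000+ per genome.
print(percent_reduction(2_000, 20_000))  # 90.0
```

Because $2,000 is an upper bound on the new cost and $20,000 a lower bound on the traditional cost, 90% is the minimum reduction implied by these figures.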
Impact
- Cited 75+ times in genome assembly literature
- Method adopted by dozens of labs worldwide
- Enabled research in diverse organisms (insects, plants, fungi)
- Contributed to “genomics democratization” movement
Publication
Solares, E.A., Chakraborty, M., Miller, D.E., et al. (2018). “Rapid Low-Cost Assembly of the Drosophila melanogaster Reference Genome Using Low-Coverage, Long-Read Sequencing.” G3: Genes, Genomes, Genetics, 8(10), 3143-3154.