Rapid Genome Assembly

Cost-effective, high-quality genome assembly methods

Status: Published in G3: Genes, Genomes, Genetics (2018)

First Authors: Edwin A. Solares & Mahul Chakraborty (co-first authors)

Collaborators: Danny E. Miller, Shannon Kalsow, Kate Hall, Anoja G. Perera, J.J. Emerson, R. Scott Hawley


Overview

We developed and validated a cost-effective approach for assembling high-quality reference genomes using low-coverage, long-read sequencing. By lowering the cost per genome, the approach puts genome assembly within reach of labs with limited budgets.

The Problem

Traditional genome assembly approaches faced critical limitations:

  • High cost: $10,000-50,000 per genome
  • Complexity: Required extensive computational expertise
  • Time: Months of bioinformatics work per genome
  • Inaccessibility: Only well-funded labs could assemble multiple genomes

Our Approach

We demonstrated that combining the following three elements can produce reference-quality genomes at <10% of traditional costs:

  • Low-coverage long-read sequencing (PacBio ~20-30X coverage)
  • Strategic use of reference genomes from closely related species
  • Optimized assembly algorithms
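
To give a rough sense of what "~20-30X coverage" means in sequencing yield, the sketch below uses the standard relationship: mean coverage = total sequenced bases / genome size. The function names and the 140 Mb genome size are illustrative assumptions, not values taken from this page.

```python
def required_bases(genome_size_bp, target_coverage):
    """Total sequenced bases needed to reach a target mean coverage."""
    if genome_size_bp <= 0 or target_coverage < 0:
        raise ValueError("genome size must be positive and coverage non-negative")
    return genome_size_bp * target_coverage


def mean_coverage(read_lengths, genome_size_bp):
    """Mean coverage implied by a collection of read lengths (in bp)."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return sum(read_lengths) / genome_size_bp


# Hypothetical genome size, chosen only for illustration.
ASSUMED_GENOME_SIZE_BP = 140_000_000

for coverage in (20, 30):
    gigabases = required_bases(ASSUMED_GENOME_SIZE_BP, coverage) / 1e9
    print(f"{coverage}X coverage needs about {gigabases:.1f} Gb of reads")
```

Because the required yield scales linearly with coverage, lowering the coverage target lowers the sequencing needed in direct proportion.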

Key Findings

Assembly Quality

  • Achieved reference-quality contiguity (N50 >1 Mb)
  • >98% gene completeness (BUSCO assessment)
  • Accurate representation of structural variants
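
For readers unfamiliar with the contiguity metric above, N50 is the contig length at which contigs of that length or longer contain at least half of all assembled bases. A minimal sketch of the standard calculation follows; the function name and the example lengths are hypothetical and are not taken from the paper.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running_total = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in lengths:
        running_total += length
        if running_total >= half_total:
            return length


# Hypothetical contig lengths in bp, for illustration only.
example_lengths = [2_500_000, 1_800_000, 1_200_000, 400_000, 100_000]
print(f"N50 = {n50(example_lengths):,} bp")
```

A higher N50 means fewer, longer contigs cover the same share of the assembly, which is why it serves as a summary of contiguity.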

Cost-Benefit Analysis

  • 90% cost reduction compared to traditional methods
  • 10X faster assembly pipeline
  • Total cost: <$2,000 per genome (vs. $20,000+ traditional)
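
The percentage and the dollar figures above can be cross-checked with simple arithmetic. The sketch below uses the boundary values quoted in this section ($2,000 and $20,000); since the page states them as "<$2,000" and "$20,000+", the result is a lower bound on the saving. The function names and the example budget are illustrative assumptions.

```python
def percent_reduction(baseline_cost, new_cost):
    """Percentage saved when moving from baseline_cost to new_cost."""
    if baseline_cost <= 0:
        raise ValueError("baseline cost must be positive")
    return 100 * (baseline_cost - new_cost) / baseline_cost


def genomes_per_budget(budget, cost_per_genome):
    """Whole number of genomes that fit within a fixed budget."""
    if cost_per_genome <= 0:
        raise ValueError("cost per genome must be positive")
    return int(budget // cost_per_genome)


traditional_cost = 20_000  # boundary of the "$20,000+" figure above
low_cost = 2_000           # boundary of the "<$2,000" figure above

print(f"Cost reduction: {percent_reduction(traditional_cost, low_cost):.0f}%")
# Hypothetical budget equal to one traditional assembly.
print(f"Genomes per $20,000: {genomes_per_budget(20_000, low_cost)}")
```

At these boundary values, the budget for one traditional assembly covers ten genomes.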

Impact

  • Cited 75+ times in genome assembly literature
  • Method adopted by dozens of labs worldwide
  • Enabled research in diverse organisms (insects, plants, fungi)
  • Contributed to “genomics democratization” movement

Publication

Solares, E.A., Chakraborty, M., Miller, D.E., et al. (2018). “Rapid Low-Cost Assembly of the Drosophila melanogaster Reference Genome Using Low-Coverage, Long-Read Sequencing.” G3: Genes, Genomes, Genetics, 8(10), 3143-3154.
