giaa034.pdf 10.18 MB
1000 Title
  • CSA: A high-throughput chromosome-scale assembly pipeline for vertebrate genomes
1000 Author
  1. Kuhl, Heiner |
  2. Li, Ling |
  3. Wuertz, Sven |
  4. Stöck, Matthias |
  5. Liang, Xu-Fang |
  6. Klopp, Christophe |
1000 Year of publication 2020
1000 LeibnizOpen
1000 Publication type
  1. Article |
1000 Published online
  • 2020-05-25
1000 Published in
1000 Citation
  • 9(5):giaa034
1000 FRL collection
1000 Copyright year
  • 2020
1000 License
1000 Publisher's version
1000 Supplementary material
1000 Publication status
1000 Review status
1000 Language of publication
1000 Abstract/Summary
  • BACKGROUND: Easy-to-use and fast bioinformatics pipelines for long-read assembly that go beyond the contig level to generate highly continuous chromosome-scale genomes from raw data remain scarce.
  • RESULTS: Chromosome-Scale Assembler (CSA) is a novel, computationally highly efficient bioinformatics pipeline that fills this gap. CSA integrates information from scaffolded assemblies (e.g., Hi-C or 10X Genomics) or even from diverged reference genomes into the assembly process. As CSA performs automated assembly of chromosome-sized scaffolds, we benchmark its performance against state-of-the-art reference genomes, i.e., genomes conventionally built in a laborious fashion using multiple separate assembly tools and manual curation. CSA increases contig lengths using scaffolding, local re-assembly, and gap closing. On certain datasets, initial contig N50 may be increased up to 4.5-fold. For smaller vertebrate genomes, chromosome-scale assemblies can be achieved within 12 h using low-cost, high-end desktop computers. Mammalian genomes can be processed within 16 h on compute servers. Using diverged reference genomes for fish, birds, and mammals, we demonstrate that CSA calculates chromosome-scale assemblies from long-read data and genome comparisons alone. Even contig-level draft assemblies of diverged genomes are helpful for reconstructing chromosome-scale sequences. CSA is also capable of assembling ultra-long reads.
  • CONCLUSIONS: CSA can speed up and simplify chromosome-level assembly and significantly lower the costs of large-scale family-level vertebrate genome projects.
1000 Subject indexing
local genome scaffolding
local vertebrates
local genome assembly
local genome evolution
local comparative genomics
local chromosomes
local long-read
1000 Subject classification (DDC)
1000 List of contributors
1000 Label
1000 Funder
  1. Deutsche Forschungsgemeinschaft |
  2. Leibniz-Gemeinschaft |
1000 Grant number
  1. KU 3596/1-1
  2. -
1000 Funding program
  1. Reference genomes of the Chinese perch (Siniperca chuatsi), the Eurasian perch (Perca fluviatilis) and three related fish species of the family Sinipercidae for comparative genomics and marker assisted breeding in aquaculture; project number: 324050651
  2. Open Access Fund
1000 Files
1000 Funding
  1. 1000 joinedFunding-child
    1000 Funder Deutsche Forschungsgemeinschaft |
    1000 Funding program Reference genomes of the Chinese perch (Siniperca chuatsi), the Eurasian perch (Perca fluviatilis) and three related fish species of the family Sinipercidae for comparative genomics and marker assisted breeding in aquaculture; project number: 324050651
    1000 Grant number KU 3596/1-1
  2. 1000 joinedFunding-child
    1000 Funder Leibniz-Gemeinschaft |
    1000 Funding program Open Access Fund
    1000 Grant number -
1000 Object type article
1000 Described by
1000 @id frl:6429125.rdf
1000 Created on 2021-09-02T11:35:57.560+0200
1000 Created by 218
1000 describes frl:6429125
1000 Edited by 25
1000 Last edited Tue Sep 14 13:55:20 CEST 2021
1000 Object edited Tue Sep 14 13:55:00 CEST 2021
1000 Cf. frl:6429125
1000 OAI ID
1000 Metadata visibility public
1000 Data visibility public
1000 Subject of