Assembly of Big Genomic Data
MetadataShow full item record
As genome sequencing technologies continue to facilitate the generation of large datasets, developing scalable algorithms has come to the forefront as a crucial step in analyzing these datasets. In this talk, I will discuss several recent advances, with a focus on the problem of reconstructing a genome from a set of reads (genome assembly). I will describe low-memory and scalable algorithms for automatic parameter selection and de Bruijn graph compaction, recently implemented in two tools: KmerGenie and bcalm. I will also present recent advances in the theoretical foundations of genome assemblers.