K-mer based data structures and heuristics for microbes and cancer
Recent technological advances allow for high-throughput profiling of biological systems in a cost-efficient manner. The low cost of data generation is ushering in the era of “big data”, which provides unprecedented opportunities but also raises new challenges for data mining and analysis. Machine learning algorithms have proven effective at increasing the efficiency and accuracy of bioinformatics analyses, but not all of them are open source. This dissertation presents a broad platform of open source tools for a variety of genomic analyses; highlights include unsupervised genomic clustering of microbes and supervised clustering of cancer patient drug response.