Computational algorithm development for epigenomic analysis
Multiple computational algorithms were developed for analyzing ChIP-seq datasets of histone modifications. For basic ChIP-seq data processing, two problems were addressed with novel statistical methods: the mapping of ambiguous short sequence reads and the calling of broad peaks from diffuse ChIP-seq signals. The performance of these methods was systematically evaluated against existing approaches, and applications to real datasets demonstrated their potential for uncovering meaningful biological information. For biological question-driven data mining, several important topics were selected for algorithm development, including hypothesis-driven insulator prediction, unbiased chromatin boundary element discovery, and combinatorial histone modification signature inference. The integrative computational pipeline for insulator prediction not only produced a list of putative insulators but also recovered specific associated chromatin and functional features. Selected predictions have been experimentally validated. The unbiased chromatin boundary element prediction algorithm was feature-free and was able to discover novel types of boundary elements. Its predictions identified a set of chromatin features and provided the first report of tRNA-derived boundary elements in the human genome. The combinatorial chromatin signature algorithm employed chromatin profile alignments for unsupervised inference of histone modification patterns. The resulting signatures were associated with various regulatory elements and functional activities. Both the computational advantages and the biological discoveries are discussed.