Biomarker discovery and clinical outcome prediction using knowledge-based bioinformatics
Phan, John H.
Advances in high-throughput genomic and proteomic technology have led to growing interest in cancer biomarkers. These biomarkers can potentially improve the accuracy of cancer subtype prediction and, subsequently, the success of therapy. However, identifying statistically and biologically relevant biomarkers from high-throughput data can be unreliable because of the nature of the data (e.g., high technical variability, small sample size, and high dimensionality). Because few training samples are available, data-driven machine learning methods are often insufficient without the support of knowledge-based algorithms. We investigate the benefits of using knowledge-based algorithms to solve clinical prediction problems. Because we are interested in identifying biomarkers that are also feasible in clinical prediction models, we focus on two analytical components: feature selection and predictive model selection. In addition to variance in the data, we must also consider variance among analytical methods. There are many existing feature selection algorithms, each of which may produce different results. Moreover, it is not trivial to identify model parameters that maximize the sensitivity and specificity of clinical prediction. Thus, we introduce a method that uses independently validated biological knowledge to reduce the space of relevant feature selection algorithms and to improve the reliability of clinical predictors. Finally, we implement several functions of this knowledge-based method as a web-based, user-friendly, and standards-compatible software application.
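To make the idea of using biological knowledge to narrow the choice of feature selection algorithm more concrete, the following is a minimal sketch under stated assumptions, not the method developed in the thesis. It ranks genes with two candidate scoring functions and keeps the one whose top-ranked genes agree best with a set of independently validated markers. The scoring functions, the agreement measure, the gene names, and the toy data are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def split_by_class(values, labels):
    """Split one gene's values into the two class groups (labels are 0 or 1)."""
    group0 = [v for v, y in zip(values, labels) if y == 0]
    group1 = [v for v, y in zip(values, labels) if y == 1]
    return group0, group1


def fold_change_score(values, labels):
    """Hypothetical selector 1: absolute difference of the class means."""
    g0, g1 = split_by_class(values, labels)
    return abs(mean(g1) - mean(g0))


def t_like_score(values, labels):
    """Hypothetical selector 2: mean difference scaled by within-class spread."""
    g0, g1 = split_by_class(values, labels)
    spread = stdev(g0) + stdev(g1)
    return abs(mean(g1) - mean(g0)) / (spread + 1e-9)  # guard against zero spread


def top_k_genes(expression, labels, score_fn, k):
    """Return the k genes with the highest score under score_fn."""
    ranked = sorted(expression,
                    key=lambda gene: score_fn(expression[gene], labels),
                    reverse=True)
    return ranked[:k]


def knowledge_agreement(selected, known_markers):
    """Fraction of the selected genes that appear in the validated marker set."""
    return len(set(selected) & known_markers) / len(selected)


def choose_selector(expression, labels, selectors, known_markers, k):
    """Score every candidate selector against prior knowledge; return the best."""
    scores = {
        name: knowledge_agreement(top_k_genes(expression, labels, fn, k),
                                  known_markers)
        for name, fn in selectors.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (made up): gene -> expression values for six samples.
expression = {
    "GENE_A": [1.0, 1.1, 0.9, 3.0, 3.1, 2.9],    # consistent class difference
    "GENE_B": [0.0, 6.0, 3.0, 6.0, 12.0, 9.0],   # large but noisy difference
    "GENE_C": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],    # no class difference
    "GENE_D": [5.0, 5.2, 4.8, 6.0, 6.2, 5.8],    # small, consistent difference
}
labels = [0, 0, 0, 1, 1, 1]
known_markers = {"GENE_A", "GENE_D"}  # stands in for validated knowledge

selectors = {"fold_change": fold_change_score, "t_like": t_like_score}
best, scores = choose_selector(expression, labels, selectors, known_markers, k=2)
print(best, scores)
```

On this toy data the spread-scaled selector recovers both assumed markers, while the plain mean difference is misled by the noisy gene. In practice the knowledge source, the candidate algorithms, and the agreement measure would all need to be chosen and validated for the clinical problem at hand.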