Evaluating the Predictive Performance of Genomic Data-based Machine Learning Models for 4 Different Mental Health Disorders
Abstract
Clinical psychiatry could benefit greatly from using polygenic risk scores (PRS) to assess the risk of developing certain mental health disorders. While PRS performance can be evaluated using features specific to a single disorder, we build on recent findings that mental health disorders may share genetic variants and have created feature sets that are not only disorder-specific but also encompass multiple mental health disorders. To evaluate the performance of these feature sets, we developed an automated script that calculates the PRS of each patient in the UK Biobank, and a logistic regression script that fits a linear model to assess predictive performance. For schizophrenia and bipolar disorder, predictive performance improved significantly when the general "Mental Health Disorders" feature set was used in place of the disorder-specific feature set, suggesting that these two disorders may possess an overlapping polygenic architecture. This finding may help PRS become a robust tool in clinical psychiatry for encouraging earlier diagnosis of disorders that benefit greatly from early treatment or intervention.
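To make the two-step pipeline concrete, the following is a minimal sketch and not the scripts used in this work. It assumes the standard additive definition of a PRS (a sum of risk-allele dosages weighted by per-variant effect sizes) and scores the result with a single-feature logistic regression and the area under the ROC curve. The variant IDs, effect sizes, cohort, and all function names are hypothetical toy values chosen for illustration; no UK Biobank data is involved, and the choice of AUC as the metric is an assumption.

```python
import math

# Hypothetical per-variant effect sizes (e.g. log odds ratios from a GWAS).
WEIGHTS = {"rs_a": 0.12, "rs_b": -0.05, "rs_c": 0.30}

# Toy cohort: (risk-allele dosages per variant, case/control label).
COHORT = [
    ({"rs_a": 2, "rs_b": 0, "rs_c": 2}, 1),
    ({"rs_a": 1, "rs_b": 1, "rs_c": 1}, 1),
    ({"rs_a": 0, "rs_b": 2, "rs_c": 0}, 0),
    ({"rs_a": 1, "rs_b": 0, "rs_c": 0}, 0),
    ({"rs_a": 0, "rs_b": 0, "rs_c": 1}, 0),
    ({"rs_a": 2, "rs_b": 1, "rs_c": 0}, 1),
]


def polygenic_risk_score(dosages, weights):
    """Additive PRS: sum of effect size times dosage over shared variants."""
    return sum(weights[snp] * dose for snp, dose in dosages.items() if snp in weights)


def standardize(values):
    """Rescale to zero mean and unit variance so gradient descent behaves well."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


def fit_logistic(x, y, lr=0.1, epochs=2000):
    """Single-feature logistic regression fitted by batch gradient descent."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += p - yi
            g1 += (p - yi) * xi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


prs = [polygenic_risk_score(dosages, WEIGHTS) for dosages, _ in COHORT]
labels = [label for _, label in COHORT]

x = standardize(prs)
b0, b1 = fit_logistic(x, labels)
predicted = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
model_auc = auc(predicted, labels)

print(f"PRS per individual: {[round(s, 2) for s in prs]}")
print(f"fitted slope: {b1:.3f}, AUC: {model_auc:.3f}")
```

Comparing feature sets in this framing would amount to swapping `WEIGHTS` for a different set of variants (disorder-specific versus pooled across disorders) and comparing the resulting metric on the same cohort.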