Functional assessments of amino acid variation in human genomes
The Human Genome Project, initiated in 1990, created enormous excitement in human genetics, a field that seeks to understand human evolution, disease and development, gene therapy, and preventive medicine. The completion of the first human genome in 2003 and the breakthroughs in sequencing technologies over the past few years are delivering the promised benefits of genome studies, especially concerning genomic variability and human health. However, intensive resource requirements and the associated costs make it infeasible to experimentally verify the effect of every genetic variant. At this stage of genome studies, in silico predictions play an important role in identifying putative functional variants. The most common practice for evaluating genome variants is based on evolutionary conservation at the mutation site. Nonetheless, sequence conservation is not an absolute predictor of deleteriousness, since the phylogenetic diversity of the aligned sequences used to construct the prediction algorithm substantially affects the analysis. This dissertation aims to overcome the weaknesses of the conservation-based assumption in predicting variant effects. It describes three integrative computational approaches for identifying a subset of high-priority amino acid mutations derived from human genome data. The methods investigate variant-function relationships in three aspects of genome studies: personal genomics, genomics of epilepsy disorders, and genomics of variable drug responses. For genetic variants found in the genomes of healthy individuals, an eight-level variant classification scheme is implemented to rank variants that are important to individualized health profiles.
For candidate genetic variants of epilepsy disorders, a novel three-dimensional structure-based assessment protocol for amino acid mutations is established to improve discrimination between neutral and causal variants at less conserved sites, and to facilitate variant prioritization for experimental validation. For genomic variants that may affect inter-individual variability in drug responses, an explicit structure-based predictor of structural disturbances is developed to efficiently evaluate unknown variants in pharmacogenes. Overall, the three integrative approaches provide an opportunity to examine the effects of genomic variants from multiple perspectives of genome studies. They also introduce an efficient way to catalog amino acid variants in large-scale genome data.