Contributions to statistical learning and its applications in personalized medicine
Valencia Arboleda, Carlos Felipe
Broadly, this dissertation is about finding stable solutions to statistical models with a very large number of parameters and analyzing their asymptotic statistical properties. In particular, it centers on the study of regularization methods based on penalized estimation. These procedures find an estimator as the solution of an optimization problem that balances fit to the data against the plausibility of the estimate. The first chapter studies a smoothness regularization estimator for an infinite-dimensional parameter in an exponential family model with functional predictors. We focus on the reproducing kernel Hilbert space (RKHS) approach and show that, despite the generality of the method, minimax optimal convergence rates are achieved. To carry out the asymptotic analysis of the estimator, we develop a simultaneous diagonalization tool for two positive definite operators: the kernel operator and the operator defined by the second Fréchet derivative of the expected data fit functional. Using the proposed simultaneous diagonalization tool, sharper bounds on the minimax rates are obtained. The second chapter studies the statistical properties of the method of regularization using radial basis functions in the context of linear inverse problems. The regularization here serves two purposes: it yields a stable solution to the inverse problem, and it prevents overfitting in the nonparametric estimation of the functional target. Two degrees of ill-posedness in the inversion of the operator A are considered: mildly and severely ill-posed. We also study different types of radial basis kernels, classified by the strength of the penalization norm: Gaussian, multiquadric, and spline-type kernels. The third chapter deals with the Individualized Treatment Rule (ITR) problem and analyzes its solution through discriminant analysis.
In the ITR problem, treatment is assigned based on the individual patient's prognostic covariates in order to maximize some reward function. Data generated from a randomized clinical trial are considered. Maximizing the empirical value function is an NP-hard computational problem. We consider estimating the decision rule directly by maximizing the expected value, using a surrogate function to make the optimization problem computationally feasible (convex programming). Necessary and sufficient conditions for Infinite Sample Consistency of the surrogate function are found for different scenarios: binary treatment selection, treatment selection with withholding, and multi-treatment selection.
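
To illustrate the binary treatment case, the following is a minimal sketch of an empirical value function and a convex surrogate for it. It assumes treatments coded as -1/+1, nonnegative rewards, a known constant randomization probability, and a hinge-type surrogate. These are common choices in the ITR literature and are assumptions here; they are not necessarily the surrogates analyzed in the dissertation. The names `empirical_value`, `hinge_surrogate_risk`, `score`, and `rule` are hypothetical.

```python
def empirical_value(rule, X, A, R, propensity=0.5):
    """Inverse-probability-weighted estimate of the value of a treatment rule.

    Only patients whose assigned treatment A agrees with the rule contribute
    their reward R. The indicator makes this objective discontinuous in the rule.
    """
    total = 0.0
    for x, a, r in zip(X, A, R):
        if rule(x) == a:
            total += r / propensity
    return total / len(X)


def hinge_surrogate_risk(score, X, A, R, propensity=0.5):
    """Reward-weighted hinge loss (assumed surrogate), convex in the score.

    Replaces the 0-1 disagreement indicator with max(0, 1 - a * score(x)),
    so minimizing it over a convex class of scores is a convex program.
    """
    total = 0.0
    for x, a, r in zip(X, A, R):
        total += (r / propensity) * max(0.0, 1.0 - a * score(x))
    return total / len(X)


# Toy randomized trial: one covariate, treatments in {-1, +1}, rewards >= 0.
X = [-1.0, -0.5, 0.5, 1.0]
A = [-1, 1, -1, 1]
R = [2.0, 0.5, 0.5, 2.0]

score = lambda x: 2.0 * x                   # hypothetical linear score
rule = lambda x: 1 if score(x) >= 0 else -1  # treatment rule: sign of the score

value = empirical_value(rule, X, A, R)
risk = hinge_surrogate_risk(score, X, A, R)
```

Maximizing `empirical_value` over rules is the combinatorial problem, while minimizing `hinge_surrogate_risk` over scores is the tractable substitute. Whether the minimizer of the surrogate recovers the optimal rule in the large-sample limit is the consistency question the abstract refers to.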