Information Theoretic Causality Measures For Parameter Estimation and System Identification
Constructing a model of a dynamic system from observed data is a complicated yet common problem for many engineered systems. This task, known as system identification, is a necessary step in many fields of engineering, as it underpins system modeling, simulation, and control design. Inaccurate system models lead to poor simulation results, which in turn lead to poor real-world performance. While linear systems have a well-developed set of identification techniques, methods for nonlinear systems are not as generalizable or robust. Parameter estimation is a subset of system identification in which a model structure is selected (either from first principles, yielding a grey-box model, or as a pre-prescribed structure for a black-box model) and the set of parameters corresponding to that model must be optimized. For linear systems, the parameter estimation problem has a closed-form least-squares solution; the parameters of nonlinear systems, however, must be solved for numerically, which is subject to the well-known issue of the solution converging to a local extremum. Maximum Likelihood Estimation (MLE) is commonly used to optimize the nonlinear system parameter set by minimizing the least-squares error between the observed data and the candidate optimized model. This optimization can converge to local extrema, especially in the presence of noise or exogenous disturbances, or when relatively few data points are available compared to the dimension of the optimization problem. Overfitting can occur when a more complex model than needed is considered: multiple distinct, high-accuracy model fits may be found over the available training data that generalize poorly to unseen data, because the optimized model no longer matches the generative dynamics.
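The closed-form linear case mentioned above can be illustrated with a minimal numpy sketch. The model, data, and parameter values here are hypothetical, chosen only to show the normal-equations solution; they are not taken from this work.

```python
import numpy as np

# Hypothetical linear-in-parameters model: y = a*x + b + noise,
# with assumed true parameters a = 2.0, b = 1.0.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + 0.05 * rng.standard_normal(x.size)

# Closed-form least-squares estimate via the normal equations:
#   theta = (X^T X)^{-1} X^T y
X = np.column_stack([x, np.ones_like(x)])
theta = np.linalg.solve(X.T @ X, X.T @ y)
# theta recovers values close to the assumed [2.0, 1.0]
```

For a model that is nonlinear in its parameters, no such closed form exists in general; an iterative solver must be started from an initial guess, and different guesses can terminate at different local extrema.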
Methods to determine a parameter set that is both accurate and predictive are critical to the creation of a high-fidelity model; these techniques are frequently referred to as covariate selection or feature selection techniques. The recently proposed Causation Entropy Matrix (CEM) allows causal information flow within a system to be identified. This is of immediate use to many system identification tasks in which the exact or complete structure of the system is unknown and covariate selection is needed. The CEM provides a pre-optimization, data-based method of covariate selection that reduces the number of parameters included in the system optimization and thereby improves MLE results. This work provides background on the Causation Entropy Matrix and its computation before presenting multiple examples of its application to grey-box and black-box modeling problems. The effectiveness of the CEM is then compared to the state-of-the-art techniques of LASSO (least absolute shrinkage and selection operator) and elastic net. Next, a chapter is dedicated to the practical considerations involved in applying the CEM to real-world systems, including but not limited to noise, unmodeled dynamics, and sampling rate. This work concludes with a study applying the CEM to data experimentally collected from a physical, nonlinear system. The ability of the CEM to accurately identify the underlying structure of the generative dynamics demonstrates that the method is a promising technique for nonlinear system identification and covariate selection.
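The LASSO baseline named above selects covariates by driving irrelevant coefficients exactly to zero. The following is a minimal numpy sketch of LASSO via coordinate descent with soft-thresholding; the function name `lasso_cd`, the penalty value `lam`, and the synthetic data are all illustrative assumptions, not part of this work.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent LASSO: min_w 0.5*||y - X w||^2 + lam*||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual excluding covariate j.
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            # Soft-thresholding sets weakly correlated covariates to exactly 0.
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

# Hypothetical sparse system: only the 1st and 3rd covariates are active.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))
true_w = np.array([3.0, 0.0, -2.0, 0.0, 0.0])
y = X @ true_w + 0.1 * rng.standard_normal(100)

w = lasso_cd(X, y, lam=5.0)
# w is sparse: the inactive covariates are selected out (set to zero).
```

The CEM pursues the same goal of pruning the candidate covariate set before optimization, but does so through an information-theoretic measure of causal influence rather than an L1 penalty.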