Development and validation of 3-D cloud fields using data fusion and machine learning techniques
The impact of climate change is projected to increase significantly over the next decades. Consequently, gaining a better understanding of climate change and being able to accurately predict its effects are of the utmost importance. Climate change predictions are currently achieved using Global Climate Models (GCMs), which are complex representations of the major climate components and their interactions. However, these predictions present high levels of uncertainty, as illustrated by the very disparate results GCMs generate. According to the Intergovernmental Panel on Climate Change (IPCC), there is high confidence that such high levels of uncertainty are due to the way clouds are represented in climate models. Indeed, several cloud phenomena, such as cloud-radiative forcing, are not well modeled in GCMs because they rely on microscopic processes that, due to computational limitations, cannot be represented in GCMs. Such phenomena are instead represented through physically motivated parameterizations, which lead to uncertainties in cloud representations. For these reasons, improving the parameterizations required for representing clouds in GCMs is a current focus of climate modeling research efforts. Integrating cloud satellite data into GCMs has proven essential to the development and assessment of cloud radiative transfer parameterizations. Cloud-related data is captured by a variety of satellites, such as those in NASA’s afternoon constellation (also named the A-train), which collect vertical and horizontal data on the same orbital track. Data from the A-train has been useful to many studies on cloud prediction, but its coverage is limited because the sensors that collect vertical data have very narrow swaths, with a width as small as one kilometer. As a result, the area where vertical data exists is very limited, equivalent to a 1-kilometer-wide track.
Thus, in order for satellite cloud data to be compared to global representations of clouds in GCMs, additional vertical cloud data has to be generated to provide more global coverage. Consequently, the overall objective of this thesis is to support the validation of GCM cloud representations through the generation of 3D cloud fields using cloud vertical data from space-borne sensors. This has already been attempted by several studies through the implementation of physics-based and similarity-based approaches. However, such studies have a number of limitations, such as the inability to handle large amounts of data and high resolutions, or the inability to account for diverse vertical profiles. Such limitations motivate the need for novel approaches to the generation of 3D cloud fields. For this purpose, efforts have been initiated at ASDL to develop an approach that leverages data fusion and machine learning techniques to generate 3D cloud field domains. Several successive ASDL-led efforts have helped shape this approach and overcome some of its challenges. In particular, these efforts have led to the development of a predictive cloud classification model that is based on decision trees and integrates atmospheric data to predict vertical cloud fraction. This model was evaluated against “on-track” cloud vertical data and was found to have acceptable performance. However, several limitations were identified in this model and the approach that led to it. First, its performance was lower when predicting lower-altitude clouds, and its overall performance could still be greatly improved. Second, the model had only been assessed at “on-track” locations, while the construction of data at “off-track” locations is necessary for generating 3D cloud fields. Last, the model had not been validated in the context of GCM cloud representation, and no satisfactory level of model accuracy had been determined in this context.
This work aims at overcoming these limitations by taking the following approach. The model obtained from previous efforts is improved by integrating additional, higher-accuracy data, by investigating the correlation among atmospheric predictors, and by implementing additional classification machine learning techniques, such as Random Forests. Then, the predictive model is applied at “off-track” locations, using predictors from NASA’s LAADS datasets. Horizontal validation of the computed profiles is performed against an existing dataset containing the Cloud Mask at the same locations. This leads to the generation of a coherent global dataset of 3D cloud fields. Last, a methodology for validating this computed dataset in the context of GCM cloud-radiative forcing representation is developed. The Fu-Liou code is run on sample vertical profiles from the computed dataset, and the output radiative fluxes are analyzed. This research significantly improves the model developed in previous efforts and validates the computed global dataset against existing data. Such validation demonstrates the potential of a machine learning-based approach to generate 3D cloud fields. Additionally, this research provides a benchmarked methodology to further validate this machine learning-based approach in the context of GCM cloud representation. Altogether, this thesis contributes to NASA’s ongoing efforts towards improving GCMs and climate change predictions as a whole.