Finding causality in sociotechnical systems: a Bayesian network structure learning approach to problem understanding
Martin, William Davies
Understanding the causal relations governing sociotechnical systems allows designers to better predict system dynamics, identify the root causes of issues, and subsequently design more sustainable systems. Visual representations can help develop this understanding and may be generated by a variety of approaches. This research explores the use of Bayesian Network (BN) structure learning algorithms to generate Directed Acyclic Graphs (DAGs) that represent system relationships. BNs are a particularly promising approach to visual representation because they convey information about dependencies and independencies in a system. BNs are often generated manually by experts, but many data-driven BN learning algorithms have been developed that may aid non-experts in making decisions. These data-driven approaches are beneficial when experts are not available or when designers want to avoid biases that experts might have. Nevertheless, most BN learning algorithms are designed for data that satisfy the Causal Sufficiency Assumption, the Markov Assumption, and the Faithfulness Assumption, assumptions that real-world sociotechnical data usually do not fully satisfy. This research aims to evaluate alternative methods for automating the discovery of relationships between variables in observational datasets. The evaluation considers the assumptions that BN learning methods make about the observational data. This is accomplished by combining the methodologies of several independent researchers to characterize which assumptions are valid in different datasets. Several kinds of BN structure learning algorithms and post-processing techniques are then used to learn causal networks from the datasets. The learned networks are then compared on how closely they match an expert network and on their ability to perform predictive inference.
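As an illustration of what matching a learned network against an expert network can involve, the following minimal Python sketch counts correct, reversed, extra, and missing edges and sums the last three into a structural Hamming distance. The abstract does not state which structural metric the research used, so the metric, the function name `compare_structures`, and the toy variable names are all assumptions made for illustration only.

```python
def compare_structures(learned, expert):
    """Compare two DAGs given as collections of directed edges (parent, child).

    Hypothetical helper: the metric shown here (a simple structural Hamming
    distance over directed edges) is an assumption, not the thesis's method.
    """
    learned, expert = set(learned), set(expert)
    # Edges found in both graphs with the same orientation.
    correct = learned & expert
    # Edges found in both graphs but with opposite orientation.
    reversed_edges = {(a, b) for (a, b) in learned - expert if (b, a) in expert}
    # Learned edges with no counterpart (in either direction) in the expert graph.
    extra = learned - expert - reversed_edges
    # Expert edges with no counterpart (in either direction) in the learned graph.
    missing = {(a, b) for (a, b) in expert - learned if (b, a) not in learned}
    # Each extra, missing, or reversed edge counts as one structural difference.
    shd = len(extra) + len(missing) + len(reversed_edges)
    return {
        "correct": len(correct),
        "reversed": len(reversed_edges),
        "extra": len(extra),
        "missing": len(missing),
        "shd": shd,
    }


# Toy example with made-up variables (not taken from the thesis datasets).
expert_edges = {("Training", "Skill"), ("Skill", "Errors"), ("Workload", "Errors")}
learned_edges = {("Training", "Skill"), ("Errors", "Skill"), ("Workload", "Training")}
result = compare_structures(learned_edges, expert_edges)
print(result)
```

A lower distance indicates a closer structural match. Definitions of this distance vary in the literature (some compare equivalence classes rather than fully directed graphs), so the counts above should be read as one possible convention.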
Comparing datasets with different valid assumptions showed that the learning algorithms' ability to perform predictive inference and to recover the expert-specified structure decreased as the number of valid assumptions decreased. When all of the assumptions were valid, in a simulated dataset, all the algorithms performed similarly and the post-processing approach did not improve results. For observational datasets that only partially satisfied the three assumptions, the results indicated that algorithms partially based on explicit tests of independence, combined with an Average BN post-processing approach, gave better-performing networks. Finally, when none of the assumptions were proven valid, the algorithms and post-processing techniques all returned equally poorly performing networks. Future work would assess more datasets and integrate human guidance with the algorithms to better define guidelines for applying BN structure learning to the wide variety of datasets encountered in sociotechnical systems.
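The abstract does not describe how the Average BN post-processing step is computed. One common way to average networks, shown here purely as an assumed illustration, is to learn several networks (for example from resampled data) and keep only the edges that appear in more than a chosen fraction of them. The function name `average_network`, the 0.5 threshold, and the toy networks are hypothetical.

```python
from collections import Counter


def average_network(edge_sets, threshold=0.5):
    """Keep directed edges that appear in more than `threshold` of the networks.

    Hypothetical sketch of edge-frequency averaging; the thesis's actual
    Average BN procedure is not specified in the abstract.
    """
    # Count in how many networks each directed edge appears.
    counts = Counter(edge for edges in edge_sets for edge in set(edges))
    n = len(edge_sets)
    return {edge for edge, count in counts.items() if count / n > threshold}


# Three toy networks over variables A, B, C (made-up data).
networks = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("C", "B")},
    {("A", "B"), ("B", "C"), ("A", "C")},
]
averaged = average_network(networks)
print(sorted(averaged))
```

With a threshold of at least 0.5 and DAG inputs, an edge and its reverse cannot both be retained, but the retained edges can still form a cycle, so a real implementation would need an additional acyclicity check.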