Reasoning about programs in statistically modeled first-order environments
MetadataShow full item record
The objects of study in this dissertation are programs and algorithms that reason about programs using their syntactic structure. Such algorithms, referred to as program verification algorithms in the literature, are designed to find proofs of propositions about program behavior. This dissertation adopts the perspective that programs operate in environments that can be modeled statistically. In other words, program inputs are samples drawn from a generative statistical model. This statistical perspective has two main advantages. First, it allows us to reason about programs that are not expected to exhibit the desired behavior on all program inputs, such as neural networks that are learnt from data, by formulating and proving probabilistic propositions about program behavior. Second, it enables us to simplify the search for proofs of non-probabilistic propositions about program behavior by designing program verification algorithms that are capable of inferring “likely” hypotheses about the program environment. The first contribution of this dissertation is a pair of program verification algorithms for finding proofs of probabilistic robustness of neural networks. A trained neural network f is probabilistically robust if, for a pair of inputs that is randomly generated as per the environment statistical model, f is likely to demonstrate k-Lipschitzness, i.e., the distance between the outputs computed by f is upper-bounded by the kth multiple of the distance between the pair of inputs. A proof of probabilistic robustness guarantees that the neural network is unlikely to exhibit divergent behaviors on similar inputs. The second contribution of this dissertation is a generic algorithmic framework, referred to as observational abstract interpreters, for designing algorithms that compute hypothetical semantic program invariants. Semantic invariants are logical predicates about program behavior and are used in program proofs as lemmas. The well-studied algorithmic framework of abstract interpretation provides a standard recipe for constructing algorithms that compute semantic program invariants. Observational abstract interpreters extend this framework to allow for computing hypothetical invariants that are valid only under specific hypotheses about program environments. These hypotheses are inferred from observations of program behavior and are embedded as dynamic/run-time checks in the program to ensure the validity of program proofs that use hypothetical invariants.