Explaining model decisions and fixing them via human feedback
Ramasamy Selvaraju, Ramprasaath
Deep networks have enabled unprecedented breakthroughs in a variety of computer vision tasks. While these models enable superior performance, their increasing complexity and lack of decomposability into individually intuitive components make them hard to interpret. Consequently, when today’s intelligent systems fail, they fail spectacularly and disgracefully, giving no warning or explanation. Towards the goal of making deep networks interpretable, trustworthy and unbiased, in this dissertation I present my work on building algorithms that provide explanations for decisions emanating from deep networks in order to: (1) understand and interpret why the model did what it did, (2) enable knowledge transfer between humans and AI, (3) correct unwanted biases learned by AI models, and (4) encourage human-like reasoning in AI.