Now showing items 1-4 of 4
Visually grounded language understanding and generation
(Georgia Institute of Technology, 2020-01-13)
The world around us involves multiple modalities -- we see objects, feel texture, hear sounds, smell odors and so on. In order for Artificial Intelligence (AI) to make progress in understanding the world around us, it needs ...
Encoding 3D contextual information for dynamic scene understanding
(Georgia Institute of Technology, 2020-04-27)
This thesis aims to demonstrate how using 3D cues improves semantic labeling and object classification. Specifically, we will consider depth, surface normals, object classification, and pixel-wise semantic labeling in this ...
Explaining model decisions and fixing them via human feedback
(Georgia Institute of Technology, 2020-05-07)
Deep networks have enabled unprecedented breakthroughs in a variety of computer vision tasks. While these models enable superior performance, their increasing complexity and lack of decomposability into individually intuitive ...
Domain adaptation via data augmentation
(Georgia Institute of Technology, 2020-04-28)
Deep learning (DL) models require large labeled datasets for training. Practitioners often need to adapt an existing DL model to a different domain. For instance, a practitioner in a company developing autonomous vehicles ...