Building agents that can see, talk, and act
(Georgia Institute of Technology, 2020-04-25)
A long-term goal in AI is to build general-purpose intelligent agents that simultaneously possess the ability to perceive the rich visual environment around us (through vision, audition, or other sensors), reason and infer ...
Visual question answering and beyond
(Georgia Institute of Technology, 2019-09-03)
In this dissertation, I propose and study a multi-modal Artificial Intelligence (AI) task called Visual Question Answering (VQA) -- given an image and a natural language question about the image (e.g., "What kind of store ...