An image-based approach for 3D reconstruction of urban scenes using architectural symmetries
(Georgia Institute of Technology, 2018-07-23)
In this dissertation, I focus on an important, generalizable and freely available sub-category of semantic information in addressing modern reconstruction challenges: the notion of symmetry. The emphasis in the 3D modeling ...
A computational framework for unsupervised analysis of everyday human activities
(Georgia Institute of Technology, 2008-07-07)
In order to make computers proactive and assistive, we must enable them to perceive, learn, and predict what is happening in their surroundings. This presents us with the challenge of formalizing computational models of ...
Computational methods for creative inspiration in thematic typography and dance
(Georgia Institute of Technology, 2020-08-04)
As progress in technology continues, there is a need to adapt and upscale tools used in artistic and creative processes. This can either take the form of generative tools which can provide inspiration to artists, human-AI ...
Learning embodied models of actions from first person video
(Georgia Institute of Technology, 2017-08-28)
Advances in sensor miniaturization, low-power computing, and battery life have enabled the first generation of mainstream wearable cameras. Millions of hours of videos are captured by these devices every year, creating a ...
Building agents that can see, talk, and act
(Georgia Institute of Technology, 2020-04-25)
A long-term goal in AI is to build general-purpose intelligent agents that simultaneously possess the ability to perceive the rich visual environment around us (through vision, audition, or other sensors), reason and infer ...
Visually grounded language understanding and generation
(Georgia Institute of Technology, 2020-01-13)
The world around us involves multiple modalities -- we see objects, feel texture, hear sounds, smell odors and so on. In order for Artificial Intelligence (AI) to make progress in understanding the world around us, it needs ...
Urban 3D scene understanding from images
(Georgia Institute of Technology, 2018-01-22)
Human vision is marvelous in obtaining a structured representation of complex dynamic scenes, such as spatial scene-layout, re-organization of the scene into its constituent objects, support of each object, etc. We also ...
EvalAI: Evaluating AI systems at scale
(Georgia Institute of Technology, 2018-12-06)
Artificial Intelligence research has progressed tremendously in the last few years. Several new multi-modal datasets and tasks have been introduced, making it much harder to compare new ...
Interpretation, grounding and imagination for machine intelligence
(Georgia Institute of Technology, 2018-11-08)
Understanding how to model computer vision and natural language jointly is a long-standing challenge in artificial intelligence. In this thesis, I study how modeling vision and language using semantic and pragmatic ...
Encoding 3D contextual information for dynamic scene understanding
(Georgia Institute of Technology, 2020-04-27)
This thesis aims to demonstrate how using 3D cues improves semantic labeling and object classification. Specifically, we will consider depth, surface normals, object classification, and pixel-wise semantic labeling in this ...