
    Detection and incremental object learning in videos

    File
    HUMAYUN-DISSERTATION-2018.pdf (58.97 MB)
    Date
    2018-12-14
    Author
    Humayun, Ahmad
    Abstract
    Unlike state-of-the-art batch machine learning methods, children have a remarkable facility for learning visual representations of objects through a combination of self-directed visual exploration and access to a sparse supervisory signal in the form of spoken object names. Studies of infant development have shown that children are able to locate, track, and differentiate novel object instances from a continuous sequence of visual inputs without requiring dense object labels. This thesis develops methods for on-line visual learning in video which are inspired by infant object learning and are enabled by recent advances in deep learning architectures. We introduce two methods and a dataset to support this thesis. First, we demonstrate a convolutional neural network for detecting and tracking objects in continuous video. These detections are generated by harnessing the temporal continuity of the visual world, and can be used as space-time trajectories for objects in the scene. This method is capable of generating space-time proposals from streaming video, which presents a starting point for on-line weakly-supervised learning. We show that a network can be trained to detect objects more reliably when given a sequence of frames, while being 2.5 times faster when compared to traditional single frame detectors. The second part of this thesis studies the incremental learning paradigm in a setting similar to an infant's play environment. To mimic an environment where children pick up, examine, and put down different objects, we develop a novel data generation pipeline which can produce an arbitrary number of learning exposures composed of videos of rotating objects. Enabled by this data generator, we introduce a novel object learning problem, known as self-directed incremental learning, where an agent needs to decide whether a learning exposure corresponds to a previously-seen object or a new object. 
    We present a simple solution to this problem, which can handle 100 unique objects shown repeatedly to the learner. From our extensive experiments we conclude that the effect of catastrophic forgetting, the main obstacle in adapting batch learning algorithms to an incremental learning setting, is diminished when learners are repeatedly exposed to different views of the same object.
    URI
    http://hdl.handle.net/1853/61614
    Collections
    • College of Computing Theses and Dissertations [1071]
    • Georgia Tech Theses and Dissertations [22401]
