
    CONTROLLABLE CONTENT BASED IMAGE SYNTHESIS AND IMAGE RETRIEVAL

    View/Open
    SANGKLOY-DISSERTATION-2022.pdf (64.86Mb)
    Date
    2022-05-03
    Author
    Sangkloy, Patsorn
    Abstract
    In this thesis, we address the problem of returning target images that match user queries in image retrieval and image synthesis. We investigate line-drawing sketches as the main query, and explore several additional signals from users that can help clarify the type of images they are looking for. These additional queries may be expressed in one of two convenient forms: (1) visual content (sketch, scribble, texture patch); (2) language content. For image retrieval, we first look at the problem of sketch-based image retrieval. We construct cross-domain networks that embed a user query and a target image into a shared feature space. We collected the Sketchy Database, a large-scale dataset of matching sketch and image pairs that can be used as training data. The dataset has been made publicly available, and has become one of the few standard benchmarks for sketch-based image retrieval. To incorporate both sketch and language content as queries, we propose a late-fusion dual-encoder approach, similar to CLIP, a recent successful work on vision and language representation learning. We also collected a dataset of 5,000 hand-drawn sketches, which can be combined with existing COCO caption annotations to evaluate the task of image retrieval with sketch and language. For image synthesis, we present a general framework that allows users to interactively control the generated images by specifying visual features (e.g., shape, color, texture).
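    As a rough illustration of retrieval in a shared feature space with a late-fused sketch and language query, the sketch below ranks candidate images by cosine similarity. It is not the thesis implementation: the function names, the toy embedding values, and the choice of summing unit vectors as the fusion rule are all assumptions made for this example, since the abstract does not specify them. In practice the vectors would come from trained sketch, text, and image encoders.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def late_fusion_query(sketch_emb, text_emb):
    """Fuse two already-encoded queries by summing their unit vectors.

    Assumption: summation is one simple late-fusion choice; the abstract
    does not say which fusion rule the thesis uses.
    """
    s = l2_normalize(sketch_emb)
    t = l2_normalize(text_emb)
    return l2_normalize([a + b for a, b in zip(s, t)])


def rank_images(query_emb, image_embs):
    """Return image indices sorted by cosine similarity, best match first."""
    q = l2_normalize(query_emb)
    scores = []
    for v in image_embs:
        u = l2_normalize(v)
        scores.append(sum(a * b for a, b in zip(q, u)))
    return sorted(range(len(image_embs)), key=lambda i: scores[i], reverse=True)


# Toy embeddings standing in for encoder outputs (made-up numbers).
sketch_emb = [1.0, 0.0, 0.0]
text_emb = [0.0, 1.0, 0.0]
image_embs = [
    [1.0, 1.0, 0.0],  # agrees with both the sketch and the text
    [1.0, 0.0, 0.0],  # agrees with the sketch only
    [0.0, 0.0, 1.0],  # agrees with neither
]

ranking = rank_images(late_fusion_query(sketch_emb, text_emb), image_embs)
print(ranking)  # [0, 1, 2]
```

    Because each query is encoded separately and combined only at the embedding level, image embeddings can be computed once and reused for sketch-only, text-only, or fused queries.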
    URI
    http://hdl.handle.net/1853/66630
    Collections
    • College of Computing Theses and Dissertations [1156]
    • Georgia Tech Theses and Dissertations [23403]
    • School of Interactive Computing Theses and Dissertations [130]
