Domain adaptation via data augmentation
Abstract
Deep learning (DL) models require large labeled datasets for training. Practitioners often need to adapt an existing DL model to a different domain. For instance, a practitioner at a company developing autonomous vehicles may need to adapt an object detection model trained on images collected during the day to handle images captured at night. Given the challenges of curating a labeled dataset for the target domain, we seek to automatically transform an existing dataset from one domain to another. This thesis investigates how data augmentation can help address the domain adaptation problem. Inspired by prior work on domain adaptation, we investigate AMOEBA, a domain adaptation system that uses a generative adversarial network to convert images from one domain to another. By automating this conversion, AMOEBA allows a practitioner to train DL models specialized for their target domain. We propose two techniques for improving the efficacy of AMOEBA: (1) data engineering and (2) post-processing. Our evaluation of AMOEBA on two domain adaptation problems shows that it is effective in practice. We conclude with a discussion of the limitations of AMOEBA and potential ways to further improve its efficacy.