dc.contributor.advisor: AlRegib, Ghassan
dc.contributor.author: Chen, Min-Hung
dc.date.accessioned: 2020-09-08T12:43:38Z
dc.date.available: 2020-09-08T12:43:38Z
dc.date.created: 2020-08
dc.date.issued: 2020-05-21
dc.date.submitted: August 2020
dc.identifier.uri: http://hdl.handle.net/1853/63572
dc.description.abstract: Video has become one of the dominant media in our society, driving considerable interest in video analysis techniques for a wide range of applications. Temporal dynamics, which characterize how information changes over time, are the key component of videos. However, it remains unclear how temporal dynamics benefit video tasks, especially in the cross-domain setting, which is closest to real-world scenarios. The objective of this thesis is therefore to effectively exploit temporal dynamics in videos to tackle distributional discrepancy problems in video understanding. To achieve this objective, I first identified the benefits of exploiting temporal dynamics for videos by proposing Temporal Segment LSTM (TS-LSTM) and the Inception-style Temporal-ConvNet (Temporal-Inception) for general video understanding, and by demonstrating that temporal dynamics can help reduce temporal variations in cross-domain video understanding. Since most previous work evaluates performance only on small-scale datasets with little domain discrepancy, I collected two large-scale datasets for video domain adaptation, UCF-HMDB_full and Kinetics-Gameplay, to facilitate cross-domain video research, and proposed the Temporal Attentive Adversarial Adaptation Network (TA3N) to simultaneously attend to, align, and learn temporal dynamics across domains. Finally, to utilize temporal dynamics from unlabeled videos for action segmentation, I proposed Self-Supervised Temporal Domain Adaptation (SSTDA) to jointly align cross-domain feature spaces embedded with local and global temporal dynamics through two self-supervised auxiliary tasks, binary and sequential domain prediction, and demonstrated the usefulness of adapting to unlabeled videos across variations.
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.publisher: Georgia Institute of Technology
dc.subject: Domain adaptation
dc.subject: Action recognition
dc.subject: Action segmentation
dc.subject: Self-supervised learning
dc.subject: Video understanding
dc.subject: Transfer learning
dc.subject: Unsupervised learning
dc.subject: Temporal dynamics
dc.subject: Domain discrepancy
dc.subject: Temporal variations
dc.subject: Multi-scale
dc.title: Bridging distributional discrepancy with temporal dynamics for video understanding
dc.type: Dissertation
dc.description.degree: Ph.D.
dc.contributor.department: Electrical and Computer Engineering
thesis.degree.level: Doctoral
dc.contributor.committeeMember: Kira, Zsolt
dc.contributor.committeeMember: Vela, Patricio
dc.contributor.committeeMember: Tsai, Yi-Chang
dc.contributor.committeeMember: Dyer, Eva
dc.date.updated: 2020-09-08T12:43:38Z

