Low-shot learning for object recognition, detection, and segmentation
Deep neural networks are powerful at solving classification problems in computer vision. However, learning classifiers with these models requires a large amount of labeled training data, and recent approaches have struggled to adapt to new classes in a data-efficient manner. The human brain, by contrast, can draw on prior knowledge to learn new concepts from fewer examples and with less supervision. Many meta-learning algorithms have been proposed to close this gap, but they come with practical and theoretical limitations. We review the well-known bi-level optimization as a general framework for few-shot learning and hyperparameter optimization, and discuss the practical limitations of computing the full gradient. We provide theoretical guarantees for the convergence of bi-level optimization when it uses approximate gradients computed by truncated back-propagation. Next, we propose an empirical method for few-shot semantic segmentation: instead of solving the inner optimization, we directly estimate its result with a general function approximator. Finally, we discuss extensions of this work, with a focus on weakly-supervised object detection when full supervision is not available for the few training examples.
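For readers unfamiliar with the framework, the bi-level problem and the truncated gradient mentioned above can be sketched as follows. This is a generic textbook-style formulation and not necessarily the notation used in the thesis: the symbols $\lambda$ (outer variable, e.g. hyperparameters or meta-parameters), $w$ (inner variable, e.g. task-specific weights), $f$ (outer objective), $g$ (inner objective), step size $\eta$, number of inner steps $T$, and truncation length $K$ are all assumptions introduced here for illustration. The inner problem is assumed to be solved approximately by gradient descent from an initialization $w_0$ that does not depend on $\lambda$, and Jacobians are written in the transposed (gradient) convention.

```latex
% Bi-level problem: the outer objective is evaluated at the inner minimizer.
\min_{\lambda} \; F(\lambda) = f\bigl(w^{*}(\lambda), \lambda\bigr)
\quad \text{s.t.} \quad
w^{*}(\lambda) \in \arg\min_{w} \; g(w, \lambda)

% Inner problem approximated by T gradient-descent steps:
w_{t+1} = w_t - \eta \, \nabla_{w} g(w_t, \lambda), \qquad t = 0, \dots, T-1

% Per-step Jacobians of the update rule:
A_{t+1} = I - \eta \, \nabla^{2}_{ww} g(w_t, \lambda), \qquad
B_{t+1} = -\eta \, \nabla^{2}_{\lambda w} g(w_t, \lambda)

% Full gradient by back-propagating through all T steps:
\frac{dF}{d\lambda}
  = \nabla_{\lambda} f(w_T, \lambda)
  + \sum_{t=1}^{T} B_t \, A_{t+1} \cdots A_T \, \nabla_{w} f(w_T, \lambda)

% K-step truncated approximation: keep only the last K terms of the sum.
h_{T-K}
  = \nabla_{\lambda} f(w_T, \lambda)
  + \sum_{t=T-K+1}^{T} B_t \, A_{t+1} \cdots A_T \, \nabla_{w} f(w_T, \lambda)
```

Under these assumptions, the full gradient requires storing all $T$ intermediate iterates, while the truncated version needs only the last $K$, which is the practical limitation and the approximation the abstract refers to. For $t = T$ the product $A_{t+1} \cdots A_T$ is empty and is read as the identity.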