
Disputation Sahar Iravani

23.11.2022 | 12:00
Dissertation topic:
Interpretable Deep Learning Approaches for Biomarker Detection from High-Dimensional Biomedical Data
Disputation topic:
Self-Supervised Learning for Visual Representation
Abstract: Supervised learning (SL) techniques have been applied successfully to train specialized models in scenarios where large amounts of good-quality labeled data are available. At the same time, SL is considered a bottleneck for building more intelligent models capable of acquiring new skills: labeled data are not available for all tasks, and they can be costly to obtain in some fields. Moreover, the representations learned by SL approaches are limited to features specialized for solving a particular supervised task. In contrast, self-supervised learning (SSL) is a representation learning paradigm that learns rich representations without the need for labeled data. By adopting pseudo-labels instead of labeled data, SSL enables artificial intelligence systems to learn from orders of magnitude more data, which is the key to obtaining rich representations. The learned representations can then be used for several downstream tasks. SSL has had great success in advancing the field of natural language processing. Recently, with the advent of new techniques, it has been demonstrated that SSL can excel at computer vision tasks in complex, real-world settings, yielding high accuracy on a diverse set of vision tasks [1-2]. In this talk, I will give an overview of SSL approaches for learning visual representations. In particular, I will present details of a recent, highly cited approach, “Emerging Properties in Self-Supervised Vision Transformers” [3], which combines knowledge distillation with vision transformers while using the least supervision compared with prior work. This method outperforms many other state-of-the-art methods, including supervised methods, in transfer learning. Additionally, it provides interpretable features, suggesting that the representation learned by this model supports a higher level of understanding of vision tasks.

[1] Chen, T., Kornblith, S., Norouzi, M. and Hinton, G., 2020. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning (pp. 1597-1607).
[2] Grill, J.B., Strub, F., Altché, F., Tallec, C., Richemond, P., Buchatskaya, E., Doersch, C., Avila Pires, B., Guo, Z., Gheshlaghi Azar, M. and Piot, B., 2020. Bootstrap your own latent: a new approach to self-supervised learning. Advances in Neural Information Processing Systems, 33 (pp. 21271-21284).
[3] Caron, M., Touvron, H., Misra, I., Jégou, H., Mairal, J., Bojanowski, P. and Joulin, A., 2021. Emerging properties in self-supervised vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 9650-9660).

Time & Location

23.11.2022 | 12:00

(Zuse Institut Berlin, Takustr. 7, 14195 Berlin)