TY - GEN
T1 - Transformation-Equivariant Representation Learning with Barber-Agakov and InfoNCE Mutual Information Estimation
AU - Sinaga, Marshal Arijona
AU - Basaruddin, T.
AU - Krisnadhi, Adila Alfa
N1 - Publisher Copyright:
© 2022 by SCITEPRESS – Science and Technology Publications, Lda.
PY - 2022
Y1 - 2022
N2 - The success of deep learning on computer vision tasks is largely due to the convolution layer being equivariant to translation. Several works attempt to extend this notion of equivariance to more general transformations. Autoencoding variational transformation (AVT) achieves state-of-the-art results by approaching the problem from an information-theoretic perspective. The model involves computing mutual information, which leads to a more general transformation-equivariant representation model. In this research, we investigate an alternative to AVT called variational transformation-equivariant (VTE). We utilize the Barber-Agakov and information noise-contrastive estimation (InfoNCE) mutual information estimators to optimize VTE. Furthermore, we propose a sequential mechanism that involves a self-supervised learning model, called predictive-transformation, to train our VTE. Experimental results demonstrate that VTE outperforms AVT on image classification tasks.
AB - The success of deep learning on computer vision tasks is largely due to the convolution layer being equivariant to translation. Several works attempt to extend this notion of equivariance to more general transformations. Autoencoding variational transformation (AVT) achieves state-of-the-art results by approaching the problem from an information-theoretic perspective. The model involves computing mutual information, which leads to a more general transformation-equivariant representation model. In this research, we investigate an alternative to AVT called variational transformation-equivariant (VTE). We utilize the Barber-Agakov and information noise-contrastive estimation (InfoNCE) mutual information estimators to optimize VTE. Furthermore, we propose a sequential mechanism that involves a self-supervised learning model, called predictive-transformation, to train our VTE. Experimental results demonstrate that VTE outperforms AVT on image classification tasks.
KW - Barber-Agakov
KW - InfoNCE
KW - Mutual Information Estimation
KW - Representation Learning
KW - Transformation-Equivariant
UR - http://www.scopus.com/inward/record.url?scp=85174619381&partnerID=8YFLogxK
U2 - 10.5220/0010880400003122
DO - 10.5220/0010880400003122
M3 - Conference contribution
AN - SCOPUS:85174619381
SN - 978-989-758-549-4
T3 - International Conference on Pattern Recognition Applications and Methods
SP - 99
EP - 109
BT - ICPRAM 2022 - Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods, Volume 1
A2 - De Marsico, Maria
A2 - Sanniti di Baja, Gabriella
A2 - Fred, Ana L.N.
PB - SCITEPRESS – Science and Technology Publications, Lda
T2 - 11th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2022
Y2 - 3 February 2022 through 5 February 2022
ER -