TY - JOUR
T1 - Color illusions also deceive CNNs for low-level vision tasks: Analysis and implications
AU - Gomez-Villa, A.
AU - Martín, A.
AU - Vazquez-Corral, J.
AU - Bertalmío, M.
AU - Malo, J.
N1 - Publisher Copyright: © 2020 Elsevier Ltd
PY - 2020/11
Y1 - 2020/11
N2 - The study of visual illusions has proven to be a very useful approach in vision science. In this work we start by showing that, while convolutional neural networks (CNNs) trained for low-level visual tasks in natural images may be deceived by brightness and color illusions, some network illusions can be inconsistent with the perception of humans. Next, we analyze where these similarities and differences may come from. On one hand, the proposed linear eigenanalysis explains the overall similarities: in simple CNNs trained for tasks like denoising or deblurring, the linear version of the network has center-surround receptive fields, and global transfer functions are very similar to the human achromatic and chromatic contrast sensitivity functions in human-like opponent color spaces. These similarities are consistent with the long-standing hypothesis that considers low-level visual illusions as a by-product of the optimization to natural environments. Specifically, here human-like features emerge from error minimization. On the other hand, the observed differences must be due to the behavior of the human visual system not explained by the linear approximation. However, our study also shows that more ‘flexible’ network architectures, with more layers and a higher degree of nonlinearity, may actually have a worse capability of reproducing visual illusions. This implies, in line with other works in the vision science literature, a word of caution on using CNNs to study human vision: on top of the intrinsic limitations of the L + NL formulation of artificial networks to model vision, the nonlinear behavior of flexible architectures may easily be markedly different from that of the visual system.
KW - Artificial neural networks
KW - Efficient representation
KW - Natural image statistics
KW - Visual illusions
UR - http://www.scopus.com/inward/record.url?scp=85090128391&partnerID=8YFLogxK
U2 - 10.1016/j.visres.2020.07.010
DO - 10.1016/j.visres.2020.07.010
M3 - Article
C2 - 32896717
AN - SCOPUS:85090128391
SN - 0042-6989
VL - 176
SP - 156
EP - 174
JO - Vision Research
JF - Vision Research
ER -