Nature human behaviour | 2021

Beauty is in the eye of the machine.

 
 

Abstract


Artificial intelligence (AI) has made rapid strides in a wide range of visual tasks, including recognition of objects and faces, automatic diagnosis of clinical images, and answering questions about images. More recently, AI has also started penetrating the arts. For example, in October 2018, the first piece of AI-generated art came to auction, with an initial estimate of US$ 10,000, and strikingly garnered a final bid of US$ 432,500 (Fig. 1). The portrait depicts a portly gentleman with a seemingly fuzzy facial expression, dressed in a black frockcoat with a white collar.
Appreciating and creating a piece of art requires a general understanding of aesthetics. What are the nuances, structures, and semantics embedded in a painting that give us a sense of aesthetic pleasure? Appraisal of aesthetic value by humans is fickle and subjective. However, empirical investigations have long shown that there exists some level of universality in art appreciation across cultures and history¹. Such regularities raise the question of whether there are any rules governing our preferences for visual arts. Previous studies have hinted at the use of feature integration frameworks to predict aesthetic value preferences². However, previous work has focused on the effect of either one specific visual feature or a pool of complex features, without a clear delineation of how these features are weighted in generating value judgements. The high visual complexity of each piece of art and the large heterogeneity among different pieces of art make the selection of relevant features for value judgements challenging.
Now, a new study by Iigaya et al. in Nature Human Behaviour³ represents a critical step forward in elucidating the features underlying aesthetic value preference. Iigaya and colleagues asked both in-lab and online participants to report how much they liked a piece of artwork on a four-point scale.
In a first attempt, the authors used a set of 13 hand-crafted visual characteristics, such as hue, contrast, and presence of people, to successfully predict aesthetic values. Next, to circumvent the need for human intervention in the selection of features, Iigaya et al. used a deep convolutional neural network (DCNN)⁴. DCNNs are at the heart of the recent revolution in AI. They consist of neuron-like units organized into multiple layers that sequentially process an image to extract an increasingly rich and robust set of visual features⁵. Each unit receives inputs from a myriad of other units, and the connection strengths are governed by weights that can be learned from examples via training. Here the authors utilized a network that had been pre-trained on an object-recognition task involving about 1,000,000 labelled images from a dataset known as ImageNet⁶. The resulting DCNN model constitutes an initial approximation to the cascade of computations that take place along the ventral visual cortex. The authors then fine-tuned the DCNN model by adjusting the weights of only the last fully connected layers, using only a subset of the images rated by human participants. This approach allowed the model to learn what is aesthetically pleasing without imposing any human wisdom or biases about specific visual features. Thus, the DCNN model learned from examples to automatically distinguish images with higher versus lower value. The experimental results by Iigaya et al. showed, surprisingly, that the DCNN model not only succeeded in predicting subjective values, but was also able to implicitly capture the 13 hand-crafted features chosen in the preliminary analyses. As observed in other studies, higher layers in the network, which represent more complex visual features, yielded better predictions of subjective value. In summary, this DCNN model represents a breakthrough in our understanding of how humans might make aesthetic value judgements.
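The idea of fine-tuning only the last layer of a pre-trained network can be illustrated with a toy sketch. It is not the authors' code and uses no real DCNN: the "pre-trained" layers are a small fixed random map with a ReLU, the images are random four-number vectors, and the ratings are synthetic values coded 0 to 3 (the coding, the sizes, and all names such as `FROZEN` and `fine_tune` are assumptions made for illustration). Only the final linear readout is trained, by plain gradient descent on squared error.

```python
import random

random.seed(0)  # fixed seed so the toy run is reproducible

N_PIXELS, N_FEATURES, N_IMAGES = 4, 3, 40

# Hypothetical "pre-trained" weights: frozen, never updated below.
FROZEN = [[random.uniform(-0.5, 0.5) for _ in range(N_PIXELS)]
          for _ in range(N_FEATURES)]


def extract_features(image):
    """Frozen early layers: a fixed linear map followed by a ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, image))) for row in FROZEN]


def predict(features, readout, bias):
    """Trainable last layer: a linear readout of the frozen features."""
    return sum(w * f for w, f in zip(readout, features)) + bias


def mean_squared_error(feature_rows, ratings, readout, bias):
    errors = [predict(f, readout, bias) - y
              for f, y in zip(feature_rows, ratings)]
    return sum(e * e for e in errors) / len(errors)


def fine_tune(feature_rows, ratings, epochs=500, lr=0.05):
    """Gradient descent on the readout only; FROZEN is left untouched."""
    readout, bias = [0.0] * N_FEATURES, 0.0
    n = len(ratings)
    for _ in range(epochs):
        errors = [predict(f, readout, bias) - y
                  for f, y in zip(feature_rows, ratings)]
        grad_w = [2 * sum(e * f[j] for e, f in zip(errors, feature_rows)) / n
                  for j in range(N_FEATURES)]
        grad_b = 2 * sum(errors) / n
        readout = [w - lr * g for w, g in zip(readout, grad_w)]
        bias -= lr * grad_b
    return readout, bias


# Synthetic stand-ins for images and for ratings on a four-point scale (0-3).
images = [[random.random() for _ in range(N_PIXELS)] for _ in range(N_IMAGES)]
ratings = [round(3 * (0.6 * img[0] + 0.4 * img[2])) for img in images]

frozen_before = [row[:] for row in FROZEN]  # copy, to check nothing changes
feature_rows = [extract_features(img) for img in images]

baseline_mse = mean_squared_error(feature_rows, ratings, [0.0] * N_FEATURES, 0.0)
readout, bias = fine_tune(feature_rows, ratings)
tuned_mse = mean_squared_error(feature_rows, ratings, readout, bias)

print(f"MSE before fine-tuning: {baseline_mse:.3f}")
print(f"MSE after fine-tuning:  {tuned_mse:.3f}")
```

Because the frozen features are computed once and never modified, training reduces to fitting a linear model on top of them. In a real setting the frozen part would be the many layers learned on ImageNet, and the trainable part would be the last fully connected layers.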
The results suggest a host of intriguing avenues for further studies. Sceptics may argue that computational models lack sentience, that machines do not have their own preferences, and that ultimately computers cannot understand human values. After all, our aesthetic values might represent a complex combination of evolutionary learning, cultural influences, and individuality. As in many other cases, the answer to the question of whether aesthetic value preferences are dictated by nature or nurture probably involves a mixture of both. Iigaya et al. show that a model based purely on visual features can capture aspects of universal aesthetic judgements. Yet, the model had to be trained with human-provided examples in a supervised fashion. Maybe one day it will be possible to develop a model that is subject to the same types of visual experiences that humans go through in a largely unsupervised manner. These models may ‘grow up’ playing with

Fig. 1 | “Edmond de Belamy, from La Famille de Belamy”. The first piece of artwork created by artificial intelligence garnered US$ 432,500 at Christie’s auction in October 2018.

DOI 10.1038/s41562-021-01125-5
Language English
Journal Nature human behaviour
