With the rapid development of artificial intelligence, affective computing has emerged as a research field that aims to build systems able to recognize, interpret, and simulate human emotions. This interdisciplinary field combines computer science, psychology, and cognitive science to endow machines with emotional intelligence, enabling them to understand and respond to human emotional states.
The core goal of affective computing is to enable machines to interpret human emotional states and adapt their behavior accordingly, so that they respond appropriately.
Rosalind Picard's 1995 paper "Affective Computing" and her 1997 book of the same name marked the modern beginning of the field. Picard emphasized that emotions are not merely an accompaniment to thinking but an essential component of intelligence. As the technology has developed, many studies have focused on detecting emotional information through passive sensors, such as cameras that capture facial expressions, body posture, and gestures.
Machine learning techniques, such as speech recognition and natural language processing, are effective at extracting meaningful emotional patterns from the sensory data collected.
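As a minimal illustration of the text side of this, the sketch below trains a toy emotion classifier with scikit-learn. The four utterances and their labels are hypothetical stand-ins for a real annotated corpus, and the model choice (TF-IDF features plus logistic regression) is one simple option among many, not a standard method from the literature.

```python
# Minimal sketch: classifying emotion in text with scikit-learn.
# The tiny corpus below is hypothetical; real systems train on
# thousands of labeled utterances.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "I can't believe you did that, this is outrageous!",
    "What a wonderful surprise, thank you so much!",
    "I'm not sure this will work, I'm really worried.",
    "Everything went fine today, nothing special.",
]
labels = ["anger", "joy", "fear", "neutral"]

# TF-IDF turns each utterance into a sparse feature vector;
# logistic regression learns one linear decision boundary per label.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

print(model.predict(["This is so frustrating!"]))
```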
Identifying emotions is a central task in affective computing. Data collection usually relies on passive sensors, and the collected data must then be recognized and classified by machine learning models. These capabilities are becoming increasingly human-like, and in some narrow recognition tasks the models can be more accurate than an average human observer. By modeling human emotions, for example, an AI system can simulate empathy and understanding, enhancing interaction between people and machines.
One strand of research within affective computing focuses on designing computing devices with built-in emotional capabilities. The current trend is to apply emotion simulation to conversational agents, making human-computer interaction richer and more flexible. Marvin Minsky, a pioneer of artificial intelligence, argued that emotions are not fundamentally different from thinking processes, a view that work in affective computing lends further support to.
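To show what emotion simulation in a conversational agent can look like at its very simplest, here is an illustrative rule-based sketch. The names `detect_emotion` and `RESPONSE_STYLES` are hypothetical; a real agent would use a trained classifier (like the one sketched above) and a generation model rather than fixed templates.

```python
# Illustrative sketch only: a conversational agent that conditions
# its response style on a detected user emotion.
RESPONSE_STYLES = {
    "anger":   "I'm sorry this has been frustrating. Let's fix it step by step.",
    "fear":    "That sounds worrying. Here's what we know for sure:",
    "joy":     "That's great to hear! Want to build on it?",
    "neutral": "Understood. Here's the information you asked for:",
}

def detect_emotion(utterance: str) -> str:
    """Placeholder: in practice, call a trained emotion classifier."""
    return "anger" if "!" in utterance else "neutral"

def respond(utterance: str) -> str:
    # Pick an affect-appropriate opening, then continue with task content.
    emotion = detect_emotion(utterance)
    return RESPONSE_STYLES.get(emotion, RESPONSE_STYLES["neutral"])

print(respond("This keeps crashing!"))
```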
Future digital humans or virtual human systems will aim to simulate human emotional responses, including facial expressions and gestures, as well as natural reactions to emotional stimuli.
Cognitive science and psychology describe emotions in two main ways: continuously, as positions along dimensions such as valence and arousal, and categorically, as discrete labels such as anger, fear, or happiness. This distinction maps directly onto machine learning: regression models support the continuous view, and classification models the categorical one. Applied to speech, emotion recognition techniques can analyze the user's emotional state from features such as rhythm, pitch, and articulation clarity.
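To make that mapping concrete, the sketch below fits both formulations on the same feature matrix. Everything here is a synthetic placeholder: the features, the valence/arousal targets, and the category labels stand in for a real corpus, and random forests are just one reasonable model family.

```python
# Sketch of the two formulations trained on the same feature matrix.
# X stands in for acoustic or visual features; the targets are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))               # 200 samples, 12 features

# Continuous (dimensional) view: predict valence and arousal in [-1, 1].
y_dimensional = rng.uniform(-1, 1, size=(200, 2))
regressor = RandomForestRegressor().fit(X, y_dimensional)

# Categorical view: predict one of a few discrete emotion labels.
y_categorical = rng.choice(["anger", "fear", "joy", "sadness"], size=200)
classifier = RandomForestClassifier().fit(X, y_categorical)

print(regressor.predict(X[:1]))              # e.g. [[valence, arousal]]
print(classifier.predict(X[:1]))             # e.g. ['fear']
```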
Emotional states such as fear, anger, or happiness leave measurable traces in the speech signal, and these acoustic characteristics are crucial to affective computing: computing and analyzing audio features is what makes speech emotion recognition possible.
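As an illustration of such feature computation, here is a minimal sketch using the librosa library. The path "speech.wav" is a placeholder, and the particular feature set (pitch, energy, MFCCs) is one common choice rather than a fixed standard.

```python
# Sketch: extracting prosodic and spectral features that speech
# emotion recognizers typically consume.
import librosa
import numpy as np

y, sr = librosa.load("speech.wav", sr=16000)   # placeholder file path

# Pitch contour (fundamental frequency) via probabilistic YIN.
f0, voiced_flag, _ = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)
mean_pitch = np.nanmean(f0)                    # NaN marks unvoiced frames

# Energy (loudness proxy) and spectral envelope (MFCCs).
rms = librosa.feature.rms(y=y).mean()
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

# One fixed-length feature vector per clip, ready for a classifier.
features = np.concatenate([[mean_pitch, rms], mfcc])
print(features.shape)
```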
Running emotion recognition algorithms requires a stable database or knowledge base of labeled examples. Various classifiers, such as linear discriminant classifiers (LDC) and support vector machines (SVM), are widely used to improve recognition accuracy.
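A small sketch of the two classifier families named above, compared with cross-validation; the feature matrix and labels are again random placeholders, so the scores here will hover around chance rather than reflecting real performance.

```python
# Sketch: comparing an LDC and an SVM on the same feature matrix.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 20))                          # placeholder features
y = rng.choice(["anger", "happiness", "fear"], size=300)

for name, clf in [
    ("LDC", LinearDiscriminantAnalysis()),
    ("SVM", make_pipeline(StandardScaler(), SVC(kernel="rbf"))),
]:
    # With random placeholder data, accuracy is near chance (~0.33).
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: {scores.mean():.2f} +/- {scores.std():.2f}")
```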
Current systems' reliance on emotion recognition underscores how much depends on the data, and the data itself poses many challenges. Most emotion corpora are recorded by actors and thus may not capture the full diversity of naturally occurring emotions. To make emotion recognition work in practical applications, researchers continue to explore ways of constructing naturalistic datasets that improve its accuracy and applicability.
Although facial emotion recognition keeps improving, many challenges remain. Studies have found that algorithms trained on posed expressions perform poorly on spontaneous ones, and the gap between natural and acted expressions causes confusion between emotion categories. Moreover, the traditional Facial Action Coding System (FACS) is limited to static representations and cannot capture the dynamics of emotion.
The real challenge lies in accurately identifying the underlying emotions in massive amounts of data, which are especially difficult to discern in informal social situations.
Although algorithms continue to improve, researchers are still pursuing more accurate emotion recognition and response strategies, hoping that in the near future AI will not only recognize emotions but genuinely understand and respond to human emotional needs. As the technology matures, understanding and interaction between humans and machines should become more seamless and natural. Will this change the emotional relationship between people and machines?