Modeling affective system and episodic memory for social companions
Date of Issue: 2017-06-30
School of Computer Science and Engineering
Institute for Media Innovation
Social companions, including virtual humans and social robots, have attracted increasing attention in recent years. They interact with users in a social, human-like way that makes them believable and trustworthy. Recent studies have investigated the problem that people tend to become bored and lose interest rapidly within the first few interactions with social companions. Many studies have therefore started to improve the capabilities of social companions so that user engagement is preserved over the long term. Such companions are required to display consistent personalities in interactions and to remember what has happened in the past with particular users. Our research therefore focuses on designing an affective system and an episodic memory model, and studies how to integrate them into social companions to improve user engagement in interactions. Specifically, we have studied (1) how to design a general system architecture for social companions with the necessary modules and connections; (2) how to design an affective system that expresses specific personalities by organizing emotional behaviors in particular patterns; (3) how to design an episodic memory model with a more comprehensive similarity measure that improves both agent performance in tasks and user engagement in social companion interactions; and (4) how to construct a benchmark for episodic memory models designed for free-dialog based social companion interactions.

Firstly, we propose a general system architecture for social companions to achieve engaging interactions. The architecture can be shared by our virtual human and our social robot. It is composed of three main components: perception, internal processing, and action. Each component contains multiple interconnected modules. The affective system and the episodic memory are two important modules in this architecture; they model human-like behaviors and intelligence.
The communication and synchronization among the modules are carefully considered in this work.

Secondly, we propose the personality-characterized mood dynamics (PCMD) model to achieve consistent emotional behavior patterns that reflect the particular personality of a social companion. To connect static personalities with dynamic behaviors, we introduce mood as an intermediate concept. With well-designed rules for mood-driven behaviors, we transform the problem of personifying behavior patterns into the problem of personifying mood dynamics. This is then formulated as an optimization problem whose objective function is constructed so that the overall mood converges to the personality after sufficient interactions.

Thirdly, we propose a mixed-correlation-analysis based episodic memory (MCAEM) model to improve memory retrieval accuracy. Specifically, we analyze the correlations of memory elements by considering the relations between elements, the weights (importance) of attributes, and the order of events. An overall similarity measure is proposed based on this mixed correlation analysis. The proposed similarity measure can improve retrieval accuracy and thus support higher-level capabilities. We validate the MCAEM model in two applications. First, the MCAEM model gives agents better item finding, weapon selection, and decision making abilities, which improve their performance in the Unreal Tournament 2004 (UT2004) game. Second, the proposed model enhances the user experience in social companion interactions in various aspects.

Finally, we construct a benchmark from movie scripts (BMS) to measure the retrieval performance of episodic memory models designed for free-dialog based social companion interactions. We build a dataset with 4,889 episodes that contain 38,058 attribute values and 76,123 events extracted from 106 movie scripts.
Besides sentences in subtitles, our episodes also contain sequences of events constituted by attributes such as subjects, objects, time, locations, and emotion polarities. Because words are considered as additional attributes, the benchmark is much more complex than datasets in small domains. To simulate real-world interactions, the retrieval cues are generated with noise and incompleteness. We analyze the consistency between the results of executing different models on our benchmark and on previous benchmarks. In addition, we compare the proposed MCAEM model with two previous episodic memory models by evaluating them on our benchmark.
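
The abstract does not give the MCAEM similarity measure or the cue-generation procedure, so the following is only a minimal illustrative sketch of the general idea: an episode is scored against a retrieval cue by combining weighted attribute matches with an order-sensitive comparison of event sequences, and cues are derived from stored episodes by dropping and corrupting values. Every name here (WEIGHTS, attribute_similarity, order_similarity, make_cue, the alpha mixing factor, and the toy episodes) is an assumption made for illustration, and the order score uses Python's difflib as a stand-in rather than the thesis's own correlation analysis.

```python
import random
from difflib import SequenceMatcher

# Hypothetical attribute weights (importance); values are illustrative only.
WEIGHTS = {"subject": 0.4, "object": 0.3, "location": 0.2, "time": 0.1}

# Toy episodes, invented for this sketch (not taken from the BMS dataset).
EPISODES = [
    {"id": 1,
     "attributes": {"subject": "alice", "object": "letter",
                    "location": "cafe", "time": "night"},
     "events": ["enter", "read", "leave"]},
    {"id": 2,
     "attributes": {"subject": "bob", "object": "letter",
                    "location": "airport", "time": "night"},
     "events": ["leave", "read", "enter"]},
]


def attribute_similarity(cue_attrs, episode_attrs, weights=WEIGHTS):
    """Weighted share of the cue's attributes that the episode matches."""
    present = [name for name in weights if name in cue_attrs]
    total = sum(weights[name] for name in present)
    if total == 0:
        return 0.0
    matched = sum(weights[name] for name in present
                  if episode_attrs.get(name) == cue_attrs[name])
    return matched / total


def order_similarity(cue_events, episode_events):
    """Order-sensitive overlap of two event sequences (difflib ratio)."""
    if not cue_events or not episode_events:
        return 0.0
    return SequenceMatcher(None, cue_events, episode_events).ratio()


def similarity(cue, episode, alpha=0.5):
    """Mix attribute and event-order scores; alpha is an assumed parameter."""
    attr_score = attribute_similarity(cue["attributes"], episode["attributes"])
    order_score = order_similarity(cue["events"], episode["events"])
    return alpha * attr_score + (1 - alpha) * order_score


def retrieve(cue, episodes):
    """Return the stored episode with the highest similarity to the cue."""
    return max(episodes, key=lambda episode: similarity(cue, episode))


def make_cue(episode, drop_prob, noise_prob, vocabulary, rng):
    """Derive an incomplete, noisy cue from a stored episode."""
    attrs = {}
    for name, value in episode["attributes"].items():
        if rng.random() < drop_prob:
            continue  # incompleteness: the attribute is missing from the cue
        if rng.random() < noise_prob:
            value = rng.choice(vocabulary)  # noise: replace with a random value
        attrs[name] = value
    events = [e for e in episode["events"] if rng.random() >= drop_prob]
    return {"attributes": attrs, "events": events}


if __name__ == "__main__":
    rng = random.Random(0)
    cue = make_cue(EPISODES[0], drop_prob=0.3, noise_prob=0.1,
                   vocabulary=["alice", "bob", "cafe", "airport"], rng=rng)
    print(cue, "->", retrieve(cue, EPISODES)["id"])
```

A benchmark in this style would generate many such cues per episode and report how often the retrieved episode is the one the cue was derived from; the weights and the mixing factor would need to be chosen or learned rather than fixed by hand as they are here.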