
1.4 Organization of the Thesis

Figure 1-1 shows the organization of this thesis. In Chapter 2, a novel robotic emotion generation system is developed based on a mood transition model. A robotic mood state generation algorithm is proposed using a two-dimensional emotional model. An interactive emotional behavior generation method is then proposed to generate an unlimited number of emotional expressions by fusing seven basic facial expressions. In Chapter 3, several human emotion recognition methods are developed to provide the user's emotional state. Here, a bimodal information fusion algorithm and a speech-signal-based emotion recognition method are proposed for human-robot interaction. Simulation and experimental results of the proposed robotic emotion generation system and the proposed human emotion recognition methods are reported and discussed in Chapter 4. Chapter 5 concludes the contributions of this work and provides recommendations for future research.

Fig. 1-1: Structure of the thesis.

Chapter 2

Robotic Emotion Model and Emotional State Generation

Figure 2-1 shows the block diagram of the proposed autonomous emotional interaction system (AEIS). Taking robotic facial expressions as the emotional behavior, the robotic interaction is expected not only to react to the user's emotional state, but also to reflect the mood state of the robot itself. We integrate three modules to construct the AEIS, namely the user emotional state recognizer, the robotic mood state generator and the emotional behavior decision maker. An artificial face is employed to demonstrate the effectiveness of the design. A camera captures the user's face in front of the robot, and the acquired images are sent to the image processing stage for emotional state recognition [49]. The user emotional state recognizer is responsible for obtaining the user's emotional state and its intensity.

Fig. 2-1: Block diagram of the autonomous emotional interaction system (AEIS).

In this design, the user's emotional state at instant k (UEkn) is recognized and represented as a vector of four emotional intensities: neutral (uenN,k), happy (uenH,k), angry (uenA,k) and sad (uenS,k). Several existing emotional intensity estimation methods [50-53] provide effective tools to recognize the intensity of a human's emotional state, and their results can be applied and combined in the AEIS.

In this work, an image-based emotional intensity recognition module (see 4.5) has been designed and implemented for the current design of the AEIS. The recognized output consists of the intensity of each basic emotional category at each sampling instant, represented by a value between 0 and 1. These intensities are sent to the robotic mood state generator. Moreover, other emotion recognition modalities and methods (e.g. emotional speech recognition) can also serve as inputs to the AEIS, provided that the recognized emotional states contain intensity values between 0 and 1.

In the robotic mood state generator, the recognized user emotional intensities are transformed into interactive robotic mood variables represented by (Δαk, Δβk) (see 2.1.1 for a detailed description). These two variables represent the way the user's emotional state influences the robotic mood state transition. Furthermore, the robotic emotional behavior depends not only on the user's emotional state, but also on the robot personality and the previous mood state. Therefore, the proposed method takes into account the interactive robotic mood variables (Δαk, Δβk), the previous robotic mood state (RMk-1) and the robot personality parameters (Pα, Pβ) to compute the current robotic mood state (RMk) (see 2.1.4). Note that the previous robotic mood state (RMk-1) is temporarily stored in a buffer. In this work, the current robotic mood state is represented as a point on the two-dimensional (2D) emotional plane, and robotic personality parameters are created to describe the distinct human-like personality of a robot.
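To make this data flow concrete, the following minimal sketch (not code from this thesis; all names are illustrative only) shows the quantities exchanged between the three AEIS modules at each sampling instant.

```python
# Minimal sketch of the per-instant AEIS data flow; names are illustrative only.
from dataclasses import dataclass

@dataclass
class UserEmotion:
    """UEk^n from the user emotional state recognizer: intensities in [0, 1]."""
    neutral: float
    happy: float
    angry: float
    sad: float

@dataclass
class MoodState:
    """RMk from the robotic mood state generator: a point on the 2D emotional plane."""
    alpha: float  # pleasure-displeasure coordinate, assumed in [-1, 1]
    beta: float   # arousal-sleepiness coordinate, assumed in [-1, 1]

# The emotional behavior decision maker then maps RMk to seven fusion weights
# FW0..FW6 (see 2.2), which blend the basic facial expressions of the artificial face.
```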

Based on the current robotic mood state, the emotional behavior decision unit autonomously generates a suitable robot behavior in response to the user's emotional state.

For robotic emotional behavior generation, in response to the recognized user emotional intensities, a set of fusion weights (FWi, i=0~6) corresponding to the basic emotional behaviors is generated using a fuzzy Kohonen clustering network (FKCN) [54] (see 2.2).

As with human beings, the facial expressions of a robotic face are very complex and difficult to classify into a limited number of categories. In order to demonstrate interaction behaviors similar to those of humans, the FKCN is adopted to generate an unlimited number of emotional expressions by fusing seven basic facial expressions. The outputs of the FKCN are sent to the artificial face simulator to generate the interactive behaviors (facial expressions in this work). An artificial face has been designed, exploiting the method in [55], to demonstrate the facial expressions generated in human-robot interaction. Seven basic facial expressions are simulated, namely neutral, happiness, surprise, fear, sadness, disgust and anger. The facial expressions are depicted by moving control points determined from Ekman's model [56]. In a practical interaction scenario, each expression can be generated with different proportions of the seven basic facial expressions: the actual facial expression of the robot is generated by summing each behavior output multiplied by its corresponding fusion weight. Therefore, more subtle emotional expressions can be generated as desired. The detailed design of the proposed robotic mood transition model, the emotional behavior generation and the image-based emotional state recognition is described in the following sections.

2.1 Robotic Mood Model and Mood Transition

Emotion is a complex psychological experience of an individual's state of mind when interacting with people or responding to environmental influences. For humans, emotion involves “physiological arousal, expressive behaviors, and conscious experience” [57]. Emotional interaction behavior is associated with mood, temperament, personality, disposition, and motivation. In this study, emotion for robotic behavior is simplified to an association with mood and personality. We apply the concept that emotional behavior is controlled by the current emotional state and mood, while the mood is influenced by personality. In this thesis, a novel robotic mood state transition method is proposed for a given human-like personality. Furthermore, the corresponding interaction behavior is generated autonomously for the determined mood state.

2.1.1 Robotic Mood Model

A simple way to develop robotic emotional behaviors for interacting with people is to allow a robot to respond with emotional behaviors that mimic humans. In human-robot emotional interaction, users' emotional expressions can be treated as trigger inputs that drive the robotic mood transition. Furthermore, the transition of the robotic mood depends not only on the users' emotional states, but also on the robot's own mood and personality. For a robot interacting with one individual or a group of people, the users' current (at instant k) emotional intensities (UEkn) are sampled and transformed into the interactive mood variables Δαk and Δβk, which represent how the users' emotional states influence the variation of the robotic mood state transition.

From the experience of emotional interaction among human beings, a user's neutral intensity, for instance, usually affects the arousal-sleepiness mood variation directly; thus, the robotic mood state tends toward arousal when the user's neutral intensity is low. Similarly, the user's happiness, anger and sadness intensities affect the pleasure-displeasure axis: the user's happy intensity leads the robotic mood toward pleasure, while the robotic mood state moves toward displeasure when the user's angry and sad intensities are high. Based on these observations, a straightforward case is designed for the interactive robotic mood variables (Δαk, Δβk), which represent the reaction to the current users' emotional intensities on the pleasure-arousal plane.

In this design, equations (2.1)-(2.3) compute (Δαk, Δβk) from the users' recognized emotional intensities: the happy intensities raise Δαk, the angry and sad intensities lower it, and low neutral intensities raise Δβk, with the contributions of the Ns interacting users combined. By using (2.1)-(2.3), the effect of multiple users' emotional inputs on the robotic mood is represented. However, in this work only one user is considered, in order to concentrate on illustrating the proposed model, i.e. Ns=1 in the following discussion.
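The explicit expressions of (2.1)-(2.3) are not reproduced above; purely as a hedged illustration, one form consistent with the qualitative description (and with the "opposite" design mentioned below) would be, for Ns users:

Δαk = (1/Ns) Σi [ ueiH,k − (ueiA,k + ueiS,k)/2 ],  i = 1, …, Ns,

Δβk = (1/Ns) Σi ( 1 − ueiN,k ),

where ueiN,k, ueiH,k, ueiA,k and ueiS,k denote the ith user's recognized neutral, happy, angry and sad intensities. This is only a plausible sketch, not the exact definition adopted in the design.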

It is worthwhile to extend the number of users in the next stage of this study, so that a scenario like the Massachusetts Institute of Technology mood meter [58] can be investigated.

Furthermore, the mapping between the facial expressions of the interacting human and the robotic internal state may be modeled in a more sophisticated way. For example, Δαk can be designed as (ueiA,k + ueiS,k)/2 − ueiH,k, so that an alternative (opposite) response to a user is obtained.

2.1.2 Robot Personality

McCrae et al. [59] proposed the Big Five factors (the Five Factor Model) to describe the traits of human personality. The Big Five model is an empirically based result, not a theory of personality. The Big Five factors were derived through a statistical procedure used to analyze how ratings of various personality traits are correlated across people. Table 2-1 lists the Big Five factors and their descriptions [60]. In addition, Mehrabian [61] used the Big Five factors to represent the pleasure-arousability-dominance (PAD) temperament model: through linear regression analysis, the scale of each PAD value is estimated from the Big Five factors [62]. These results are summarized as three temperament equations, for pleasure, arousability and dominance.

In this work, we adopt the Big Five model to represent the robot personality and to determine the mood state transition on a two-dimensional pleasure-arousal plane. Hence, only two equations are needed to relate the robot personality to the pleasure-arousal plane.

Table 2-1: Big Five model of personality.

Factor              Description
Openness            Open-mindedness, interest in culture.
Conscientiousness   Organized, persistent in achieving goals.
Extraversion        Preference for and behavior in social situations.
Agreeableness       Interactions with others.
Neuroticism         Tendency to experience negative thoughts.

The reason we use this two-dimensional pleasure-arousal plane rather than the three-dimensional PAD model is based on Russell's studies. Russell and Pratt [63] indicated that pleasure and arousal each account for a large proportion of the variance in the meaning of affect terms, while each dimension beyond these two accounts for only a tiny proportion. More importantly, these secondary dimensions become more clearly interpretable as cognitive rather than emotional in nature; they thus appear to be aspects of the cognitive appraisal system that has been suggested for emotions.

Here, the elements of the Big Five factors are assigned based on a reasonable realization of Table 2-1. Referring to [61], the robot personality parameters (Pα, Pβ) are adopted such that:

Pα = 0.21E + 0.59A + 0.19N,  (2.4)

Pβ = 0.15O + 0.30A − 0.57N,  (2.5)

where O, E, A and N represent the Big Five factors of openness, extraversion, agreeableness and neuroticism, respectively. Therefore, the robot personality parameters (Pα, Pβ) are fixed once the robot personality is known, i.e. O, E, A and N are determined constants. Later we will show that (Pα, Pβ) act as the mood transition weightings along the pleasure (α) and arousal (β) axes.

Note that the conscientiousness factor of the Big Five was not used in this design, because it only influences the dominance axis of the three-dimensional PAD model. In this study, the pleasure-arousal plane of the two-dimensional emotional model is applied, so only four of the five factors are used to translate the Big Five factors into mood transition weightings.
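The mapping (2.4)-(2.5) is simple enough to state as a short sketch. The function below is an illustration only (not code from this thesis), and the example Big Five scores are arbitrary values chosen by the designer.

```python
# Sketch of the Big Five -> (P_alpha, P_beta) mapping in (2.4)-(2.5).
def personality_parameters(openness, extraversion, agreeableness, neuroticism):
    """Return the mood-transition weightings on the pleasure and arousal axes."""
    p_alpha = 0.21 * extraversion + 0.59 * agreeableness + 0.19 * neuroticism  # Eq. (2.4)
    p_beta = 0.15 * openness + 0.30 * agreeableness - 0.57 * neuroticism       # Eq. (2.5)
    return p_alpha, p_beta

# Example: an open, agreeable and emotionally stable robot personality (arbitrary scores).
p_alpha, p_beta = personality_parameters(0.6, 0.5, 0.8, 0.2)
```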

2.1.3 Facial Expressions in Two-Dimensional Mood Space

The relationship between mood states and emotional behaviors has been studied by psychologists. Russell and Bullock [64] proposed a two-dimensional scaling on the pleasure-displeasure and arousal-sleepiness axes to model the relationship between facial expressions and mood states. In this work, the results from [64] are employed to model the relationship between the mood state and the output emotional behavior. Figure 2-2 illustrates a two-dimensional scaling result for general adults' facial expressions based on pleasure-displeasure and arousal-sleepiness ratings. The scaling result was analyzed by the Guttman-Lingoes smallest space analysis procedure [65]. This two-dimensional scaling procedure provides a geometric representation (stress and orientation) of the relations among the facial expressions by placing them in a space (Euclidean space here) of specified dimensionality; greater similarity between two facial expressions is represented by their closeness in the space. Hence, the coordinates in this space can be used to represent the characteristic of each facial expression. As shown in Fig. 2-2, the α and β axes represent the amount of pleasure and arousal, respectively. Eleven facial expressions are analyzed and located on the plane; the location of each facial expression is represented by a square along with its coordinates, which are obtained by measuring the locations in the figure (interested readers are referred to [64]). In this way, the relationship between the robotic mood and the output behavior, the facial expression in this case, is determined.
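Read this way, Fig. 2-2 acts as a lookup from mood coordinates to expression labels. The sketch below illustrates such a lookup; only the neutral coordinates (0.61, -0.47) appear in this chapter (Table 2-2), so the other entries are placeholders that would be replaced by the values measured from Fig. 2-2.

```python
# Sketch: pick the facial-expression label plotted closest to a mood state (alpha, beta).
import math

EXPRESSION_COORDS = {
    "neutral": (0.61, -0.47),    # from Table 2-2 / Fig. 2-2
    "happiness": (0.75, 0.35),   # placeholder coordinates
    "sadness": (-0.65, -0.35),   # placeholder coordinates
    # ... the remaining expressions measured from Fig. 2-2
}

def closest_expression(alpha, beta):
    """Return the expression whose Fig. 2-2 location is nearest to (alpha, beta)."""
    return min(EXPRESSION_COORDS,
               key=lambda name: math.dist(EXPRESSION_COORDS[name], (alpha, beta)))
```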

2.1.4 Robotic Mood State Generation

As mentioned in 2.1.1, both the user's current emotional intensity and the robot personality affect the robotic mood transition.

Fig. 2-2: Two-dimensional scaling for facial expressions based on pleasure-displeasure and arousal-sleepiness ratings.

The way the robot personality affects the mood transition is described by the robot personality parameters (Pα, Pβ). As given in 2.1.2, these two parameters act as weighting factors on the α and β axes, respectively; when Pα and Pβ vary, the speed of the mood transition along the α and β directions changes accordingly. On the other hand, the interactive mood variables (Δαk, Δβk) capture the influence of the user's emotional intensity on the variation of the robotic mood state transition. To combine the two effects, we multiply the robot personality parameters (Pα, Pβ) by the interactive mood variables (Δαk, Δβk); the products indicate the influence on the robotic mood transition of the current user's emotional intensity as well as the robot personality.

Furthermore, the manifested emotional state is determined not only by the current robotic emotional variables but also by the previous robotic emotional state. The manifested robotic mood state at sample instant k (RMk) is calculated such that:

RMk = (αk, βk) = RMk-1 + (PαΔαk, PβΔβk),  (2.6)

where αk, βk ∈ [−1, 1] are the coordinates of the robotic mood state at sample instant k on the pleasure-arousal plane. By using (2.6), the current robotic mood state is determined and

located on the emotional plane. Moreover, the mood transition is influenced by the personality, which is reflected in the Big Five factors. After the manifested robotic mood state (RMk) is obtained, its coordinates (αk, βk) are mapped onto the pleasure-arousal plane and a suitable corresponding facial expression can be determined, as shown in Fig. 2-2.
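A hedged sketch of the update in (2.6) follows (not code from this thesis). The clipping of the coordinates to [−1, 1] is an added assumption to keep RMk on the pleasure-arousal plane, since the text only states that αk and βk lie in that range.

```python
# Sketch of the mood transition RMk = RMk-1 + (P_alpha*d_alpha_k, P_beta*d_beta_k), Eq. (2.6).
def update_mood(rm_prev, d_alpha, d_beta, p_alpha, p_beta):
    """rm_prev = (alpha_{k-1}, beta_{k-1}); returns the new mood state (alpha_k, beta_k)."""
    alpha_prev, beta_prev = rm_prev
    alpha = max(-1.0, min(1.0, alpha_prev + p_alpha * d_alpha))  # assumed clipping
    beta = max(-1.0, min(1.0, beta_prev + p_beta * d_beta))      # assumed clipping
    return alpha, beta
```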

2.2 Emotional Behavior Generation

After the robotic mood state is determined by (2.6), a suitable emotional behavior is expected in response to the user. In this work, we propose a design based on a fuzzy Kohonen clustering network (FKCN) to generate smooth variations of the interaction behaviors (facial expressions) as the mood state transits gradually.

In this approach, pattern recognition techniques are adopted to generate interactive robotic behaviors [25, 54]. By adopting the FKCN, the robotic mood state obtained from (2.6) is mapped to fusion weights of the basic robotic emotional behaviors, and the output is a linear combination of the weighted basic behaviors. In the current design, the basic facial expression behaviors are neutral, happiness, surprise, fear, sadness, disgust and anger, as shown in Fig. 2-1. The FKCN is employed to determine the fusion weight of each basic emotional behavior based on the current robotic mood. Figure 2-3 illustrates the structure of the fuzzy-neuro network for fusion weight generation. In the input layer of the network, the robotic mood state (αk, βk) is taken as the input of the FKCN. In the distance layer, the distance between the input pattern and each prototype pattern is calculated such that:

dij = ||Xi − Pj||^2 = (Xi − Pj)^T (Xi − Pj),  (2.7)

where Xi denotes the input pattern and Pj denotes the jth prototype pattern (see 2.3.2). In this layer, the degree of difference between the current robotic mood state and each prototype pattern is calculated; if the robotic mood state is not similar to a built-in prototype pattern, the distance reflects this dissimilarity. The membership layer then maps the distances dij to membership values uij.

Fig. 2-3: The fuzzy-neuro network for fusion weight generation.

The membership layer calculates the similarity degree between the input pattern and the prototype patterns. If an input pattern does not exactly match any prototype pattern, its similarity to each individual prototype pattern is represented by a membership value between 0 and 1, determined by (2.8) and (2.9) from the distances of (2.7), where c denotes the number of prototype patterns: the smaller the distance dij, the larger the membership uij, and an input pattern that coincides with a prototype pattern receives full membership in that prototype.

Note that the sum of the outputs of the membership layer equals 1. Using the rule table (see below) and the obtained membership values, the current fusion weights (FWi, i=0~6) are determined such that:

FWi = Σj uij wji,  j = 1, …, c,  (2.10)

where wji represents the prototype-pattern weight of the ith output behavior for the jth prototype pattern. The prototype-pattern weights are designed in a rule table to define basic primitive emotional behaviors corresponding to carefully chosen input states.

2.2.1 Rule Table for Behavior Fusion

In the current design, several representative input emotional states were selected from the two-dimensional model in Fig. 2-2, which gives the relationship between facial expressions and mood states. The location of each facial expression on the mood plane in Fig. 2-2 is used as a prototype pattern for the FKCN, and a rule table is constructed accordingly following the structure of the FKCN. As shown in Table 2-2, seven basic facial expressions were selected to build the rule table. The IF-part of each rule is an emotional state (αk, βk) in the pleasure-arousal space, and the THEN-part is the vector of prototype-pattern weights (wji) over the seven basic expressions. For example, the neutral expression in Fig. 2-2 occurs at (0.61, -0.47), which forms the IF-part of the first rule and the prototype pattern for the neutral behavior; the THEN-part of this rule is the neutral behavior expressed by the weight vector (1, 0, 0, 0, 0, 0, 0). The other rules and prototype patterns are set up similarly, following the values in Fig. 2-2. For facial expressions that are located at two distinct points in the mood space, both locations are employed and two rules are set up, following the psychologists' analysis results. Altogether there are 13 rules, as shown in Table 2-2. Note that Table 2-2 gives suitable rules for mimicking human behavior, since the content of Fig. 2-2 is taken from psychological studies; however, other alternatives and more general rules can also be employed.

Table 2-2: Rule table for interactive emotional behavior generation.

        IF-part (prototype pattern)     THEN-part (weighting)
# j     αk      βk      Neutral  Happiness  Surprise  Fear  Sadness  Disgust  Anger
1       0.61    -0.47   1        0          0         0     0        0        0

The FKCN generalizes from these prototype patterns to all possible situations (robotic mood states in this case) that the robot may encounter; in the generalization process, proper fusion weights for the corresponding input pattern are calculated.
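The distance-membership-weighting pipeline can be sketched as follows. This is an illustration rather than the thesis's implementation: the membership formula is a standard fuzzy-c-means-style stand-in consistent with the description (memberships sum to 1, an exact prototype match receives full membership), and all rule entries except the neutral rule of Table 2-2 are placeholders.

```python
# Sketch of FKCN fusion-weight generation: distance layer (2.7), membership
# layer (stand-in for (2.8)-(2.9)), and weighted output (2.10).
RULES = [
    # ((alpha_j, beta_j), weights for neutral, happiness, surprise, fear, sadness, disgust, anger)
    ((0.61, -0.47), (1, 0, 0, 0, 0, 0, 0)),  # rule 1: neutral, from Table 2-2
    # ... the remaining 12 rules follow the expression locations in Fig. 2-2
]

def fusion_weights(alpha, beta, rules=RULES):
    """Map the robotic mood state (alpha_k, beta_k) to the fusion weights FW0..FW6."""
    d = [(alpha - pa) ** 2 + (beta - pb) ** 2 for (pa, pb), _ in rules]   # Eq. (2.7)
    if any(dj == 0.0 for dj in d):
        u = [1.0 if dj == 0.0 else 0.0 for dj in d]     # exact match with a prototype
    else:
        inv = [1.0 / dj for dj in d]
        u = [v / sum(inv) for v in inv]                 # memberships sum to 1
    return [sum(u[j] * w[i] for j, (_, w) in enumerate(rules))  # Eq. (2.10)
            for i in range(7)]
```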

After the fusion weights of the output behaviors are obtained from the FKCN, the robot's behavior is determined by (2.11) as the weighted combination of the seven basic facial expressions of neutral, happiness, surprise, fear, sadness, disgust and anger, each multiplied by its corresponding fusion weight. It is seen that (2.11) gives a method for generating facial expressions by combining and weighting the seven basic expressions.

The linear combination of basic facial expressions gives a straightforward yet effective way to express various emotional behaviors. In order to make the combined facial expressions more consistent with human experience, an evaluation and adjustment procedure was carried out by a panel of students in the lab, and the features of the seven basic facial expressions were adjusted to be as distinguishable as possible so as to approach human perceptual experience. Some results of the linear combination are demonstrated using a facial expression simulator; please refer to 2.2.3.

In fact, human emotional expressions are difficult to represent with a mathematical model or a few typical rules. The FKCN, on the other hand, is very suitable for building up the emotional expressions: its merit is the capacity to generalize from a limited number of assigned rules (prototypes), and dissimilar emotional types can be designed simply by adjusting the rules. For the artificial face, facial expressions are defined by the variation of control points, which are the positions of the eyebrows, eyes, lips and wrinkles of the artificial face.
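At the level of these control points, the weighted combination of (2.11) amounts to blending the control-point displacements of the basic expressions. The sketch below is an illustration only; the control-point data are hypothetical placeholders rather than values from the artificial face design.

```python
# Sketch of blending basic facial expressions with fusion weights (cf. Eq. (2.11)):
# each control point of the displayed face is the weighted sum of the corresponding
# control-point displacements of the seven basic expressions.
BASIC_EXPRESSIONS = [
    # one list of (x, y) control-point displacements per basic expression
    [(0.0, 0.0), (0.0, 0.0)],    # neutral (placeholder control points)
    [(0.1, 0.3), (-0.1, 0.3)],   # happiness (placeholder)
    [(0.0, 0.5), (0.0, 0.5)],    # surprise (placeholder)
    [(-0.2, 0.2), (0.2, 0.2)],   # fear (placeholder)
    [(-0.1, -0.3), (0.1, -0.3)], # sadness (placeholder)
    [(-0.2, 0.0), (0.2, 0.0)],   # disgust (placeholder)
    [(-0.3, 0.1), (0.3, 0.1)],   # anger (placeholder)
]

def blend_expression(fusion_weights):
    """Return the blended control points: sum_i FWi * control_points_i."""
    n_points = len(BASIC_EXPRESSIONS[0])
    blended = [(0.0, 0.0)] * n_points
    for w, points in zip(fusion_weights, BASIC_EXPRESSIONS):
        blended = [(bx + w * px, by + w * py)
                   for (bx, by), (px, py) in zip(blended, points)]
    return blended
```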

2.2.2 Evaluation of Fusion Weight Generation Scheme

In order to verify the result of fusion-weight generation using the FKCN, we applied the rules in Table 2-2 and simulated the weight distribution for various emotional states. The purpose is to evaluate how the proposed FKCN works to generalize any input emotional