3.5 Infer missing information
The mapping table is our solution for inferring missing information. Using the relationship between Cx and bx, the behavior sub-model infers the corresponding behaviors for a device, and the missing information sub-model works in the same way.
Sub-models with the inferring function may still deduce a wrong solution when the mapping table is faulty. However, the sub-models refine the mapping table through its weight column.
As the weights are refined over and over, the probability of inferring a wrong solution decreases.
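As a minimal sketch of how a weight column could drive this refinement, the Python fragment below keeps one weight per (behavior, device) pair and always infers the device with the largest weight. The class name, the additive update rule and the step size are assumptions made for illustration; the mapping table described above does not prescribe them.

```python
from collections import defaultdict


class WeightedMappingTable:
    """Hypothetical mapping table: behavior -> {device: weight}."""

    def __init__(self):
        self._weights = defaultdict(dict)

    def observe(self, behavior, device, step=1.0):
        """Raise the weight of a (behavior, device) pair seen in a complete task."""
        row = self._weights[behavior]
        row[device] = row.get(device, 0.0) + step

    def infer_device(self, behavior):
        """Return the device with the largest weight, or None if the behavior is unknown."""
        row = self._weights.get(behavior)
        if not row:
            return None
        return max(row, key=row.get)

    def share(self, behavior, device):
        """Fraction of the total weight for this behavior held by one device."""
        row = self._weights.get(behavior, {})
        total = sum(row.values())
        return row.get(device, 0.0) / total if total else 0.0


table = WeightedMappingTable()
table.observe("turn on", "radio")  # a faulty entry
table.observe("turn on", "TV")
table.observe("turn on", "TV")
print(table.infer_device("turn on"))  # TV
```

In this sketch the faulty entry is never deleted; it simply loses influence as correct pairings accumulate weight.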
3.6 Summary
We introduced the definition and form of tasks, including the notation used to express users' commands. We also defined the two mapping tables and described the sub-models in terms of their functions and flowcharts.
For the different elements in a task, we designed corresponding sub-models to deal with different scenarios. The sub-models check the elements in a task and the missing information, and take the appropriate actions for each scenario.
Figure 3.5: Task Generator
Figure 3.6: Component Manager
Figure 3.7: Behavior Manager
Chapter 4 Implementation
In this chapter, we discuss our implementation of REM based on the proposed model.
4.1 Environment
We explain our implementation environment in two parts: the hardware, and the software, including the tools we used.
4.1.1 Hardware
The CPU is an Intel C2D E7500, a dual-core CPU, and the RAM is 2048 MB.
We use a keyboard as input and a monitor as output. In fact, a single-core CPU would still satisfy the demands of our model.
The hardware is listed in Table 4.1.
4.1.2 Software
For the general environment, all the tools we used work normally on Windows XP. We therefore simulate Windows XP with a virtual machine, using VMware 3.0.1 to run Windows XP. The host operating system is Windows 7.
Table 4.1: Hardware List
CPU      Intel C2D E7500
RAM      2048 MB
Input    Keyboard
Output   Monitor
Table 4.2:
The program editor is Eclipse, version Helios Service Release 2. Eclipse does not include a C/C++ compiler, so before writing the program we must install the gcc compiler.
The Natural Language Toolkit (NLTK), which has distributions for Windows, Mac OS X and Linux, is written in Python. We therefore have to write a simple Python program to use NLTK. The version of Python is 2.6. To use the NLTK library, we need to import it in that program. The OAA library is version 2.3.2, dated June 2007.
The main programming language we used to complete all components is C++.
4.2 Open Agent Architecture
This section introduces the OAA. We first introduce the concept of agents and then look at the OAA in detail.
4.2.1 Agent-Based Model
An agent-based model is based on Artificial Intelligence (AI) and treats a dialogue system as cooperation between intelligent agents. All of these agent approaches focus on "Goal", "Solution" and "Event". An "Event" is the user's request, a "Goal" is the expected action for the user's request, and a "Solution" is the actual action for the user's request.
When the agents built into a dialogue system receive an "Event", they find a "Solution" and send the "Goal" to another agent or to the user. There can be more than one "Solution" for an "Event", and the "Solution" changes according to some parameters.
In the agent-based model shown in Figure 4.1, each small black point represents an agent with some function. The large gray ellipse is the external environment, and the smaller ellipses inside it are the parts perceived by agents. An agent is responsible for perceiving only a small part of the environment, which means that the function of a single agent is limited. For example, with two agents, one is in charge of handling what kind of home appliance the user operates and the other is responsible for finding which one is to be operated.
Figure 4.1: Agent-Based Model
4.2.2 Structure of OAA
The OAA is shown in Figure 4.2. The main component of the OAA is the facilitator, which is responsible for distributing each task to the appropriate agent. All agents must register their solutions with the facilitator; the facilitator does not pass tasks to an agent that has not registered. The Interagent Communication Language (ICL), developed by SRI, is the form of communication between the agents and the facilitator.
In 1996, the Agent Development Toolkit (ADT) [21] was proposed as an IDE. The OAA has been widely applied. As mentioned in Chapter 2, MIMUS applies the OAA in its architecture.
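The register-then-dispatch role of the facilitator can be illustrated with a small Python sketch. It is not the OAA library API and does not use ICL; the class, the method names and the capability strings are hypothetical and only mimic the behavior described above, in which tasks reach registered agents and nothing else.

```python
class Facilitator:
    """Toy facilitator: agents register a capability, tasks are routed to it."""

    def __init__(self):
        self._agents = {}

    def register(self, capability, handler):
        """An agent announces which kind of task it can solve."""
        self._agents[capability] = handler

    def solve(self, capability, payload):
        """Pass the task to the registered agent, or fail if none registered."""
        handler = self._agents.get(capability)
        if handler is None:
            raise LookupError("no agent registered for %r" % capability)
        return handler(payload)


facilitator = Facilitator()
facilitator.register("behavior", lambda word: ("behavior manager", word))
print(facilitator.solve("behavior", "turn on"))  # ('behavior manager', 'turn on')
```

A request for a capability that no agent has registered raises an error in this sketch, which mirrors the rule that unregistered agents never receive tasks.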
Figure 4.2: Open Agent Architecture
4.3 Structure of Proposed Model with OAA
We introduce the structure of our model with the hybrid approach, as shown in Figure 4.3.
As mentioned in Chapter 3, the NLP analyzes the sentence from the user. The task generator extracts the tagged words from the result of the NLP. The facilitator passes the tagged words to the appropriate manager.
First we launch the facilitator, and then we launch the component manager and the behavior manager. As shown in Figure 4.4, the behavior manager registers its function with the facilitator.
Two managers are invoked at runtime, as shown in Figure 4.5. The facilitator accepts the registration requests from both managers.
Figure 4.3: Structure of Proposed Model with OAA
4.4 Summary
In summary, we introduced the hardware and software environment of our design. We also introduced the Open Agent Architecture (OAA) and how we use its structure to design and implement REM in this environment.
Figure 4.4: Facilitator with One Manager
Figure 4.5: Facilitator with Two Managers
Chapter 5 Analysis
In this chapter, we describe the actions of REM in four cases (Section 5.1). In Section 5.2, we discuss the effect of REM on dialogue managers that use different approaches.
5.1 Case Study
According to the task definition given earlier, we list several situations in which information is missing from a task. We continue to use the notation Cx, bx and Dx to compose a task.
Based on our model, we denote verbs by bxi and nouns by Cx. We list all possible cases.
I. Task with Cx
II. Task without Cx
III. Task with bx
IV. Task without bx
These combinations are shown in Table 5.1.
We follow the definition of the degree of complexity allocated to each task [17]. A simple task includes only one bxi; a complex task requires more than one bxi. In addition, a task that relies on remembered previous information, such as referential relations, is complex [22][23].
We analyze four cases. Our proposed model can handle more than one bxi, but in the following cases we discuss only the situation with one bxi.
In fact, we can break a task with more than one bxi into several sub-tasks with one bxi each. Moreover, the command sent to a device depends on bxi, not on previous device information. So there is no need to analyze overly complex tasks in this paper.
Table 5.1: Case Study
                NP
                I         II
VP    III       Case 1    Case 2
      IV        Case 3    Case 4
The situation in which Dx is missing is similar to the one in which Cx is missing, so we discuss only the lack of Cx.
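Breaking a task with several bxi into single-behavior sub-tasks can be sketched in a few lines of Python. The dictionary layout and the function name are assumptions for illustration; only the idea that each sub-task carries exactly one bxi, together with the same Cx and Dx, comes from the text.

```python
def split_task(component, behaviors, description=None):
    """Return one sub-task per behavior, each sharing the same Cx and Dx."""
    return [
        {"Cx": component, "bxi": behavior, "Dx": description}
        for behavior in behaviors
    ]


for sub_task in split_task("TV", ["turn on", "turn up"], "bedroom"):
    print(sub_task)
```

Each sub-task produced this way can then be analyzed as one of the single-bxi cases discussed in the following sections.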
5.1.1 Case 1
In Case 1, our system receives a sentence with both Cx and bxi. In principle, the actions of the system are independent, because the behaviour manager and the component manager are independent. If there is a new bxi, the system registers the bxi in the dataset and updates the device ID of bxi. If there is a new Cx, the system registers the Cx in the dataset and updates the device ID of bxi.
As mentioned above, we analyze four situations, as shown in Table 5.2.
Existing bxi and existing Cx: This is the general situation with a complete sentence. All bxi and Cx in the user command already exist in the dataset, so our system acts normally.
Existing bxi and new Cx: If only the Cx is not in the dataset, our system registers the Cx and updates the device ID of the existing bxi. In doing so, our system can judge the logic of bxi next time.
New bxi and existing Cx: If only the bxi is not in the dataset, our system registers the bxi and updates the device ID of bxi. There is no need to register the Cx. This situation is very similar to the previous one.
Table 5.2: Case 1
                        I: Cx in NPs
                        Existing Cx                             New Cx
III: bxi in VPs
  Existing bxi          No register, no update                  Register Cx; update device ID of bxi
  New bxi               Register bxi; update device ID of bxi   Register Cx and bxi; register device ID in bxi
Table 5.3: Example of Case 1
Sentence: Turn on the TV in the bedroom
[Turn on] is an existing bxi, [TV] is an existing Cx    No register, no update
[Turn on] is a new bxi, [TV] is an existing Cx          Register [Turn on]; update device ID in [Turn on]
[Turn on] is an existing bxi, [TV] is a new Cx          Register [TV]; update device ID in [Turn on]
[Turn on] is a new bxi, [TV] is a new Cx                Register [Turn on] and [TV]; update device ID in [Turn on]
New bxi and new Cx: Even though the sentence includes bxi and Cx, neither is in the dataset. Our system not only registers bxi and Cx in the dataset but also updates the device ID of bxi. In doing so, our system can distinguish the bxi and Cx next time.
We take a sentence as an example (Table 5.3).
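The four situations of Case 1 can be written as one small decision routine. The sketch below is only an illustration: the dataset layout (a set of components and a dictionary from behavior to device IDs), the use of the component name as its device ID, and the function name are all assumptions.

```python
def handle_case1(dataset, behavior, component):
    """Apply the Case 1 rules and return the list of actions taken."""
    actions = []
    new_component = component not in dataset["components"]
    new_behavior = behavior not in dataset["behaviors"]

    if new_component:
        dataset["components"].add(component)
        actions.append("register Cx")
    if new_behavior:
        dataset["behaviors"][behavior] = set()
        actions.append("register bxi")
    if new_component or new_behavior:
        # Assumption: the component name doubles as the device ID.
        dataset["behaviors"][behavior].add(component)
        actions.append("update device ID of bxi")
    return actions


dataset = {"components": {"TV"}, "behaviors": {"turn on": {"TV"}}}
print(handle_case1(dataset, "turn on", "TV"))     # []
print(handle_case1(dataset, "turn on", "radio"))  # ['register Cx', 'update device ID of bxi']
```

When both elements already exist, the routine returns an empty list, which corresponds to the "no register, no update" situation.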
5.1.2 Case 2
In this case, the dominant factor is the Cx, which determines the action of our system. If there is a new Cx, the system has to register the Cx. On the other hand, the system loads the bxi when the Cx already exists in the dataset. In fact, we define a component not only by the device Cx but also by the description of the device Dx, including its location and features. We use Cx and Dx together to describe a complete component.
The point of Case 2 is how the system handles a user command that contains a Cx but no bxi. All decisions in Case 2 depend on the Cx. We take a sentence as an example (Table 5.5).
Table 5.4: Case 2
                        II: Cx in NPs
                        Existing Cx                        New Cx
III: bxi in VPs
  NULL                  Load bxi according to user file    Register Cx
  NULL                  Load bxi according to user file    Register Cx
Table 5.5: Example of Case 2
Sentence: TV in the bedroom
[TV] is an existing Cx    Load bxi according to user preference
[TV] is a new Cx          Register [TV]
The actions for Case 2 are summarized in Table 5.4.
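A compact way to read Case 2 is as a single branch on whether the Cx is known. In the following Python sketch the user file is modeled as a plain dictionary from component to preferred behavior; that representation and the function name are assumptions, not the format used by the system.

```python
def handle_case2(components, user_file, component):
    """Case 2: the command has a Cx but no bxi."""
    if component in components:
        # Existing Cx: load the behavior recorded in the (assumed) user file.
        return "load", user_file.get(component)
    # New Cx: register it; there is no behavior to load yet.
    components.add(component)
    return "register", None


components = {"TV"}
user_file = {"TV": "turn on"}
print(handle_case2(components, user_file, "TV"))     # ('load', 'turn on')
print(handle_case2(components, user_file, "radio"))  # ('register', None)
```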
5.1.3 Case 3
In this case, we pay attention to bxi: the device type is missing and only the behaviour is given, as shown in Table 5.6. The user's input includes only the behaviour bxi, so we can expect bxi to dominate the flow of our system.
There are two possible actions. First, the user gives a bxi that the system does not recognize. Our system registers the bxi without a device ID, because there is no Cx in the user's input. Second, the bxi from the user's sentence exists in the dataset. Our system can then load the existing Cx from the device ID of bxi according to the behaviour mapping table.
We take a sentence as an example (Table 5.7).
Table 5.6: Case 3
                        I: Cx in NPs
                        NULL                              NULL
III: bxi in VPs
  Existing bxi          Load device ID from bxi           Load device ID from bxi
  New bxi               Register bxi, but no device ID    Register bxi, but no device ID
Table 5.7: Example of Case 3
Sentence: Turn it on in the bedroom
[Turn on] is an existing bxi    Load device ID from [Turn on]
[Turn on] is a new bxi          Register [Turn on]
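The two outcomes of Case 3 can be sketched as a lookup in the behaviour mapping table. Here the table is reduced to a dictionary from behavior to a single device ID, with None standing for "no device ID yet"; this simplification and the function name are assumptions made for the example.

```python
def handle_case3(behavior_table, behavior):
    """Case 3: the command has a bxi but no Cx."""
    if behavior in behavior_table:
        # Existing bxi: recover the device from the stored device ID.
        return "load", behavior_table[behavior]
    # New bxi: register it without a device ID, since no Cx was given.
    behavior_table[behavior] = None
    return "register", None


behavior_table = {"turn on": "TV"}
print(handle_case3(behavior_table, "turn on"))   # ('load', 'TV')
print(handle_case3(behavior_table, "turn off"))  # ('register', None)
```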
5.1.4 Case 4
There is one situation we do not handle. A sentence without any bxi or Cx is not a legal command. Because a command lacking this information cannot be completed, we exclude this situation.
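Taken together, Sections 5.1.1 to 5.1.4 amount to a classification on two questions: is there a Cx, and is there a bxi? A short Python sketch of that classification follows; the function name and the use of None or an empty list for a missing element are assumptions.

```python
def classify_case(component, behaviors):
    """Return the case number (1-4) for a parsed command."""
    has_component = component is not None
    has_behavior = bool(behaviors)
    if has_component and has_behavior:
        return 1  # both Cx and bxi are present
    if has_component:
        return 2  # only Cx is present
    if has_behavior:
        return 3  # only bxi is present
    return 4      # neither: not a legal command


print(classify_case("TV", ["turn on"]))  # 1
print(classify_case(None, []))           # 4
```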
5.2 Discussion
We now compare dialogue systems with and without REM. As mentioned in Section ??, the REM fills in the missing information in a task. If REM can find a solution, a dialogue manager with REM receives a task with Cx, bx and Dx. A dialogue manager without REM has to query a database or ask the user for the missing information.
5.2.1 Finite-state approach with REM
A dialogue manager using the finite-state approach needs a series of ordered words or phrases as input. A dialogue manager with an FSM checks the elements of a task in order. If the input is incomplete, the dialogue manager may get stuck.
Because the REM deals with the missing information, the dialogue manager receives a task with full information. A dialogue manager with REM can therefore shorten the dialogue and still work well.
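The difference can be made concrete with a toy finite-state check that walks through the task elements in a fixed order. The order of the slots, the function name and the returned strings are assumptions for illustration only.

```python
SLOT_ORDER = ("bx", "Cx", "Dx")  # assumed order in which the FSM checks elements


def run_finite_state_dialogue(task):
    """Advance through the slots in order; stop at the first missing one."""
    for slot in SLOT_ORDER:
        if task.get(slot) is None:
            return "stuck at " + slot
    return "complete"


raw_task = {"bx": "turn on", "Cx": None, "Dx": "bedroom"}
print(run_finite_state_dialogue(raw_task))     # stuck at Cx

filled_task = dict(raw_task, Cx="TV")          # as if REM had supplied the Cx
print(run_finite_state_dialogue(filled_task))  # complete
```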
5.2.2 Frame-based approach with REM
A dialogue manager using the frame-based approach fills a set of designed slots. The elements of a task that we have determined are placed into these slots. Without REM, a dialogue manager with the frame-based approach asks the user for any missing information.
The sub-models in REM can infer the missing information. The dialogue manager can then use the result from REM to fill the slots instead of asking the user.
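A minimal sketch of this slot-filling order is given below: a value from the task is used first, then an inferred value, and the user is asked only as a last resort. The callables infer and ask_user are hypothetical stand-ins for REM and for a user prompt, not interfaces of the actual system.

```python
def fill_frame(task, infer, ask_user):
    """Fill the Cx, bx and Dx slots, asking the user only when inference fails."""
    frame = {}
    for slot in ("Cx", "bx", "Dx"):
        value = task.get(slot)
        if value is None:
            value = infer(slot, task)   # stand-in for REM
        if value is None:
            value = ask_user(slot)      # fallback used without a REM answer
        frame[slot] = value
    return frame


questions = []
frame = fill_frame(
    {"Cx": None, "bx": "turn on", "Dx": "bedroom"},
    infer=lambda slot, task: "TV" if slot == "Cx" else None,
    ask_user=lambda slot: questions.append(slot) or "unknown",
)
print(frame)      # {'Cx': 'TV', 'bx': 'turn on', 'Dx': 'bedroom'}
print(questions)  # []
```

In this run the list of questions stays empty because the inferred value fills the only missing slot.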
Chapter 6 Conclusion
The REM reduces the complex process by which a dialogue manager queries data about missing information. After the REM has processed a task, the dialogue manager does not need to ask the user.
The basic concept of REM is a finite state machine with a series of decisions. Because the form of a task is fixed, we apply a finite state machine as the decision mechanism, even though finite state machines fall short in scalability. The sub-models deal with the missing information and the unknown elements in a task.
When information is missing, the REM works from the information that the task does contain, such as the device or the behavior, and chooses different actions accordingly. With the behavior mapping table or the component mapping table, the REM can find the corresponding solution for a task with missing information.
Based on the determined task, we design the REM with distributed agents for missing information. The form of a task helps REM diagnose the state of the task, so that the sub-models in REM can find the solution.
The Rational Exception Model (REM) helps the dialogue manager analyze the state of the dialogue. The sub-models in REM query the missing information and determine the logic of the behavior for the corresponding device with distributed agents. With this approach, the dialogue manager does not need to query for the missing information itself.
Chapter 7 Future Work
Our future work is to refine the mapping table with a more efficient approach.
In this work we chose a simple approach, a weight column in the mapping table, to infer the solution.
For this purpose, we could query a database online and determine the relationship between the keyword we searched for and the results. For example, if we search for the keyword "turn on", a result could be the short sentence "turn on the radio". This short sentence includes the device "radio" that corresponds to the keyword "turn on". We can refine our mapping table by determining the relationship between "turn on" and "radio".
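One possible reading of this idea, offered only as a sketch, is to count how often each known device appears in result sentences that contain the keyword. The snippets below are made-up inputs, no real search service is called, and the whitespace tokenization (which ignores punctuation and multi-word device names) is a simplifying assumption.

```python
from collections import Counter


def count_device_mentions(keyword, snippets, devices):
    """Count known devices appearing in snippets that contain the keyword."""
    counts = Counter()
    for snippet in snippets:
        text = snippet.lower()
        if keyword.lower() not in text:
            continue
        words = set(text.split())
        for device in devices:
            if device.lower() in words:
                counts[device] += 1
    return counts


snippets = ["turn on the radio", "Turn on the TV", "turn on the radio now"]
print(count_device_mentions("turn on", snippets, ["radio", "TV"]))
```

The resulting counts could then serve as evidence when adjusting the weights between a behavior such as "turn on" and a device such as "radio".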
References
[1] D. Traum and S. Larsson, "The information state approach to dialogue management," in Current and New Directions in Discourse and Dialogue, 2003, pp. 325-353.
[2] A. Cheyer and D. Martin, "The open agent architecture," Journal of Autonomous Agents and Multi-Agent Systems, vol. 4, no. 1, pp. 143-148, March 2001.
[3] M. F. McTear, "Spoken dialogue technology: enabling the conversational user interface," ACM Comput. Surv., vol. 34, pp. 90-169, March 2002. [Online]. Available: http://doi.acm.org/10.1145/505282.505285
[4] M. F. McTear, "Modelling spoken dialogues with state transition diagrams: experiences with the CSLU toolkit," in Proc. 5th International Conference on Spoken Language Processing, 1998, pp. 1223-1226.
[5] B. Hansen, D. G. Novick, and S. Sutton, "Systematic design of spoken prompts," in Proceedings of the SIGCHI conference on Human factors in computing systems: common ground, ser. CHI '96. New York, NY, USA: ACM, 1996, pp. 157-164. [Online]. Available: http://doi.acm.org/10.1145/238386.238466
[6] H. Aust, M. Oerder, F. Seide, and V. Steinbiss, "The Philips automatic train timetable information system," Speech Commun., vol. 17, pp. 249-262, November 1995. [Online]. Available: http://portal.acm.org/citation.cfm?id=219030.219079
[7] D. S.-H. C. Chin-Han Tsai, "A study on speech dialogue system and dialogue strategy," July 2005.
[8] M. Guo, Y. Liu, and J. Malec, "A new Q-learning algorithm based on the Metropolis criterion," IEEE Trans Syst Man Cybern B Cybern, vol. 34, pp. 2140-2143, 2004. [Online]. Available: http://www.biomedsearch.com/nih/new-Q-learning-algorithm-based/15503510.html
[9] E. Levin, R. Pieraccini, and W. Eckert, "Using Markov decision process for learning dialogue strategies," in Proc. ICASSP, 1998, pp. 201-204.
[10] W. Eckert, E. Nöth, H. Niemann, and E.-G. Schukat-Talamazzini, "Real users behave weird - experiences made collecting large human-machine-dialog corpora," 1995.
[11] S. Sutton, R. Cole, J. D. Villiers, J. Schalkwyk, P. Vermeulen, M. Macon, Y. Yan, E. Kaiser, B. Rundle, K. Shobaki, P. Hosom, A. Kain, J. Wouters, D. Massaro, and M. Cohen, "Universal speech tools: The CSLU toolkit," in Proceedings of the International Conference on Spoken Language Processing (ICSLP), 1998, pp. 3221-3224.
[12] M. F. McTear, "Using the CSLU toolkit for practicals in spoken dialogue technology," in University College London, 1999, pp. 1-7.
[13] C. L. Liu, Elements of discrete mathematics, 1977.
[14] J. Chu-Carroll, "Mimic: an adaptive mixed initiative spoken dialogue system for information queries," in Proceedings of the sixth conference on Applied natural language processing, ser. ANLC '00. Stroudsburg, PA, USA: Association for Computational Linguistics, 2000, pp. 97-104. [Online]. Available: http://dx.doi.org/10.3115/974147.974161
[15] J. Chu-Carroll and M. K. Brown, "An evidential model for tracking initiative in collaborative dialogue interactions," User Modeling and User-Adapted Interaction, vol. 8, pp. 215-254, February 1998. [Online]. Available: http://portal.acm.org/citation.cfm?id=598279.598319
[16] J. G. Amores, G. Pérez, and P. Manchón, "Mimus: a multimodal and multilingual dialogue system for the home domain," in Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, ser. ACL '07. Morristown, NJ, USA: Association for Computational Linguistics, 2007, pp. 1-4. [Online]. Available: http://portal.acm.org/citation.cfm?id=1557769.1557771
[17] P. Manchón, C. del Solar, G. Amores, and G. Pérez, "Multimodal interaction analysis in a smart house," in Proceedings of the 9th international conference on Multimodal interfaces, ser. ICMI '07. New York, NY, USA: ACM, 2007, pp. 327-334. [Online]. Available: http://doi.acm.org/10.1145/1322192.1322249
[18] T. Project., Talk and Look: Linguistic Tools for Ambient Linguistic Knowledge, 2007. [Online]. Available: www.talk-project.org
[19] P. M. Portillo, G. P. García, and G. A. Carredano, "Multimodal fusion: a new hybrid strategy for dialogue systems," in Proceedings of the 8th international conference on Multimodal interfaces, ser. ICMI '06. New York, NY, USA: ACM, 2006, pp. 357-363. [Online]. Available: http://doi.acm.org/10.1145/1180995.1181061
[20] N. Chomsky, Syntactic structures, 1957.
[21] D. L. Martin, A. Cheyer, and G.-L. Lee, "Development tools for the open agent architecture," in PAAM 96. SRI AI Center, April 1996.
[22] L. Ahrenberg, A. Jönsson, and N. Dahlbäck, "Discourse representation and discourse management for a natural language dialogue system," in Proceedings of the Second Nordic Conference on Text Comprehension in Man and Machine, Taby, 1990.
[23] J. Hawkins, "Definiteness and indefiniteness: A study in reference and grammaticality prediction," 1978.