Bookmark: E.Taylor.89.3
Title: Introduction: Dialogue and Multimodal Dialogue
Section: Prologue: Dialogue and Useful Metaphors
Author: Taylor, M. M.
Author: Neel, F.
Author: Bouwhuis, D. G.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 1
Pages: 3-10
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: This book derives from a workshop entitled The Structure of Multimodal Dialogues including Voice which was held in September 1986. The introductory Chapter describes the objectives of the workshop and the organization of the book. Theorists and practitioners of computer-human interaction and students of human dialogue processes met for a week in an attempt to develop both general principles and practical recommendations for multimodal dialogue systems. The book is not a proceedings volume that simply makes public the papers presented, but is intended as a coordinated view of the topics discussed. Authors were asked to amend their presentations to take account of discussions and other presentations, and many did so. The book also includes three overview Chapters based largely on discussions at the workshop, that take an integrated view of some topics that are important in the design of multimodal dialogue systems.

Bookmark: E.Taylor.89.11
Title: Metaphors for Interface Design
Section: Prologue: Dialogue and Useful Metaphors
Author: Hutchins, E.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 2
Pages: 11-28
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: Computer system designers and computer users frequently utilize metaphors as organizing structures for dealing with the complexity of behavior of human/computer interfaces. This paper considers four metaphors concerning the mode of interaction between user and machine: the conversation metaphor, the declaration metaphor, the model world metaphor and the collaborative manipulation metaphor. It is argued that the key to the functional properties of an interface lies in the reference relations between the expressions in the interface language and the things to which the expressions refer. The ways in which such metaphors are suggested by advances in I/O technology and the ways they constrain the possibilities we see in technology are discussed. Each of the metaphors discussed promotes a particular type of reference relation. Furthermore, because the computer is a medium in which types of reference relations that are not possible in ordinary language can be realized, the space of interface metaphors is quite likely much larger than we presently imagine it to be.

Bookmark: E.Taylor.89.33
Title: Speech Acts in Multimodal Dialogues
Section: Part 1: User Models and Belief Structures
Author: Perrault, C. R.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 3
Pages: 33-46
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: One of the central issues to be addressed in basing a theory of speech acts on independently motivated accounts of propositional attitudes (belief, knowledge, intentions, and so forth) and action is the specification of the effects of communicative acts. The very fact that speech acts are largely conventional means that specifying their consequences requires taking into consideration possible exceptions to the conventional use of the utterances (e.g., insincere, indirect and ironic uses). Earlier approaches to the problem have attempted to deal with these exceptions by specifying conditions that become true after an utterance, independently of the mental state of the speaker and hearer at the time of utterance. We will argue that there are problems with this approach and sketch a solution to the problem within the framework of Reiter's non-monotonic Default Logic. Default rules are used to embody simple theories of belief adoption, of action observation, and of the relation between the form of a sentence and the attitudes it is used to convey. This allows quite a simple picture of the relation between certain illocutionary and perlocutionary acts. The emphasis is on uses of declarative sentences.

Bookmark: E.Taylor.89.47
Title: Information Dialogues as Communicative Actions in Relation to Partner Modelling and Information Processing
Section: Part 1: User Models and Belief Structures
Author: Bunt, H. C.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 4
Pages: 47-73
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: Dialogues for factual information exchange are considered as sequences of communicative actions, having the purpose of contributing to the desired information transfer, or of ensuring successful communication (or both). The function of a communicative action is defined by its immediate effects on the beliefs and intentions of both participants. A variety of such functions is identified on the basis of both experimental data and assumptions about the logical properties of beliefs and intentions. The resulting framework of communicative action can be viewed as a specific elaboration of speech act theory, where the notion of illocutionary force is defined as a context-changing function. The chapter discusses a number of theoretical aspects of this model of communication and considers its implementation in the TENDUM dialogue system. The implementation suggests that notions like illocutionary force and illocutionary act are formally superfluous, but that they are useful in designing and describing utterance interpretation and generation processes.

Bookmark: E.Taylor.89.75
Title: Studying Arguments to Gain Insight into Discourse Structure
Section: Part 1: User Models and Belief Structures
Author: Cohen, R.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 5
Pages: 75-84
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: In constructing a computational model to comprehend discourse, one task to be solved is how to interpret one uninterrupted set of utterances from a conversant. This paper presents an outline of a computational model to analyze arguments -- one-way dialogue, where the speaker tries to convince the hearer of a particular point of view. We then study this model to provide insight into two questions raised by this workshop: (i) what kinds of protocols to communication exist and (ii) is there a grammar for dialogues. We conclude with some comments on possible extensions to the model for the general case of two-person discourse.

Bookmark: E.Taylor.89.85
Title: The Structure of Intelligence in Dialogue
Section: Part 1: User Models and Belief Structures
Author: Edwards, J. L.
Author: Mason, J. A.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 6
Pages: 85-105
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: An approach to structuring the intelligence in dialogue involving multiple forms and modalities is described. The chapter focuses on three key aspects of intelligent dialogue as described in Edwards and Mason (1988): Control, Connectivity and Models. Links between a hypothetical dialogue and high-level goals are presented and examples of the kind of control mechanisms that would be required to manage such connectivity in the dialogue are illustrated. Models are seen as one way of helping to establish connectivity, but are treated only briefly here. An epilogue to the workshop presentation discusses Bunt's (Ch. 4) "appropriateness conditions" in the broader context of purposeful behaviour and shows how they might be represented more parsimoniously in system self models and in systems' models of the human user. Other workshop contributions are discussed as they might support structuring systems for intelligent dialogue.

Bookmark: E.Taylor.89.107
Title: Planning and Discourse
Section: Part 1: User Models and Belief Structures
Author: Shadbolt, N. R.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 7
Pages: 107-113
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: Humans engaged in co-operative problem solving are able to use language to inform and co-ordinate their activity. This chapter argues that language used in this context can best be understood by describing the planning process underlying the problem solving. Interesting discourse arises from complex planners solving difficult problems. The paper discusses advances in AI Planning that may be relevant to discourse production and interpretation. It describes various discourse phenomena in terms of the properties of Planning Systems. The characteristics of discourse considered include: hierarchy, focus, topic shift, turn taking and recovery from failure.

Bookmark: E.Taylor.89.121
Title: Convention versus Intention
Section: Part 2: Discourse Structure and Processing
Author: Reichman, R.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 8
Pages: 121-134
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: This chapter was written after the workshop, as a counterpoise to some of the positions expressed. It discusses the differences between two main streams of thought in dialogue theory: speech acts and conversational analysis. The speech act description of dialogue concentrates on the intentions and beliefs of the participants, whereas conversational analysis is more linguistic and grammatical. The two approaches are orthogonal, and have different objects of study. Speech act theory deals in propositions and the beliefs of the participants in those propositions and in each other's beliefs about them (illustrated best by Perrault, Ch. 3, and Bunt, Ch. 4). Conversational analysis, on the other hand, deals in real samples of dialogue, to determine the rules people use to signal the critical events in the discourse. Several example dialogues are analysed to illustrate some of the methods of conversational analysis.

Bookmark: E.Taylor.89.135
Title: The Viability of Conversational Grammars
Section: Part 2: Discourse Structure and Processing
Author: Good, D. A.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 9
Pages: 135-144
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: The idea that grammatical descriptions can be extended beyond the sentence to the level of discourse or conversation has attracted a number of researchers over the past three decades for some rather obvious reasons. As a basis for the description of discourse or conversational structure, syntactic models have seen some success, but they have not been without their detractors. In this chapter, I will briefly review the successes and criticisms of this framework, and consider the general prospects of the position. Particular attention will be paid to the nature of the "lexicon" over which such a syntax for conversation might be specified, and how the appropriate syntactic categories might be established. On the basis of these considerations, it will be argued that the only conversational grammars that can be achieved will provide little insight into either conversational structure or the processes by which conversationalists participate in any exchange.

Bookmark: E.Taylor.89.145
Title: Knowledge for Communication
Section: Part 2: Discourse Structure and Processing
Author: Airenti, G.
Author: Bara, B. G.
Author: Colombetti, M.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 10
Pages: 145-158
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: The chapter presents a cognitive model of dialogue, whose main feature is the integration of the conversational and behavioral aspects of communication. This is realized through three types of knowledge structures: linguistic, conversational and behavioral games. Games are knowledge structures having two basic features. First, such knowledge structures should allow for the representation of interpersonal plans, i.e. action plans whose constituent actions are assigned to two or more actors. Second, the very nature of communication requires that these knowledge structures are assumed to be shared, i.e. mutually believed, by the actors. Games are the basis of the inference processes involved both in generation and understanding of dialogue utterances. They implement cooperation between actors. Two different kinds of cooperation, conversational and behavioral, are analyzed.

Bookmark: E.Taylor.89.159
Title: Response Timing in Layered Protocols: A Cybernetic View of Natural Language
Section: Part 2: Discourse Structure and Processing
Author: Taylor, M. M.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 11
Pages: 159-172
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: Dialogue consists of the passing of coded messages between two partners. The messages must be coded to combat possible errors due to channel distortion or to mismatch between models held by the partners. Redundancy in the code can lead to reduction in the probability of error, but only at the cost of delay in decoding. Mismatch between models can be corrected only through feedback, and delay in a feedback loop can lead to instability. Accordingly, effective coding must be done in several stages or levels of abstraction, each with the possibility of independent feedback if extraneous factors do not introduce excessive delay. Mutually agreed protocols determine both the initial coding of a message (its syntax), and the manner in which feedback is given. The performance of multi-level communication channels can be improved by using two decoding systems in parallel, one that gives fast plausible interpretations that may fit into ongoing higher-level messages, and one that uses all the redundancy in the coding to provide more slowly as accurate an interpretation as the data allows. The latter is useful especially when the higher-level messages are themselves difficult to interpret. To improve the distortion-resistance of the coding, or to circumvent limitations of the physical channels, messages may be split across channels (diviplexed), or several messages may be combined onto a single channel (multiplexed). The design of protocols that control diviplexing and multiplexing is at the heart of the design of intelligent interfaces.

Bookmark: E.Taylor.89.173
Title: A Generative Grammar for Local Discourse Structure
Section: Part 2: Discourse Structure and Processing
Author: Fawcett, R. P.
Author: Taylor, M. M.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 12
Pages: 173-182
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: Fawcett's Local Discourse Grammar uses a formalism based on system networks and on flow charts. The chapter is in two parts: the first, written by Fawcett, introduces and outlines the structure of the grammar. The second, written by Taylor, illustrates the use of the formalisms by means of some examples. The grammar is unusual in that it provides for mutually interacting parallel pathways, and for interruption, both self-interruption by the talker and interruption by the partner. The grammar is intended for direct implementation on a computer, and could serve as part of the structure of a multi-stream interface.

Bookmark: E.Taylor.89.183
Title: Pattern Processing and Machine Intelligence Techniques for Representing Dialogues
Section: Part 2: Discourse Structure and Processing
Author: Goillau, P. J.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 13
Pages: 183-188
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: What techniques have emerged for representing patterned behaviour in dialogues? The history and scope of two contenders will be outlined. First, consider the rule-based approach which has evolved from an Artificial Intelligence philosophy. Second, consider the pattern-based approach and the development of self-learning machines from early Perceptrons to the current interest in Boltzmann machines. These approaches can be thought of as the extremes of a representational continuum, although the dialogue to be represented remains the same. An interesting philosophical paradox is raised.

Bookmark: E.Taylor.89.189
Title: The Adaptive, Dynamic and Associative Memory Model: A Possible Future Tool for Vocal Human-Computer Communication
Section: Part 2: Discourse Structure and Processing
Author: Beroule, D.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 14
Pages: 189-202
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: Some limitations of a current computer-human dialogue system are examined, and partly imputed to the influence of the computer memory concept on information processing methods. An alternative memory model is propounded, which involves new mechanisms of information storage and retrieval. The implications of these mechanisms in the processing of temporal patterns are described and illustrated by a software simulation now applied to Automatic Speech Recognition. It is suggested that such a model might form the base of a future dialogue system which would be inherently robust to noise, capable of unsupervised learning, and able to coordinate in real-time many devices participating in communication.

Bookmark: E.Taylor.89.209
Title: Integrated Interfaces Based on a Theory of Context and Goal Tracking
Section: Part 3: Parallel Communication
Author: Reichman, R.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 15
Pages: 209-227
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: The concept of an integrated interface depends on the notion that human and computer are both agents, and are in communication with each other. Human-human communication provides a good basis for developing a model for human-computer communication. Good communication depends on the ability of each partner to track the focus and context of the dialogue. In each phase of the dialogue the partners need to know (1) to what portion of the dialogue the current action relates, and (2) what standard methods should be used to execute or to interpret the current action. The roles of anaphora and cue words in maintaining and shifting context are discussed and illustrated. Methods suitable for use in multi-window graphics systems, where the windows support possibly related tasks, are proposed, and a sketch is presented for an architecture combining graphic and linguistic interaction.

Bookmark: E.Taylor.89.229
Title: Protocols for Group Coordination in Participant Systems
Section: Part 3: Parallel Communication
Author: Chang, E.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 16
Pages: 229-240
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: A Participant system is a computer system that facilitates the simultaneous interaction of several persons working cooperatively in solving shared intellectual tasks. What kinds of protocols would such a system require? Cantata is an experimental participant system implemented on a network of Apple Macintosh computers. Each person has a multi-window environment on a single computer. The windows support discussion on multiple topics in parallel, and each participant has the opportunity of seeing the windows of the other participants. Cantata has facilities whereby the participants can (serially) have access to shared resources, can keep historic track of the individual threads of the multiple discussions, and may indicate their attention status to a topic. Experience with the Cantata system highlights some of the problems inherent in multiple parallel communications.

Bookmark: E.Taylor.89.241
Title: Asynchronous Parallelism in the Formation of Non-Linear Phonology
Section: Part 3: Parallel Communication
Author: Edmondson, W.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 17
Pages: 241-247
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: The notational formalism of non-linear phonology is introduced and its ability to express asynchronous parallelism in the phonological patterning of speech sounds is described. It is shown how the notion of segment can be dispensed with and also how the formalism can be generalized to other aspects of speech communication. One step in the argumentation is that multimodal communication (speech and gesture) can be expressed in the formalism. This leads to the conclusion that communication-like behaviour, such as found in Human-Computer Interaction, can also be expressed in the formalism. Multimodal HCI, obviously asynchronously parallel in structure, can thus be analyzed because an appropriate notational device already exists (albeit still in development) in the formalism of non-linear phonology.

Bookmark: E.Taylor.89.249
Title: Visible Speech Signals: Investigating Their Analysis, Synthesis and Perception
Section: Part 3: Parallel Communication
Author: Brooke, N. M.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 18
Pages: 249-258
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: Visual cues can convey useful information about phonetic aspects of speech production. This is the basis for lip-reading as a means of improving speech perception in noisy environments, or where there is hearing impairment. In addition, in natural speech production, linguistically informative face, head and body movements frequently accompany the purely articulatory gestures. i) Videotaped recordings have been made of the faces of talkers whose heads and bodies were not unnaturally constrained. The speakers' movements were measured semi-manually and analysed in order to extract the time-varying articulatory displacements of points around the lips and jaw; the procedures also involved the computation of the time-varying movements of the whole head arising from accompanying non-articulatory gestures. Automated measurements of mouth shapes during speech production have also been made using digital image-processing techniques. ii) A suite of computer programs has been developed that can generate and display computer-graphics simulations of the facial movements associated with speech production. The animated outline diagrams can be displayed at rates down to or below real-time. The facial features, facial shapes and utterances to be simulated can be independently altered. The diagrams can also include features like the teeth, which may be only intermittently or partially visible. The displays were primarily designed for, and applied to, the investigation of visual and audio-visual perception. Whilst studies have so far been confined to the purely articulatory gestures, the measurement and synthesis techniques could both be adapted to investigate the non-articulatory movements associated with speech production.

Bookmark: E.Taylor.89.259
Title: Integrating Voice, Visual and Manual Transactions: Some Practical Issues from Aircrew Station Design
Section: Part 3: Parallel Communication
Author: Taylor, R. M.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 19
Pages: 259-265
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: The military cockpit provides practical examples of multi-modal dialogues between humans and machines involving speech. Design issues emerging from recent United Kingdom cockpit simulator studies and flight trials with interactive speech technology are described. Interactions and interference between sensory and response modalities are a primary consideration governing dialogue design and transaction structure. A short commentary is provided on some of the human engineering requirements arising from this practical work, including requirements for modality choice, prompts, feedback, dialogue structure and transaction management.

Bookmark: E.Taylor.89.273
Title: Analyzing Conversation (in Three Languages)
Section: Part 4: Properties of Human Dialogues
Author: Taylor, I.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 20
Pages: 273-286
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: Samples of conversation in three unrelated language/cultures -- English, Chinese, and Korean -- have been analyzed, using the same units (e.g., utterance, turn, segment), functional analysis (e.g., "begins a topic"), and linking relations (e.g., each utterance to its predecessor). The surface features, structures, and functions of conversation are similar in the three languages.

Bookmark: E.Taylor.89.287
Title: Speech is More Than Just an Audible Version of Text
Section: Part 4: Properties of Human Dialogues
Author: Hunt, M. J.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 21
Pages: 287-299
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: Human-computer communication by text using keyboards and screens is much more firmly established than voice I/O. Partly because of this, voice tends to be regarded as an audible version of text. This view is misleading: it leads to an underestimation of both the difficulties and the potential of voice I/O. Speech encodes information on factors such as speaker identity and attitude and sentence structure in ways that have no counterpart in text. The style of language appropriate for speech is different from that appropriate for text. Speech output and input systems tend to ignore those aspects of the speech signal that are not directly represented in text, and they frequently use a style and syntax suited to text rather than to speech. In the case of speech output, the technology needed to exploit the additional aspects of the speech signal is already appearing. In the case of speech input, the technology is not yet developed, but the ability to manipulate features such as voice quality in speech output systems may be a powerful tool in learning how to exploit such features in speech input systems.

Bookmark: E.Taylor.89.301
Title: Intelligent Speech Synthesis as Part of an Integrated Speech Synthesis/Automatic Speech Recognition System
Section: Part 4: Properties of Human Dialogues
Author: Tatham, M. A. A.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 22
Pages: 301-312
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: This chapter concerns the provision of a voice man-machine interface in dialogue systems such as the interactive database inquiry systems described elsewhere in this book. Now that we are moving away from the pioneering stage of the development of speech synthesis and automatic speech recognition, we are beginning to revise our ideas concerning the standards of these systems. Specifically, this paper outlines research toward the next generation of speech synthesiser.

Bookmark: E.Taylor.89.313
Title: Declarative Question Acts: Two Experiments on Identification
Section: Part 4: Properties of Human Dialogues
Author: Beun, R. J.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 23
Pages: 313-321
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: An empirical study is presented on the recognition of questions with a declarative sentence-type in Dutch. These questions were isolated from previously recorded dialogues and mixed with answers. In the first experiment subjects had to determine from tape the correct function (answer or question) of the isolated utterances. The function in 35% of the cases was identified correctly. A second experiment was carried out to determine which indicators play a decisive role in the responses of the subject. Therefore, possible linguistic indicators were removed from the utterances and subjects were asked to perform the same task as in the previous experiment. Here, it followed that important question-indicators were given by prosodic characteristics and pragmatic particles such as 'en' ('and') at the beginning of the utterance and the word 'dus' ('so').

Bookmark: E.Taylor.89.323
Title: Computer-Human Communication
Section: Part 4: Properties of Human Dialogues
Author: Morel, M. A.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 24
Pages: 323-330
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Abstract: The data for this study were collected in Paris, in a telephone Information Centre for the general public, in December, 1984. The corpus has been completely transcribed into writing. The experiment comprised three phases: I) The speaker communicated with a human operator. II) and III) The operator's voice was mechanised to give the speaker the impression of talking with a machine. In Phase III, in order to make himself better understood, the speaker was required to rephrase some of his utterances. The goal of the study is to give an account of the linguistic features found in Phases II and III compared to those of Phase I, and to show the changes taking place in the verbal behaviour of speakers when they communicate with a machine. It was possible to draw a number of conclusions: in Phases II and III the speaker pays more attention to the language; the pronunciation is better; the delivery is slower; the utterances are concise and stereotyped; there are many ellipses; the verb is often lacking and the sentence is reduced to the constituents that are strictly necessary. But the dialogue is very stiff: people often seem embarrassed and they don't realize what has not been understood by the machine, when they are asked to rephrase their message.

Bookmark: E.Taylor.89.331
Title: Interactive Strategies for Conversational Computer Systems
Section: Part 4: Properties of Human Dialogues
Author: Waterworth, J. A.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 25
Pages: 331-340
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Absract: The first part of this chapter outlines an approach to dialogue design for vocal, interactive information systems based on an analysis of human-human conversation in contexts similar to the target application. Three behavioural studies are briefly described. These led to the identification of conversational confirmation and correction strategies. It is suggested that behavioural studies of this type are essential, as the realisation of conversational strategies varies with discourse topic. Conversational Analysis provides a methodology for identifying the way conversational functions are achieved in practice. In the second part, an experimental study investigating the effects of various methods of confirmation and correction is reported. This experiment compared transaction times for the vocal specification of varying numbers of items of information, with visual versus auditory feedback, and with global versus piecemeal spoken data entry. The results suggest that the choice of feedback modality does not influence the efficiency of users' performance, at least on the task studied. Confirmation/correction strategy did exert a significant influence, the size of the effect varying with the number of items to be specified. Further empirical work on this topic is required.

Bookmark: E.Taylor.89.347
Title: Pragmatics in a Realization of a Dialogue Module
Section: Part 5: Applications and Architectures
Author: Siroux, J.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 26
Pages: 347-360
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Absract: There are many possible approaches to designing a man-machine dialog system, and in particular, to designing the dialog structure and dialog facilities. For example, it is possible either to start from a priori elements, or to start from observations, if a human-human equivalent system already exists. We based the design of the dialog model used in the CADI software on this latter approach. CADI is able to manage a documentation access dialog in a speech recognition environment -- namely the KEAL system developed at the CNET laboratory in Lannion (France). This was used for a directory inquiries service. In this chapter we first discuss observations made regarding the real application; we then present the CADI dialog model, taking into account some details regarding its pragmatic features and indicating the effect of the speech recognition context. Finally we indicate the limits of that model and more generally the limits of an "approach by observation".

Bookmark: E.Taylor.89.361
Title: Suggestive and Corrective Answers: A Single Mechanism
Section: Part 5: Applications and Architectures
Author: Guyomard, M.
Author: Siroux, J.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 27
Pages: 361-374
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Absract: The importance of cooperative principles is now obvious for man-machine dialogue. Most of the cooperative answer types studied in recent years can be applied to oral dialogue without noteworthy changes. In the first part we briefly recall the most significant issues related to this field. In the second part we propose an improvement for Kaplan's method. Kaplan's method is limited to corrective answers, which are produced when the initial question fails. Our suggestion is to extend it both by controlling the corrective answer subtlety and by building a set of suggestive answers. The suggestive answers allow the dialogue to be restarted in a relevant way. Starting from the initial question, the method tries to progressively relax its constraints until suggestive answers are reached. More precisely, the initial question is broken down into weaker and weaker sub-questions.

Bookmark: E.Taylor.89.375
Title: Dialogue Supervision and Error Correction Strategy for a Spoken Human-Computer Interface
Section: Part 5: Applications and Architectures
Author: Howes, J. R.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 28
Pages: 375-384
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Absract: The aim of the project has been to simulate human pilot responses in a spoken dialogue with an air traffic controller. The emphasis has been on the "humanness" of the dialogue and action taken by the simulated pilot. A brief outline of the dialogue supervision and error correction strategies has been given. These were designed to cater for both student and recogniser errors within the context of multiple dialogues and multiple instruction messages. Given reasonable speech recognition rates by the recogniser, it is possible to correct all of the simulated pilot errors through voice alone.

Bookmark: E.Taylor.89.385
Title: Dialogue Control in Conversational Speech Interfaces
Section: Part 5: Applications and Architectures
Author: Proctor, C.
Author: Young, S.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 29
Pages: 385-398
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Absract: The VODIS project is focused on the area of speech-based interfaces to information systems accessible via the public telephone network. The key features of this environment are a highly restricted topic domain, goal oriented dialogue, continuous speech from a small (150 word) vocabulary, naive untrained users and highly variable recognition rates. Our dialogue design strategy has been motivated by two major requirements: (i) to provide a habitable user interface and (ii) to give direct support of the speech recognition process. Our implementation strategy has been to use the frame approach using our special purpose programming language UFL. For (i) we have been guided by two influences. Firstly, we have analyzed real data obtained from transcripts of an existing (human) information service: the British Rail Timetable Enquiry Service. Secondly, we have adopted a formal measure of "goodness" based on minimisation of transaction time. For (ii), we have developed an adaptive strategy whereby the user responses are syntactically constrained. When recognition is good, user inputs are relatively unconstrained. When performance is poor, questions to the user are made much more specific ultimately to the point where only single word answers are invited and accepted. The purpose of this chapter is to review our overall system design strategy from the point of view of dialogue control and reflect upon the lessons learnt.

Bookmark: E.Taylor.89.399
Title: Relevant Responses in Human-Computer Conversation
Section: Part 5: Applications and Architectures
Author: Vilnat, A.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 30
Pages: 399-406
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Absract: In the general frame of question-answering systems, the dialogues must be as friendly as possible, specifically if the users are not experts of the system. In this context, we have tried to develop a cooperative system, which tries to be a real help for the user. The general purpose is to build a system where the dialogue module will be the pilot, that means that it will be the one which decides what must be done at each step because it is aware of all that has happened during the processing of the query. So the system will be able to handle an "interesting" dialogue with the user: managing the whole dialogue (not only one exchange), explaining the results, asking questions to complete the initial request and, in the future, being able to accept interruptions during the processing of the query, and to take into account both the work already accomplished and, if it is the case, any added information given by the user.

Bookmark: E.Taylor.89.407
Title: LOQUI: How Flexible Can a Formal Prototype Be?
Section: Part 5: Applications and Architectures
Author: Ostler, N. D. M.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 31
Pages: 407-416
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Absract: LOQUI is a formal but flexible prototype, built in Prolog, of a discourse interface to a knowledge base. Three dimensions of flexibility are identified: discourse groundedness, which ranges from textual autonomy to an open knowledge base; coherence, which ranges from coded discourse rules to participant inference; and the source of ideas, which ranges from a priori to application-specific adaptive acquisition. A prototype such as LOQUI can be useful by illustrating the tractability of some aspect of the problem, by demonstrating the extensibility of the ideas, by clarifying concepts, or by providing inspiration or insight. LOQUI uses a two-level hierarchy, each level of which consists of nested exchanges within a dialogue (i.e. the main dialogue has only one level of sub-dialogue). A variety of speech acts are accommodated, and both the acts and the level structure are amenable to extension.

Bookmark: E.Taylor.89.417
Title: Architecture and Knowledge Sources in a Human-Computer Oral Dialogue System
Section: Part 5: Applications and Architectures
Author: Carbonell, N.
Author: Pierrel, J. M.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 32
Pages: 417-429
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Absract: Our chief interest lies in human-computer task-oriented voice dialogues. Specific aspects of the system we are now developing at C.R.I.N. are surveyed here, especially its architecture and the knowledge sources used. This chapter also includes hints at the system implementation as well as a short example illustrating its functioning.

Bookmark: E.Taylor.89.435
Title: Flexibility versus Formality
Section: Part 6: Overview
Author: Taylor, M. M.
Author: Hunt, M. J.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 33
Pages: 435-453
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers
Absract: A "natural" dialogue is distinguished by its flexibility: participants do not feel constrained in what they say or do, or in how they interrupt themselves or the other partner. To design a computer-based dialogue partner that allows such flexibility, we must describe in a formal way just how it can be achieved. There are two possible ways to design constraints that are not perceived as constraints by the human user: to anticipate everything that the human will do, following a study of natural interactions, or to allow the computer to learn appropriate responses by developing models based on its experiences. The development of such models may use methods analogous to those used in speech recognition systems, which vary from formal rule-based systems to informal pattern-developing ones. To date we know neither how to design nor how to let the computer develop the full range of inter-human flexibility, and we must therefore find ways to induce the human user not to attempt fully flexible interaction. One way of doing this is to cause the computer to act in a deliberately inhuman way in some irrelevant aspect of the interaction, such as by giving it an inhuman voice or making it use excessively formal syntax. Despite claims at the workshop that a formal grammar of dialogue is impossible, two presentations pointed the way to a formal design process within which interaction flexibility may be achieved. One of these is a functional grammar of dialogue, the other a structure that suggests where a grammar is to be expected. Both demand that the computer maintain models of the user and of the user's view of itself. Both have explicit provision for interruption by the self and the other, and it is possible that they could be combined to form an integrated description of dialogue that could be used in interface design.

Bookmark: E.Taylor.89.455
Title: Dialogue with a Restricted Partner
Section: Part 6: Overview
Author: Neel, F.
Author: Waterworth, J.
Author: Howes, J.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 34
Pages: 455-466
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers

Bookmark: E.Taylor.89.467
Title: Practical Issues in Dialogue Design
Section: Part 6: Overview
Author: McCann, C. A.
Author: Edmondson, W.
Author: Moore, R. K.
Book: The Structure of Multimodal Dialogue
Editor: Taylor, M. M.
Editor: Neel, F.
Editor: Bouwhuis, D. G.
Date: 1989
Number: 35
Pages: 467-481
City: Amsterdam
Publisher: North-Holland, Elsevier Science Publishers
Copyright: © Copyright 1989 Elsevier Science Publishers