Mixed-Initiative Dialog Management for Speech-based Interaction with Graphical User Interfaces


This article explores conversation-and-control, a new approach to using speech as an input modality for controlling graphical user interfaces (GUIs), which enables direct manipulation of widget functions through spoken commands.


The system is particularly useful for users with disabilities that limit their ability to operate a mouse and keyboard.

The conversation-and-control approach has several advantages:
(1) The mixed-initiative dialog model enables direct manipulation of widgets.
(2) Each function is bound to a single command, which reduces the average number of commands required per task.
(3) The system offers two methods for dealing with recognition errors: first, a set of heuristic procedures analyzes an erroneous recognition result in order to avoid rejecting it outright, i.e., to spare the user from repeating commands; second, the system can ask clarification questions in ambiguous situations, thereby eliciting more information from the user.
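The two error-handling strategies above can be illustrated with a minimal sketch. This is not the system's actual implementation; it assumes a recognizer that returns scored command hypotheses, and all names and thresholds (Hypothesis, resolve_command, ACCEPT_THRESHOLD, REJECT_THRESHOLD) are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    command: str       # recognized command string
    confidence: float  # recognizer confidence in [0, 1]

# Illustrative thresholds; a real system would tune these empirically.
ACCEPT_THRESHOLD = 0.8  # at or above this: execute the command directly
REJECT_THRESHOLD = 0.4  # below this: ask the user to repeat

def resolve_command(hypotheses):
    """Return ('execute', cmd), ('clarify', candidates), or ('repeat', None).

    Instead of rejecting mid-confidence input outright (which would force
    the user to repeat the command), ambiguous results trigger a
    clarification question listing the closest competing candidates.
    """
    if not hypotheses:
        return ("repeat", None)
    best = max(hypotheses, key=lambda h: h.confidence)
    if best.confidence >= ACCEPT_THRESHOLD:
        return ("execute", best.command)
    if best.confidence >= REJECT_THRESHOLD:
        # Ambiguous zone: collect hypotheses scoring close to the best
        # one and ask the user to choose among them.
        candidates = [h.command for h in hypotheses
                      if best.confidence - h.confidence < 0.2]
        return ("clarify", candidates)
    return ("repeat", None)
```

A dialog manager built this way repeats a command request only in the lowest-confidence case; the middle band is recovered through a clarification turn rather than a rejection.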

The results of a user study indicated a 16.8% reduction in task completion time for the conversation-and-control approach.


Early studies of speech-controlled GUIs revealed a 400% performance decrease compared to mouse control. Despite technological advances in speech recognition systems, a performance decrease of about 50% remains. The performance decrease in existing speech-controlled GUIs has two main causes: the high average number of commands per task, and recognition errors that lead to rejections or misunderstandings of spoken commands, forcing users to repeat them.

Users also experienced precision problems with the speech-controlled mouse: after saying "stop," the cursor kept moving until the speech recognizer finished processing the utterance.