Consistency and Symmetry—and Other Guiding Principles for Designing Speech Applications
Designing dialogs for verbal user-computer interactions is still an art as much as it is a science. Speech application developers iteratively test applications and refine dialogs to minimize communication breakdowns, decrease the time callers need to complete tasks, and improve the naturalness of the conversation.
From the limited experience to date, some guiding principles for designing dialogs have become apparent. These principles are used to generate human factors guidelines for verbal user interfaces. While guidelines often serve as a checklist of what to do and what not to do in a verbal user interface, guiding principles motivate and encourage good dialog designs.
Two early guiding principles have emerged: consistency and symmetry. Presentation consistency refers to the similar structure and format of prompts presented to the caller. Response consistency refers to the similar phrasing of words spoken by the caller in response to the prompts. Symmetry is a type of consistency between prompts and responses, in which callers mimic or parrot the structure, format, and words they hear in the prompt.
Unlike creative writing, in which consistency can sometimes be boring, presentation consistency enables callers to predict what they will hear, so they can be ready to respond appropriately. More experienced callers can bypass much of the prompt to accelerate the dialog. Here are two examples of presentation consistency.
Use parallel syntactic constructs in prompts. It is easier for callers to select words from a menu when the menu's options are presented consistently. Likewise, it is easier for callers to provide values for multiple fields in a form if the prompts are presented consistently. For example, avoid a sequence of prompts that mixes syntactic constructs, such as the following:
Say your name.
What is your birth date?
Say your street address.
In which city do you live?
Please enter your Zip code now.
Instead, use parallel syntactic constructs such as the following:
Your name?
Your birth date?
Your street address?
Your city?
Your Zip code?
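One way to keep form prompts parallel is to generate every prompt from a single template instead of writing each one by hand. The following is a minimal sketch in Python under that assumption; the names `PROMPT_TEMPLATE` and `build_form_prompts` are hypothetical and not part of any speech platform.

```python
# Hypothetical sketch: one shared template keeps all form prompts parallel.
PROMPT_TEMPLATE = "Your {field}?"


def build_form_prompts(fields):
    """Return one prompt per field, all with the same syntactic form."""
    return [PROMPT_TEMPLATE.format(field=field) for field in fields]


# The five fields from the inconsistent example above.
form_fields = ["name", "birth date", "street address", "city", "Zip code"]

for prompt in build_form_prompts(form_fields):
    print(prompt)
```

Because the wording lives in one place, changing the style (for example, to "Say your {field}.") changes every prompt at once, so the set cannot drift out of parallel form.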
Use a consistent format for a prompt in a menu. Presentation consistency enables callers to predict what will be spoken and how to react proactively. Using a consistent format for prompts will also help callers select options more quickly. Structure all menu prompts so that experienced callers can barge in (interrupt the voice menu options) as soon as they hear the menu name, average callers can barge in after they hear the question, and novice callers can respond after they hear the menu option list. A menu prompt should consist of the following sequence:
Speak the menu name. For important menus, the dialog designer should include the menu name in the menu's prompt. The menu name serves as a landmark. A landmark is a speech or non-speech cue that marks a specific location within the dialog structure. When a menu has a name, such as "main menu" or "thermostat," callers can jump to the menu by speaking its name, or they can return to it if they get confused or lost. Also, repeating the menu name to the caller confirms that the caller has reached the correct menu.
Ask a question. Often, this can be achieved with two or three words. This should be enough for experienced callers to say one of the commands without listening to the enumerated options. Novice callers will listen to the enumerated options before speaking their selection.
Enumerate options. List the options so novice callers can hear and select their desired options.
(This is the name of the menu. Expert callers can barge in here because they know the question and the allowable options.)
Say the color you want.
(An average caller can listen to the question and barge in if they know the answer.)
Green, red, or blue?
(A novice caller listens to all of the menu options before responding.)
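The three-part sequence (menu name, question, enumerated options) can be enforced by building every menu prompt through one function. The sketch below is illustrative only: `join_options` and `build_menu_prompt` are hypothetical helpers, and the menu name "Color menu" is an assumed value chosen for the example.

```python
def join_options(options):
    """Format options as a spoken list, e.g. 'Green, red, or blue?'."""
    if not options:
        raise ValueError("a menu needs at least one option")
    if len(options) == 1:
        spoken = options[0]
    elif len(options) == 2:
        spoken = f"{options[0]} or {options[1]}"
    else:
        spoken = ", ".join(options[:-1]) + f", or {options[-1]}"
    # Capitalize the first letter and end with a question mark.
    return spoken[0].upper() + spoken[1:] + "?"


def build_menu_prompt(menu_name, question, options):
    """Return the prompt segments in the fixed order: name, question, options."""
    return [f"{menu_name}.", question, join_options(options)]


# "Color menu" is an assumed menu name used only for illustration.
segments = build_menu_prompt(
    "Color menu", "Say the color you want.", ["green", "red", "blue"]
)
for segment in segments:
    print(segment)
```

Keeping the segments separate, rather than joining them into one string, leaves room for a platform to treat each segment as a distinct barge-in point for expert, average, and novice callers.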
Other types of presentation consistency include consistent formats for help and error messages, formats for items in a form, validation of the caller's input, and the use of audio icons to indicate that "it is the caller's turn to speak," "the computer is working," and "this is a hyperlink that can be invoked by speaking its name."
If the style and format of dialogs are consistent, then an application's persona (the personality of the application) will emerge as consistent and helpful rather than random and confused.