This chapter is from the book

Writing Prompts for Elegance, Speed, and Value

The difference between an adequate speech-recognition system and a great speech-recognition system lies in how the system asks questions and conveys complex information. Great systems do it with an elegance worthy of Audrey Hepburn; their meaning and impact are clear and immediate, and not a single word is wasted. The more elegant a system is, the more intuitively—and quickly—a caller can use it, and the greater value it offers to both clients and callers.

Three common mistakes undermine the elegance, speed, and value of a system: providing unnecessary information, using ambiguous language, and failing to get callers to focus on the essentials.

Unnecessary Information

Too often I hear systems that give all callers information that is useful to only a small percentage of them. For example, I heard a cautionary message on a U.S.-based service (used predominantly by U.S. residents who make calls only within the country) that was relevant to very few of its callers. With some editing to protect the anonymity of the company, the prompt said something very similar to: "There could be a charge associated with using an access number if you call from the U.S. to another country such as, but not limited to, the former Soviet Union, Croatia, and Albania. Now, please tell me the name of the city and state in which you live." Of course, disclaimers such as this are sometimes required by law, but they should be played only when necessary, not to every person who calls in. A disclaimer like this one is long and distracting, and it adds time to the call, which means higher costs for the company paying for the toll-free line. A disclaimer is also ineffective if people do not listen to it. It is therefore in the company's best interest to let the designer work with its legal team on any required prompts, so that they are concise and intelligible, give value to the listener, and still cover the company's liability.
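As a rough illustration of playing a disclaimer only when it applies, here is a minimal Python sketch. The function name, the calls_internationally flag, and the shortened prompt wording are all hypothetical. A real system would take that flag from whatever it already knows about the caller.

```python
# Hypothetical prompt text, shortened from the example above.
DISCLAIMER = (
    "There could be a charge associated with using an access number "
    "if you call from the U.S. to another country."
)
CITY_PROMPT = "Please tell me the name of the city and state in which you live."


def opening_prompts(calls_internationally: bool) -> list[str]:
    """Return the prompts to play, in order, at the start of a call.

    `calls_internationally` is an assumed flag (for example, from an
    account profile); it is not part of any real speech platform's API.
    """
    prompts = []
    if calls_internationally:
        # Only callers the disclaimer applies to have to sit through it.
        prompts.append(DISCLAIMER)
    prompts.append(CITY_PROMPT)
    return prompts


if __name__ == "__main__":
    for prompt in opening_prompts(calls_internationally=False):
        print(prompt)
```

In this sketch a domestic-only caller hears just the city-and-state question, while a caller flagged as calling abroad hears the disclaimer first.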

Ambiguity in Language

In today's fast-paced world, we all use ambiguous or imprecise language as a kind of shorthand to save time and effort. Imprecise language causes the least negative impact in situations where a context can be clearly established and there is a high bandwidth of information exchange. An example is in face-to-face situations with people we know, where prior knowledge, body language, and vocal cues can fill in the blanks and add meaning to the words.

But it's a different story when one or more of those additional communication components—prior knowledge, body language, and vocal cues—are absent. The most extreme example is e-mail from an unknown person, where all three are absent. As every e-mailer knows, the intended meaning of written messages can easily be lost or misconstrued—particularly sarcasm and other forms of humor—which can lead to misunderstandings. Many e-mailers try to avoid this problem by adding emoticons and acronyms to their messages to help clarify their meaning:

"I know we hung out this morning but it feels like forever. :-) "

Here the colon, dash, and close parenthesis together look like a smiley face turned 90 degrees counterclockwise.

Speech-recognition systems have the tremendous advantage of audio—which can include vocal cues, sound effects, and music—to add meaning to their outgoing communications. But because speech systems require clear answers from callers in order to function properly, it's essential that all prompts be as concise and unambiguous as possible.

That means we must carefully consider the language we use to convey ideas, to the point of avoiding language that could be not only misconstrued but also misheard. Looking at how people talk to each other shows the problem. I was at a family gathering recently and overheard the following exchange among my Uncle Rob, my Aunt Bobbie, and my 22-year-old cousin Paul.


Uncle Rob: Hey, Bobbie, I just found out Paul isn't dating a woman six years older than him. She's only six days older than him.

Aunt Bobbie: Well, that's certainly a relief.

Paul: Uh, guys? You're both wrong. She's actually six months older than I am.

The misunderstanding arose because my uncle misheard the age difference and remembered only the "six." Had my cousin said, "Yeah, she's half a year older than I am," there would have been less chance for confusion (and alarm on my aunt's part!). Using the phrase "half a year" instead of "six months" limits the number of ways a listener could reasonably mishear the sentence. And none of the plausible mishearings, such as "half a day" or "half a week," would be likely to cause alarm, even to my Aunt Bobbie.

We can do the same thing when we write prompts for a speech-recognition system: understand and avoid the ambiguities of real-world conversations, and steer clear of language that could be misheard or misconstrued. Some systems try to avoid errors by repeating information back to the caller. This is a good idea when an answer needs to be precise, as with a stock trade, but it is unnecessary when a precise answer is not essential or when the answer can easily be changed later on.
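The point about repeating information back only when precision matters can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular platform: the confirmation_prompt function, the needs_precision flag, and the read-back wording are all hypothetical.

```python
from typing import Optional


def confirmation_prompt(value: str, needs_precision: bool) -> Optional[str]:
    """Return a read-back prompt, or None when no confirmation is needed.

    `needs_precision` is a hypothetical flag a designer would set per
    question: True for something like a stock trade, False for answers
    that are not critical or can easily be changed later.
    """
    if needs_precision:
        return f"I heard {value}. Is that correct?"
    return None


if __name__ == "__main__":
    print(confirmation_prompt("buy 100 shares of Acme", needs_precision=True))
    print(confirmation_prompt("jazz", needs_precision=False))
```

Keeping the decision in one flag per question lets the designer add a read-back where a mistake would be costly without lengthening every other exchange in the call.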
