Designing Voice User Interfaces vs. Chatbots

Bots, whether they talk back to you or text back to you, should provide a consistently pleasant experience that helps you get tasks done. However, designing these two types of bots, known as voice user interfaces and chatbots respectively, requires different strategies. In this article, we break down the factors to consider when creating a seamless user experience.



1. Local Lingo

For both voice and text, you must account for the vocabulary used by your intended users. For example, Australian users may greet the bot with “mate,” while North American users may either skip addressing the bot altogether or say “dude” in a casual setting.


Pronunciations

With voice, you have to account for different pronunciations that Amazon Alexa or Google Home may or may not pick up. We ran into this issue while designing a bank bot for customers of Mexican banks.

Alexa had trouble picking up terms like CEDES, a common acronym in Mexico for certificates of deposit, even with an American pronunciation. Spelling out C-E-D-E-S often failed as well.

After bouncing around a couple of ideas, we had Alexa call it “certificates of deposit” when offering an investment update, so that the user would pick up the term and use it when asking.
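A complementary safeguard is to map whatever the recognizer does return onto one canonical term. Here is a minimal Python sketch of that idea. It is not the approach described above: the variant spellings (including the misheard "sedes") and the `normalize_term` helper are hypothetical, chosen only for illustration.

```python
import re
from typing import Optional

# Hypothetical variants a speech recognizer might return for "CEDES".
# These are illustrative guesses, not real recognizer output.
CANONICAL_TERMS = {
    "certificates of deposit": [
        "certificates of deposit",
        "certificate of deposit",
        "cedes",
        "c e d e s",
        "sedes",
    ],
}


def normalize_term(transcript: str) -> Optional[str]:
    """Return the canonical term mentioned in the transcript, or None."""
    # Lowercase and turn punctuation into spaces so that
    # "C-E-D-E-S" and "c e d e s" look the same.
    cleaned = re.sub(r"[^a-z0-9]+", " ", transcript.lower()).strip()
    padded = f" {cleaned} "
    for canonical, variants in CANONICAL_TERMS.items():
        # Pad with spaces so we only match whole words or phrases.
        if any(f" {variant} " in padded for variant in variants):
            return canonical
    return None
```

With a table like this, the bot can keep saying "certificates of deposit" out loud while still accepting the acronym when the recognizer happens to catch it.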

2. Visual Aids (Pictures, Videos, GIFs)

You don’t realize how much of a luxury it is to be able to use pictures and videos until you start designing voice bots.

Imagine this: someone is in Barcelona for a business trip, and they ask the Alexa in their office to tell them the best cafe to meet a client for coffee. To account for how long it would take for them to get to their next meeting across town, they ask how close the cafe is to their next meeting spot.

If they were using a web chatbot, it could show them directions from point A to B in Google Maps.


If they ask Alexa, the best solution may be to tell them the estimated travel time and offer to send directions to their phone. Having Alexa recite which street to turn left on would be useless, as they wouldn’t retain any of that information.

Here, the designer’s job is to determine the most pertinent information to provide through speech, and the best way to communicate other necessary information.
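To make the split concrete, here is a small Python sketch of one way to shape the same answer per channel. The `BotReply` type, the `directions_reply` function, and the channel names are all assumptions made up for this example, not part of any assistant SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BotReply:
    """A channel-neutral reply; every field name here is hypothetical."""
    speech: Optional[str] = None      # what a voice assistant says aloud
    text: Optional[str] = None        # what a chatbot prints
    map_url: Optional[str] = None     # rich content only a screen can show
    follow_up: Optional[str] = None   # offer to hand off the details


def directions_reply(channel: str, place: str, minutes: int, map_url: str) -> BotReply:
    if channel == "voice":
        # Speak only what the listener can retain, then offer a hand-off.
        return BotReply(
            speech=f"{place} is about {minutes} minutes from your next meeting.",
            follow_up="Want me to send directions to your phone?",
        )
    # On a text channel, the map carries the turn-by-turn detail.
    return BotReply(
        text=f"{place} is about {minutes} minutes from your next meeting. Here is the route:",
        map_url=map_url,
    )
```

The voice branch deliberately drops the map: the estimated travel time is the pertinent part, and the rest moves to a device with a screen.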

3. Dialogue Placement

This one’s the most intuitive. When you say something matters in both mediums, but with voice, it takes slightly more effort for a user to ask for the information again if they didn’t understand it.

If someone chatting with a text-based bot doesn’t understand something, they can go back and re-read what the bot said without asking the bot to repeat it. With a voice bot, they could ask Alexa to repeat, but that’s not as automatic a reaction as moving their eyeballs!

4. Bot Personality

Both voice and text bots must consider certain parts of the bot persona, such as the level of formality and the kind of vocabulary (layman’s terms or industry lingo?).

However, tone is important for voice bots in more than one respect. Aside from formality, the pitch and volume of the voice shape the impression users will have.

Voice actors are a great option because:

  1. People get sick of hearing Alexa’s voice over and over again across the different skills they use.

  2. The voice can be customized to align with the brand image.

For example, a women’s clothing brand focused on colorful, feminine styles could create a friendly female personal shopper with a slightly higher pitch than the standard Alexa voice. They can match the tone to how their store clerks or phone customer service representatives would address customers.
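If you stay with a synthesized voice rather than a voice actor, pitch and volume can still be nudged with SSML, the speech markup that voice platforms commonly accept. The Python sketch below wraps a line in a `<prosody>` element. The `with_prosody` helper is hypothetical, the allowed values follow the SSML labels for pitch and volume, and each platform supports its own subset, so check its SSML reference before relying on this.

```python
from xml.sax.saxutils import escape

# Named levels defined by SSML; individual platforms may support a subset.
PITCH_LEVELS = {"x-low", "low", "medium", "high", "x-high"}
VOLUME_LEVELS = {"silent", "x-soft", "soft", "medium", "loud", "x-loud"}


def with_prosody(text: str, pitch: str = "medium", volume: str = "medium") -> str:
    """Wrap text in SSML that sets pitch and volume (illustrative helper)."""
    if pitch not in PITCH_LEVELS:
        raise ValueError(f"unsupported pitch: {pitch}")
    if volume not in VOLUME_LEVELS:
        raise ValueError(f"unsupported volume: {volume}")
    # Escape &, < and > so the script text cannot break the markup.
    return (
        f'<speak><prosody pitch="{pitch}" volume="{volume}">'
        f"{escape(text)}</prosody></speak>"
    )
```

For the personal shopper above, something like `with_prosody("Welcome back!", pitch="high")` would be a starting point to tune by ear.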

5. Time to Completion

With voice, you have to be more descriptive in a shorter amount of time.

The challenge: voice has less time to get across more information because you can’t rely on visuals available for a text-enabled bot.

This is where learning to write succinctly will help. Many concepts of good writing apply to writing scripts for voice assistants.
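One rough way to check whether a script is succinct enough is to estimate how long it takes to say. The Python sketch below assumes a speaking rate of 150 words per minute and an 8-second budget per response. Both numbers are placeholders to tune against your actual voice, and the function names are made up for this example.

```python
def estimated_seconds(script: str, words_per_minute: int = 150) -> float:
    """Estimate speaking time from word count (150 wpm is an assumed rate)."""
    word_count = len(script.split())
    return word_count * 60 / words_per_minute


def fits_time_budget(script: str, max_seconds: float = 8.0) -> bool:
    """Return True if the script should fit the (assumed) time budget."""
    return estimated_seconds(script) <= max_seconds
```

A check like this will not judge clarity, but it flags responses that are likely to run long before anyone has to listen to them.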

This article on writing for impact applies to voice user interfaces:

  1. Use fewer, stronger adjectives.

  2. Write like you’re talking to a friend.

Regardless of whether your bot persona is casual or formal, you should first practice by writing in a casual tone — like you’re talking to a friend.

We’re often much more straight to the point when we talk to a friend because we’re more comfortable than when talking to, let’s say, our managers.

It’s easier and more effective to change your script from a casual to a formal tone than the other way around, because drafting casually lets you write without filtering for appropriateness. Later, you can change the style of speech while retaining the important points.

When designing an enterprise bot that requires more formal language, simply adjust the language during the refinement stage.

6. NLP Training to Understand Different Speaking Styles

Your users are humans, not bots. As humans, we sometimes forget what something’s called, use fillers like “um,” and talk in a roundabout way. It’s worth training the bot to understand users who say these fillers before words that people typically stumble over. Using tools such as Microsoft LUIS and Google Dialogflow, you can also train the bot to respond to the different ways someone may ask for the same thing.
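To show the idea in miniature, here is a toy Python sketch that strips fillers and then matches the cleaned utterance against several example phrasings per intent. It is not how LUIS or Dialogflow work internally. The intent names, example phrasings, filler list, and 0.8 similarity cutoff are all assumptions for illustration.

```python
import difflib
import re
from typing import Optional

# Hypothetical filler list; "um", "umm", "uh", "er", "hmm", "you know".
FILLERS = re.compile(r"\b(um+|uh+|er+|hmm+|you know)\b[,.]?\s*", re.IGNORECASE)

# Several ways of asking for the same thing, grouped by a made-up intent name.
INTENT_EXAMPLES = {
    "check_balance": [
        "what's my balance",
        "how much money do i have",
        "show me my balance",
    ],
    "transfer_money": [
        "transfer money",
        "send money to a friend",
    ],
}


def clean(utterance: str) -> str:
    """Drop fillers, punctuation, and extra spaces; lowercase the rest."""
    without_fillers = FILLERS.sub("", utterance)
    letters_only = re.sub(r"[^a-z0-9' ]+", " ", without_fillers.lower())
    return " ".join(letters_only.split())


def match_intent(utterance: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the intent whose example is closest to the utterance, if any."""
    example_to_intent = {
        example: intent
        for intent, examples in INTENT_EXAMPLES.items()
        for example in examples
    }
    best = difflib.get_close_matches(
        clean(utterance), list(example_to_intent), n=1, cutoff=cutoff
    )
    return example_to_intent[best[0]] if best else None
```

A trained NLP service generalizes far beyond string similarity, but the workflow is the same: collect the many ways real users phrase a request and attach them all to one intent.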

Customer satisfaction can be achieved when a bot strategy is carefully developed. At Wizeline, we define chatbot objectives, identify the software integrations that work best, build types of experience, and design intents and flows. After validating the chatbot experience with clients, we get them involved in testing to check whether we need to add NLP, edit the message copy, shift the actions triggered when certain messages are displayed, and so on. We also make sure to determine a marketing strategy for launching the bot so that it is discoverable.