Wizeline’s Conversation Designer takes us through designing human-machine interactions based on human-human interviews.
“For voice recognition technology, grasping all the necessary contextual factors and assumptions in this brief exchange is next to impossible.” Ditte Mortensen, UX Researcher
Why start off with this dire statement?
This sets the expectation that designing bots requires you to embrace the pursuit, but not the expectation, of perfection. Because bots cannot retain context like humans naturally can, we must ask the right questions at the right time and be transparent in why we’re doing so.
Those designing the conversations we can have with a voice interface strive for:
Making conversations sound natural
Enabling users to complete an action in the fewest steps, and therefore with as little friction as possible
Asking users for the right information, and avoiding assumptions, to provide customized, relevant information
With these goals in mind, here is my check-and-balance system for designing bots:
1. Use Time Wisely.
App builders fight for space on mobile. Alexa skill builders fight for time: a person’s attention span while listening to a prompt.
This goes for both text-based and voice user interfaces. The same way space is of the utmost importance in text bots, time makes or breaks a voice experience. While we try to mimic natural conversations, we do have to keep in mind that we have less patience with bots than with human interactions.
When we see that someone is nervous, we can feel empathy for them and tell ourselves to be a bit more patient with trying to get an answer out of them. We don’t have the same lenience with technology devices.
So the bot must get to the point right away, and anecdotes that show personality should be used sparingly, in service of the bot’s purpose.
It’s the same when talking with humans, right?
There are expectations in back-and-forth conversations with humans that we apply to bots as well. For example, we don’t appreciate customer service representatives who go on and on about their personal lives or what they had for dinner when all we want is to get our router fixed. This applies to machine interactions as well, perhaps even more so. Make it clear what a user can do through the interface, provide action items, then confirm before the bot takes any decisive action.
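That confirm-before-acting step can be sketched in a few lines. This is a minimal illustration, not any particular bot framework’s API; the function name and prompt wording are assumptions:

```python
# Hypothetical sketch: always confirm before a decisive action, assuming a
# simple turn-based bot where `ask(prompt)` returns the user's reply.
def confirm_action(description, ask):
    """Ask the user to confirm before the bot commits to an action."""
    reply = ask(f"I'm about to {description}. Should I go ahead? (yes/no)")
    return reply.strip().lower() in ("yes", "y", "sure", "go ahead")

# Usage with a canned reply standing in for real user input:
approved = confirm_action("reset your router", lambda prompt: "yes")
```

The key design choice is that the bot commits to nothing until the user explicitly approves.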
2. Take advantage of sounds and conversation markers.
In some ways, designing voice can feel easier than designing for text. Designers can take advantage of the fact that they are designing for audio and the UI is simply words. Of course, there are separate factors within delivering these words: speech breaks, tone, pitch, volume, female vs. male, etc. I outline it in detail here, but here’s an example:
After a lengthy sentence, put in breaks to give the user a bit more time to comprehend what they were told or what they were asked to do.
Ex. If someone asks where they can find the verification number for their credit card to confirm a purchase, the bot must provide instructions at a speed at which the user would perform the action.
Wouldn’t you do the same when you explain a step-by-step process to someone?
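For Alexa skills, pauses like these can be expressed with SSML `<break>` tags in the spoken response. A minimal sketch, assuming instructions are read one step at a time; the step texts and pause length are invented for illustration:

```python
# Sketch: insert an SSML pause after each instruction step so the user has
# time to perform the action before the next step is spoken.
def steps_to_ssml(steps, pause="800ms"):
    """Join instruction steps, adding a pause after each one."""
    body = f'<break time="{pause}"/>'.join(steps)
    return f'<speak>{body}<break time="{pause}"/></speak>'

ssml = steps_to_ssml([
    "Flip your card over.",
    "Find the three-digit code to the right of the signature strip.",
])
```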
3. Guide the user and ask for what you need, but justify it.
This is where the designer’s role matters most: taking full advantage of the platform’s capabilities while minimizing the impact of its limitations, such as difficulty retaining the context of the conversation.
For a user asking for restaurant recommendations, the conversation may go something like this:
Here’s where the bot could run into trouble. At this point, the user may change the topic by asking how the bot makes these recommendations:
The user now has her question about how the bot works answered, so she wants to get back to food recommendations. To see what else is around, she asks,
“Well, how about Mexican restaurants?”
Uh-oh, here’s where the trouble begins. The Alexa skill did not store the fact that the person is currently in San Francisco, so it’ll ask,
“Got it, what city should I look for Mexican places in?”
The user will think:
… I just told you.
Instead, to justify why they’re asking for this and show transparency, the bot can give a quick explanation.
Yes, it’s not ideal. But it shows transparency and makes your questions reasonable.
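One way to avoid the repeated-question trap above is to store slots like the user’s city in per-session context, and to justify the question only when context is genuinely missing. A minimal sketch under that assumption; the class and slot names are illustrative, not a real framework API:

```python
# Sketch of per-session slot memory: the bot reuses the last city the user
# gave, and explains itself when it genuinely has to ask again.
class SessionContext:
    def __init__(self):
        self.slots = {}

    def remember(self, slot, value):
        self.slots[slot] = value

    def city_for_search(self, cuisine):
        city = self.slots.get("city")
        if city:
            return f"Looking for {cuisine} places in {city}."
        # Justify the question instead of asking cold.
        return ("Sorry, I can only track one topic at a time, so I lost "
                f"your location. Which city should I search for {cuisine} in?")

ctx = SessionContext()
ctx.remember("city", "San Francisco")
```

With the slot stored, the follow-up “how about Mexican restaurants?” reuses the city instead of re-asking.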
We apply these principles in designing bots because an automated system, whichever medium it’s on, needs to earn the right to be wrong.
The bot has earned this right if it has tried to clarify and ask relevant questions. It does not have the right to be wrong if it never asked the questions to help the user and it proceeds to expect the user to be understanding of repetitive questions without an explanation of how the answers to these questions will be used.
Maximizing shared knowledge is key to creating better conversations.
Without this mutual understanding, each conversation we delve into with a machine creates more opportunities for errors that waste our time. In the end, we as users have to dig ourselves out of the hole we’ve fallen into because the right questions were not asked and, thus, the wrong information was provided. As bot designers, by putting ourselves in the user’s shoes with each iteration, we can design bots that serve our needs with a higher success rate each time.
“Some of these companies need to focus at least as much on testing their products as they do drafting press releases.”
Thomas Gouritin calls the lack of focus on user experience the “AI bulls***” — specifically getting one error message after another for tasks that logically should work.
His theory for this widespread problem?
Not enough focus on testing the product before launch, and way too much on selling it. Specifically, those press releases that tout their Artificial Intelligence-powered chatbots (capitalized, mind you) that in reality are programmed questions and answers.
His commentary is caustic, but accurate.
Reserving enough time to test is beyond valuable. It is essential to avoid becoming a screenshot of a failed bot on Twitter.
The goal is to account for both predictable and unpredictable errors. For the latter, the bot does not need to have a solution, but needs to route the user to a place or person with the solution.
1. Predictable Errors
Test ideal user paths and account for unforeseen edge cases to determine impact vs. effort of changes.
Happy paths are ones in which the user is able to complete a task smoothly by providing expected answers. These paths must absolutely work, because the bot is otherwise useless.
In edge cases, a user types or says something the bot was not designed to answer. For edge cases discovered during usability testing, we need to weigh the value a fix provides against the time and effort it takes, which may compromise timely delivery.
Ex) When the bot asks the user what type of cuisine they’re looking for, the user may ask for restaurants with the least wait time. Here, the bot should tell the user that it can only give recommendations based on cuisine preference.
The user may conclude your bot is broken if you don’t explain that the feature is not supported at this time and provide a call-to-action. However, in order to tell them why something isn’t working, the bot must be trained to recognize why the error is happening. This is why training the bot to understand common user inputs is important for the overall user experience.
Otherwise, they’ll keep trying the same thing over and over again.
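A sketch of this pattern for the restaurant example: detect the unsupported request, explain the limitation, and offer a call-to-action instead of a bare fallback. The keyword list is an illustrative stand-in for a trained NLP model:

```python
# Sketch: recognize an unsupported feature (sorting by wait time) and reply
# with an explanation plus a call-to-action, not a generic fallback.
UNSUPPORTED_KEYWORDS = ("wait time", "waiting time", "how long is the wait")

def handle_input(text):
    if any(k in text.lower() for k in UNSUPPORTED_KEYWORDS):
        return ("I can't sort by wait time yet, but I can recommend "
                "restaurants by cuisine. What are you in the mood for?")
    return None  # fall through to normal intent handling
```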
The novelty surrounding chatbots makes people’s expectations much greater than what most bots can handle at the moment — and I’ve seen this first-hand in testing.
The point of usability testing is to gather insights, prioritize, then iterate; it is not to make changes to account for every finding.
2. Unpredictable Errors
Reserve time for these to arise, then prioritize again.
Then there are errors that we simply can’t predict, such as platform issues. The restaurant recommendation bot may experience issues pulling data from Yelp and return an error message. Testing should ideally be done in a specified time window, after which you prioritize solving usability issues over nice-to-have features.
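A defensive sketch for that case: wrap the external lookup so a platform failure yields a helpful message instead of a raw error. `fetch_restaurants` stands in for a real Yelp API call and is purely illustrative:

```python
# Sketch: degrade gracefully when the data source fails, rather than
# surfacing a raw platform error to the user.
def recommend(cuisine, city, fetch_restaurants):
    try:
        results = fetch_restaurants(cuisine, city)
    except Exception:
        return ("I'm having trouble reaching my restaurant data right now. "
                "Please try again in a few minutes.")
    if not results:
        return f"I couldn't find any {cuisine} places in {city}."
    return f"How about {results[0]}?"
```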
How do you train the bot to respond to these errors?
Organize a spreadsheet that categorizes each intent* expressed by the user that the bot should understand, then gather a list of common phrases that they would type or say for each intent. One intent could be “unsupported feature,” under which you’d put common features requested, but unavailable in the bot. You can then train the bot to respond with a copy explaining that the feature is not supported at this time.
Example of a spreadsheet categorizing user intents and corresponding phrases
*Intents: what we determine the user is requesting when they say something to the bot
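The spreadsheet above can be represented as plain data that an NLP platform consumes as training phrases; a naive matcher below stands in for the trained model. The intents and phrases are illustrative:

```python
# Sketch of the intent spreadsheet as data: each intent maps to common
# phrases users might type or say.
INTENT_PHRASES = {
    "find_restaurant": ["recommend a restaurant", "where should i eat"],
    "unsupported_feature": ["book a table", "show the wait time"],
}

def match_intent(text):
    """Naive substring matcher; a trained NLP model replaces this in practice."""
    lowered = text.lower()
    for intent, phrases in INTENT_PHRASES.items():
        if any(p in lowered for p in phrases):
            return intent
    return "fallback"
```

Grouping “unsupported feature” as its own intent is what lets the bot answer with an explanation instead of a generic fallback.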
How to get the right help for testing
Allow each team member to use their domain expertise and get involved in the right steps, so that you can create a prototype, deploy to a testing environment, test specific user flows, then iterate. Though you should leave time for design changes, accounting for different scenarios as much as possible from the start will cut the time required to iterate.
Let’s take this scenario: when a user types “help” while shopping for shoes, what options should the bot offer? Are they asking for “help” with navigating the bot or with the item they’re looking at, thus asking for a customer service rep?
The NLP Trainer (link to Aldo interview) would work with the Bot Designer (link to Diana interview) to come up with possible solutions:
Direct user to menu options (one of which is talking to a human agent)
Connect user to a human agent directly
Ask what exactly user needs help with, then redirect
As you may have guessed, I usually go with the third option to avoid assuming the user’s intention.
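The third option can be sketched as a small routing function; the prompt wording and the returned handler names are assumptions for illustration:

```python
# Sketch: when the user types "help", ask what they need help with before
# routing, instead of guessing their intention.
def handle_help(ask):
    choice = ask("Sure! Do you need help (1) using this bot, or "
                 "(2) with the item you're looking at?")
    lowered = choice.lower()
    if "1" in choice or "bot" in lowered:
        return "menu"          # show navigation menu options
    if "2" in choice or "item" in lowered:
        return "human_agent"   # hand off to a customer service rep
    return "clarify"           # ask again rather than assume
```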
Getting the answer to “what exactly are they likely asking for?” correct as often as possible — through usability testing and analytics — makes the bot “sticky,” encouraging users to use the bot again and again.
Like any good design process, the decisions must be collaborative and iterative.
In conclusion, you need to consider:
1. Realistic Users
Testing with the intended users informs designers of tweaks, and sometimes entire redesigns, that need to be made.
This includes users in the worst case scenario.
For chatbots, the worst-case users are actually not people who have never used the chatbot, but those who have used it and hated it.
We want to observe how those who are biased from previous experiences and those who are brand-new would use the bot.
2. Tone
The tone should be in line with how the user speaks, and avoid information fatigue.
For text-based chatbots, the designer’s job is to ensure that text fatigue doesn’t hinder task completion.
As UX Designer Eunji Seo says, “Don’t make users go TL;DR.” Generally, anything more than three lines of text is too long.
3. Handling Errors
The goal is to have as few fallback messages (“Oops, I didn’t get that!”) as possible.
As mentioned above, this requires the designer to adjust wording or change the order of messages so the conversation feels natural and helps users achieve the task quickly.
*Fallback messages: messages noting the request can’t be understood, then often followed by menu options
4. Task Completion Rate
One way to test fatigue is through the 60-second test.
“Can users perform a certain number of tasks with just one hand in under 60 seconds?”
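Scoring the 60-second test from session logs can be sketched as follows; the log format is an assumption for illustration:

```python
# Sketch: fraction of sessions where the task was completed within 60 seconds.
def sixty_second_pass_rate(sessions, limit=60.0):
    passed = sum(1 for s in sessions
                 if s["completed"] and s["seconds"] <= limit)
    return passed / len(sessions) if sessions else 0.0

sessions = [
    {"completed": True, "seconds": 42.0},
    {"completed": True, "seconds": 75.0},   # too slow
    {"completed": False, "seconds": 30.0},  # abandoned
]
rate = sixty_second_pass_rate(sessions)
```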
Customer satisfaction can be achieved when a bot strategy is carefully developed. At Wizeline, we define chatbot objectives, identify the software integrations that work best, build the right type of experience, and design intents and flows. After validating the chatbot experience with clients, we get them involved in testing to check whether we need to add NLP, edit message copy, shift actions when certain messages are displayed, etc. We also make sure to determine a marketing strategy to launch the bot so that it is discoverable. Talk to us here.