
Key Considerations In Designing A Conversational User Interface

Start Here If You are Thinking Of Creating A Chatbot

How Do We Humans Do It?

The conversation you are having with a computer must not feel weird or awkward.

But also, it must not disrupt the patterns of human behavior which have evolved over time.

It is experience, rather than understanding, that influences behavior ~ Marshall McLuhan.

Rather, your conversational interface must adapt to the way of communicating we all use and know best.

We find conversation in general very intuitive and frictionless; hence conversational interfaces must follow suit.

Be Mindful Of Technical Impediments

In most respects, Conversational AI and existing software frameworks are inferior to what we as humans are capable of. These technical limitations should be catered for during your design and build phases. For instance, human conversations do not come to an abrupt end because of an unrecoverable system or dialog error.

Some Development Frameworks Can Accommodate Digression. Others Not.

In reality, we as humans don’t abruptly reply with “I cannot help you with that” and dismiss the conversation. Or at least, we ought not to.

Hence, neither should your conversational interface.

As humans we explore related topics or ideas during a discourse, in an attempt to detect intent and establish common ground of sorts. Your software should emulate this as far as possible.
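One way to sketch this in code is a fallback that, instead of dismissing the user, offers nearby topics when no intent scores confidently. The intent names, scores, and threshold below are illustrative assumptions, not any framework’s API:

```python
# A minimal sketch: on low-confidence classification, suggest near-miss
# intents instead of ending with "I cannot help you with that".
# Threshold values and intent names are illustrative assumptions.

def fallback_response(scored_intents, threshold=0.6):
    """Build a clarifying prompt from near-miss intents rather than a dead end."""
    near_misses = [name for name, score in scored_intents
                   if threshold * 0.5 <= score < threshold]
    if not near_misses:
        return "Could you rephrase that for me?"
    options = ", ".join(near_misses)
    return f"I'm not sure I follow. Did you mean one of: {options}?"
```

The key design choice is that the bot keeps the conversation alive by proposing related ground, mirroring what a human would do.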

Play To your Technical Strengths

The inverse is also true: there are specific instances where computers exceed our human cognitive capabilities.

Sourced From Google: Different Modalities Present Different Strengths And Affordances

Watching our children interact with Google Home and Alexa, I notice that:

  • Conversational User Interfaces do not get annoyed and irritated by repetitive and simple questions.
  • They are not offended by receiving commands and requests all the time.
  • The conversation from the device or interface does not have all the filler sounds of “uhm”, and “aaaah” etc.
  • Information is imparted quickly and fairly accurately.

If you streamline the script of the Conversational Interface, you will find many opportunities to avoid user annoyance.


When we take turns to speak, also referred to as dialog turns, interrupting each other is avoided and the conversation is generally synchronized. This is our way as humans to manage the state of the conversation. As humans we do this intuitively and effortlessly.

Amazon Echo Alexa: The Light Acts As A Conversational Cue For The User

Google, in their Voice User Interface design principles, describe it as follows:

Turn-taking is about who “has the mic”: taking the mic, holding the mic, and handing it over to another speaker. To manage this complex process, we rely on a rich inventory of cues embedded in sentence structure, intonation, eye gaze, and body language.

Unique Voices Facilitates Creation Of A Persona

Your Conversational Interface will not have access to all these rich human nuances and cues, but there are elements which can be employed. For instance, silence from the user usually indicates a readiness to cede the dialog turn.

Within the chatbot or voicebot script, you can use syntax and/or tone to signal to the user that the interface is ready to receive input. Let your interface ask a question; this is the clearest way to signal a dialog turn in a natural way.
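These two cues can be sketched in a few lines: ensure every prompt hands the mic over by ending in a question, and treat a stretch of silence as the user ceding the turn. The fallback question text and the timeout value are illustrative assumptions:

```python
# A sketch of two script-level turn-taking cues:
# 1) prompts end in a question, so the user knows it is their turn;
# 2) a silence timeout, after which the bot may assume the turn is ceded.
# The fallback question and timeout value are illustrative assumptions.

def close_turn(prompt: str) -> str:
    """Make sure a prompt hands the mic over by ending in a question."""
    prompt = prompt.rstrip(".! ")
    if prompt.endswith("?"):
        return prompt
    return prompt + ". What would you like to do?"

SILENCE_TIMEOUT_SECONDS = 4  # after this, treat silence as a ceded turn
```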

Why A Persona?

You should see a persona as a design tool, one that assists in writing the conversation. Before you start writing the dialog, you need a fairly complete understanding of who is communicating with the user.

Different Languages For A Locale Independent Conversational Design

What constitutes a persona? A few elements are used, such as tone, script and personality, and you should know what your persona will do or say in any particular conversational situation.
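Treating the persona as an explicit design artifact can be as simple as a small structure holding tone, traits, and what the persona says in recurring situations. All field names and values below are illustrative assumptions:

```python
from dataclasses import dataclass, field

# A sketch of a persona as a concrete design artifact: tone, personality
# traits, and scripted lines for recurring conversational situations.
# Names and values are illustrative assumptions.

@dataclass
class Persona:
    name: str
    tone: str
    traits: list = field(default_factory=list)
    situations: dict = field(default_factory=dict)  # situation -> scripted line

    def line_for(self, situation: str) -> str:
        return self.situations.get(situation, "Let me check on that for you.")

travel_guide = Persona(
    name="Ava",
    tone="warm, concise",
    traits=["patient", "knowledgeable"],
    situations={"greeting": "Hi, I'm Ava. Where are we off to today?"},
)
```

Keeping the persona in one place like this forces the writing team to decide, up front, how the persona behaves in every situation rather than improvising per dialog node.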

Whether you like it or not, users are going to project a persona onto your interface.


Advances in automatic speech recognition (ASR) mean that we almost always know exactly what users said. ASR can now detect spoken words better than we as humans can. Speech recognition is not the challenge. The challenge is understanding: extracting meaning, intent and conversational entities.

In isolation, user utterances are hard to understand; in context, they become easier. We as humans, when struggling to understand someone’s intent, would say: “give me some more context” or, “but in what context?”

Follow-up Intents

Your conversational interface needs to keep track of context in order to understand follow-up intents.

Follow-Up Mode Avails User Freedom

Unless the user changes the subject, we can assume that the thread of conversation continues. This allows follow-up intents to be detected with greater ease in the customer conversation.
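The idea can be sketched as a small context object: a new intent changes the subject, while a follow-up utterance only refines the entities already gathered. The structure and field names are illustrative assumptions, not any particular framework’s context API:

```python
# A sketch of context tracking so a follow-up like "and for Tuesday?"
# inherits the active intent and previously captured entities.
# Structure and names are illustrative assumptions.

class ConversationContext:
    def __init__(self):
        self.active_intent = None
        self.entities = {}

    def update(self, intent=None, entities=None):
        if intent:                 # a new intent changes the subject
            self.active_intent = intent
        if entities:               # follow-ups only refine the entities
            self.entities.update(entities)

ctx = ConversationContext()
ctx.update(intent="book_flight", entities={"destination": "Lisbon"})
ctx.update(entities={"date": "Tuesday"})   # follow-up: intent carries over
```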

One of the harder things in writing a conversational interface is making provision for digression.

This is when the user moves from one context or conversational thread to another. Read more about digression here.


A lack of variation makes the interaction feel monotonous or robotic. It might take some programmatic effort to introduce variation, but it is important.

Many development frameworks have functionality that allows you to easily randomize your bot’s output, or at least cycle through a sequence of utterances to break the monotony.
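If your framework lacks this, a few lines of your own code suffice. The variant phrasings below are illustrative:

```python
import random

# A sketch of randomizing a bot's phrasing to break monotony.
# The variant list is illustrative.

CONFIRMATIONS = [
    "Got it.",
    "Sure thing.",
    "Okay, done.",
]

def confirm() -> str:
    """Pick a random confirmation so repeated turns don't sound identical."""
    return random.choice(CONFIRMATIONS)
```

A step up from pure randomness is to avoid repeating the previous choice, which guarantees consecutive turns never sound the same.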

Users Are Generally Informative

Users are usually very cooperative, and this being a conversational interface, users are bound to supply more information than you might expect. You will need to handle quite verbose user dialogs, especially at the start of the conversation.

You can mitigate this risk by adding an initial high-level first NLP pass. You can read more about this here.

From here you will want to detect intent; there might be multiple intents you need to tackle.
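A high-level first pass over a verbose opening utterance can be as crude as keyword matching, just to flag that more than one intent may be present before routing to finer-grained NLU. The keyword map below is an illustrative assumption; a production bot would use a trained classifier:

```python
# A sketch of a lightweight first NLP pass over a verbose utterance,
# surfacing every candidate intent rather than forcing a single match.
# The keyword map is an illustrative assumption.

INTENT_KEYWORDS = {
    "book_flight": {"book", "flight", "fly"},
    "change_booking": {"change", "reschedule", "move"},
    "baggage_info": {"baggage", "luggage", "bag"},
}

def first_pass_intents(utterance: str):
    """Return all intents whose keywords appear in the utterance."""
    words = set(utterance.lower().split())
    return [intent for intent, keys in INTENT_KEYWORDS.items() if words & keys]
```

Returning a list, not a single label, is the point: the dialog manager can then confirm or sequence the detected intents with the user.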

And then there are the entities; for instance, in the case of a travelbot, entities will be cities of departure and arrival, dates, times, airlines etc.

Advanced entity detection is a must. Read more about this here.
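As a minimal illustration of the travelbot case above, entity extraction can be sketched with a city lookup and a date pattern. The city list and regex are illustrative assumptions; real frameworks provide trained entity recognizers that go far beyond this:

```python
import re

# A sketch of pattern-based entity extraction for a travelbot.
# The city list and date pattern are illustrative assumptions.

KNOWN_CITIES = {"london", "paris", "new york"}
DATE_PATTERN = re.compile(
    r"\b\d{1,2} (january|february|march|april|may|june|july|august|"
    r"september|october|november|december)\b",
    re.IGNORECASE,
)

def extract_entities(utterance: str) -> dict:
    """Pull known cities and simple dates out of a travel request."""
    text = utterance.lower()
    return {
        "cities": sorted(c for c in KNOWN_CITIES if c in text),
        "dates": [m.group(0) for m in DATE_PATTERN.finditer(text)],
    }
```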

Keep The Dialog On Track

Your conversational UI will be domain specific, hence you will need to manage the dialog in a subtle way to ensure users understand the purpose and aim of the interface. You might not always be able to handle all cooperative responses from a user. But you should always be able to use lightweight and conversational exception handling to get the dialog back on track in a way that doesn’t draw attention to the error.
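Lightweight exception handling of this kind can be sketched as gentle re-prompting: each failed parse gets a conversational nudge, and the bot eventually restates its open question rather than announcing an error. The prompt texts are illustrative assumptions:

```python
# A sketch of conversational error recovery: escalate gently through
# re-prompts on repeated parse failures, never surfacing a raw "error".
# Prompt wording is an illustrative assumption.

REPROMPTS = [
    "Sorry, which city was that?",
    "Just to be sure I have it right: where are you flying to?",
]

def recover(attempt: int, open_question: str) -> str:
    """Return the next re-prompt, falling back to restating the open question."""
    if attempt < len(REPROMPTS):
        return REPROMPTS[attempt]
    return open_question  # restate the task, or hand over to a human
```

Note that none of the strings mention failure; the user simply experiences a polite clarifying question, which keeps the dialog on track without drawing attention to the error.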

Telephone Based Conversational Interface ~ VoiceBot

Move The Conversation Forward

We have all had conversations with bots that are sticky, repetitive, rude or plain unhelpful. You expect your users to be cooperative and informative, and your bot must be the same: always offering dialog that is intentional and helpful in moving the conversation forward to a conclusion.

Stick To Your Domain

In any conversation, saying too little or too much is equally uncooperative. Facilitate your bot’s comprehension by using the script to keep the user’s responses brief and concise, with optimal relevance to the current context.



Cobus Greyling

Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; NLP/NLU/LLM, Chat/Voicebots, CCAI.