Evaluating The Amazon Lex Version 2 Chatbot Framework
What Changed In Version Two & How Can Lex Improve
Introduction
Amazon Lex V2 did not feature in the CDI Conversational AI Platforms 2021 Vendor Assessment.
In the Gartner Magic Quadrant for Enterprise Conversational AI Platforms report, Amazon Lex V2 (Lex) is shown as a niche player, and does not fall into the category of challenger, leader or visionary.
Further, Lex is rated as lacking in both ability to execute and completeness of vision. Looking at what Lex has to offer, this description is very accurate.
AWS has an array of related services which can support Lex, including Amazon Connect, Amazon Kendra, Amazon Polly, Amazon Translate, Amazon Comprehend and Amazon CloudWatch.
Amazon Echo focusses on voice via a dedicated speech input device, whilst Lex is more conversation-focussed, with voice extensions (TTS, STT) and integration to channels.
The available mediums/channels for deployment are few in number and pale in comparison to platforms like Microsoft, IBM Watson Assistant, Cognigy, etc.
Observations & Cautions Regarding Lex
- Lex will most probably not be a go-to Conversational AI solution in and of itself. Rather, existing products in AWS that need to extend into a Conversational AI interface will make use of it from a cost and convenience perspective.
- Gartner listed AWS innovation as a strength in opting for Lex, together with geographic presence and regional availability zones. However, the inherent lack of conversation and development affordances within Lex is hard to remedy through AWS’ prowess as a cloud entity.
- Geographic and regional availability zones are common with cloud solutions and many of the Gartner leaders offer private cloud installations.
- With AWS’ viability, global presence, reputation, market visibility, etc., it is hard to understand why more has not been done to improve Lex.
- Lex does not have equivalent capability compared to most conversational platforms.
- Lex might look like a no-code environment, but to scale, dialog flow and state management will have to be implemented in pro-code Lambda functions, or something equivalent.
- Language options for Lex are limited and no universal language option is available.
- Lex has no dialog management environment, and fine-tuning is non-existent. Elements like digression and disambiguation menus are absent. Scaling any implementation of Lex will be difficult.
- One cannot help but get the feeling that Lex is a poor subset of Amazon Alexa features repurposed for a chatbot.
- With the strong focus on slots (analogous to entities) and four design elements dedicated to it (as seen below), it seems like Lex has an e-commerce fulfilment focus more than a customer care focus.
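To make the pro-code point concrete, here is a minimal sketch of a Lambda dialog code hook that takes over one small piece of dialog management: validating a slot and re-asking with custom wording. The intent and slot names (OrderCoffee, Size) are hypothetical, and the event and response layout follows the Lex V2 Lambda format as I understand it, so treat it as an illustration to be checked against the AWS documentation rather than a reference implementation.

```python
# Hypothetical slot values for a hypothetical "OrderCoffee" intent.
VALID_SIZES = {"small", "medium", "large"}


def slot_value(slots, name):
    """Return the interpreted value of a slot, or None if it is not filled."""
    slot = (slots or {}).get(name)
    if not slot or not slot.get("value"):
        return None
    return slot["value"].get("interpretedValue")


def lambda_handler(event, context):
    # Assumed event layout: the active intent sits under sessionState.intent.
    intent = event["sessionState"]["intent"]
    slots = intent.get("slots") or {}
    size = slot_value(slots, "Size")

    if size is not None and size.lower() not in VALID_SIZES:
        # Custom validation: clear the slot and ask again with our own wording.
        slots["Size"] = None
        intent["slots"] = slots
        return {
            "sessionState": {
                "dialogAction": {"type": "ElicitSlot", "slotToElicit": "Size"},
                "intent": intent,
            },
            "messages": [
                {
                    "contentType": "PlainText",
                    "content": f"Sorry, we do not offer '{size}'. "
                               "Small, medium or large?",
                }
            ],
        }

    # Otherwise hand control back to Lex, which follows the configured slot order.
    return {
        "sessionState": {
            "dialogAction": {"type": "Delegate"},
            "intent": intent,
        }
    }
```

Even this small amount of control lives outside the console, which is the crux of the scaling concern: every additional rule becomes more code to write, test and deploy.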
Amazon Lex Architecture Breakdown
The Lex development and management console consists of version management, languages (though limited), intents and slots (slots are analogous to entities), and deployment management.
Existing affordances for conversational development are intents, context (more about this later) and slots.
Confirmation prompts, decline responses, fulfilment and closing response are all affordances to ensure slots/entities are captured and confirmed correct.
The one element that is missing is dialog state/flow development and management. In this sense, Lex is quite reminiscent of Google Dialogflow ES.
In January 2020 I wrote the following on Google Dialogflow ES:
Managing the dialog will become complex and an issue, if you are using Dialogflow for Google Assistant it will suffice, and it is actually very convenient. However, the moment you start building more complex dialog structures, you will have to facilitate state management and context management via another environment.
This holds true for Lex, and there are two possible approaches for future iterations of Lex:
- Pro-code: Position Lex as a pro-code dialog state development and management framework; with guided steps on how to leverage Lambda functions for advanced dialog fine-tuning like digression, disambiguation, compound intents, etc.
- No-Code: Create a dialog design and development canvas for highly scalable conversational experiences, much as Dialogflow ES graduated to Dialogflow CX.
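As a sketch of what the pro-code route could look like for disambiguation, the Lambda function below compares the confidence scores of the candidate intents and, when the top candidates are close, asks the user to choose. Several things here are assumptions: that the event carries an `interpretations` list, that `nluConfidence` arrives either as a number or as an object with a `score` field (both are handled), that the fallback intent is named `FallbackIntent`, and the `margin` and `limit` values, which are purely illustrative.

```python
def confidence(interpretation):
    """Read the NLU score whether it arrives as a number or as {"score": n}."""
    raw = interpretation.get("nluConfidence")
    if isinstance(raw, dict):
        raw = raw.get("score")
    return float(raw) if raw is not None else 0.0


def disambiguation_candidates(event, margin=0.15, limit=3):
    """Return intent names that score within `margin` of the top interpretation."""
    ranked = sorted(
        (i for i in event.get("interpretations", [])
         if i["intent"]["name"] != "FallbackIntent"),  # assumed fallback name
        key=confidence,
        reverse=True,
    )
    if len(ranked) < 2:
        return []
    top = confidence(ranked[0])
    close = [i["intent"]["name"] for i in ranked if top - confidence(i) <= margin]
    return close[:limit] if len(close) > 1 else []


def lambda_handler(event, context):
    candidates = disambiguation_candidates(event)
    if candidates:
        # Ambiguous input: present a simple disambiguation menu.
        options = ", ".join(candidates)
        return {
            "sessionState": {"dialogAction": {"type": "ElicitIntent"}},
            "messages": [{
                "contentType": "PlainText",
                "content": f"Did you mean one of these: {options}?",
            }],
        }
    # Clear winner: let Lex continue with the intent it recognised.
    return {
        "sessionState": {
            "dialogAction": {"type": "Delegate"},
            "intent": event["sessionState"]["intent"],
        }
    }
```

A production bot would map intent names to friendly labels and remember the offered options, which again has to be hand-built. That is the kind of guidance a pro-code positioning of Lex would need to supply.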
An Amazon Lex bot is really constituted by a collection of intents, and the conversation is structured around intents. This is a far cry from environments where multiple intents are serviced in a single flow.
A framework like Kore AI has something they call traits, which is an overarching concept, and could be described as a theme of a conversation, or section of a conversation.
Dialog Design & Development
As intents and sequences of slot filling are defined, this view of the conversational flow is built.
It is intuitive to want to click on the speech bubbles to change words or messages, or to drag and drop conversational elements. This is not possible though. The conversational flow view is not an interactive environment, but a result or representation of the intents and slots defined.
As seen below, the ordering of slots plays a crucial role in how the conversation flows and is sequenced.
This comes back to the idea that Lex is focussed on fulfilment of online orders and e-commerce, and is not a conversational experience builder per se.
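The effect of slot ordering can be illustrated with a small simulation. This is not Lex code, only a sketch of the behaviour described above, under the assumption that the bot always prompts for the unfilled slot that comes first in the configured order. The slot names and priority numbers are hypothetical.

```python
def next_slot_to_elicit(slot_priorities, filled_slots):
    """Pick the unfilled slot with the lowest priority number."""
    for name, _priority in sorted(slot_priorities.items(), key=lambda kv: kv[1]):
        if filled_slots.get(name) is None:
            return name
    return None  # every slot is filled; the intent can move to confirmation


# Hypothetical slots for a hypothetical flower-ordering intent.
priorities = {"FlowerType": 1, "PickupDate": 2, "PickupTime": 3}
filled = {"PickupDate": "2022-05-01"}

print(next_slot_to_elicit(priorities, filled))  # FlowerType
```

Changing the priority numbers changes the order of the questions, and with it the whole shape of the conversation. The flow is a by-product of the slot configuration rather than something designed directly.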
Intent Detection
Lex has an option to automatically generate intents from uploaded conversational transcripts. Currently only English is available. This is analogous to functionality in the LLM platform co:here, in Nuance Mix and in a tool like HumanFirst.
The challenge with Lex's approach is that you need to provide 10,000 lines of transcripts, which is an enormous amount. Apart from that, a CSV or sentence-based text file will not suffice; a specific JSON format is mandated.
All of these are debilitating compared to the open approach other platforms take to this feature.
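In practice this means existing transcripts first have to be converted. The sketch below turns simple `role,utterance` CSV rows into a JSON document. The field names (`Participants`, `Transcript`, `ContentMetadata` and so on) and the version string are assumptions based on my recollection of the Contact Lens style layout, not a verified schema, so check them against the Lex documentation before relying on them.

```python
import csv
import io
import json
import uuid


def csv_to_transcript(csv_text, contact_id):
    """Convert rows of `role,utterance` into one transcript document (dict)."""
    roles = {"agent": "AGENT", "customer": "CUSTOMER"}
    turns = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        role, utterance = row[0], row[1]
        turns.append({
            "ParticipantId": roles[role.strip().lower()],
            "Id": str(uuid.uuid4()),
            "Content": utterance.strip(),
        })
    # Assumed envelope: field names and version are not taken from the article.
    return {
        "Version": "1.1.0",
        "Participants": [
            {"ParticipantId": "AGENT", "ParticipantRole": "AGENT"},
            {"ParticipantId": "CUSTOMER", "ParticipantRole": "CUSTOMER"},
        ],
        "ContentMetadata": {"Output": "Raw"},
        "CustomerMetadata": {"ContactId": contact_id},
        "Transcript": turns,
    }


sample = 'customer,"Hi, I want to order flowers"\nagent,Sure. Which kind?\n'
print(json.dumps(csv_to_transcript(sample, "contact-001"), indent=2))
```

Each conversation becomes its own JSON document, so reaching the required volume means running a conversion like this across a very large archive of conversations.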
Conclusion
- There is a big need for automation of voice calls to customer support call centres, and these are not focussed on specific speech input devices like Alexa. Google Dialogflow CX is a case in point.
- Conversations in general, and chatbot conversations in particular, have moved past the command-and-control scenario. Intuitive, multi-turn conversations are taking place, which demand environments that scale well and allow fine-tuning.
- Intents are being enriched and augmented with traits, hierarchy, nested-intents, follow-up and confirmation intents and the like. Lex has no structure added to intents.
- In other frameworks, a single intent can be used to branch off into different conversations.
- Entities are complex conversational elements. Other frameworks are adding structure to entities, such as machine learning entities, nested entities and more. This is lacking in Lex.
- Frameworks are placing much emphasis on web chat, especially IBM Watson Assistant.