An Updated Matrix Of Conversational AI Technologies
The conversational AI landscape is becoming more complex, especially with the advent of voicebots and of implementations that demand orchestration between multiple skills. There is also an immense focus on presets, bootstrapping, industry-specific templates and the like. This post attempts to categorise products, platforms and technologies to create a simplified overview.
Introduction
Here you will find an updated matrix of Conversational AI technologies. Compiling a comparison matrix of technologies is becoming increasingly difficult, due to the ever-increasing complexity of the Language Technology landscape.
Hence I updated this matrix, in which similar technologies are grouped into categories. In a follow-up post I would like to dive into the five categories of LLMs. These categories can also be seen as implementation types.
“Any sufficiently advanced technology is indistinguishable from magic.”
~ Arthur C. Clarke
It needs to be mentioned that the advent and proliferation of voicebots have disrupted the market. Voicebots are more complex, demand higher accuracy from each step of the conversational process, and employ highly specialised technologies like TTS, STT, neural voices, etc.
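To illustrate why each step in a voicebot needs higher accuracy, here is a minimal sketch. The stage names and the 95% figures are hypothetical, and the calculation assumes that errors in the different stages are independent, which is a simplification. Under those assumptions, four chained stages at 95% each give an end-to-end accuracy of about 0.81, against about 0.90 for a text chatbot that skips the two speech stages.

```python
from math import prod

# Hypothetical per-stage accuracies for a voicebot pipeline (illustrative only).
stage_accuracy = {
    "speech_to_text": 0.95,
    "intent_recognition": 0.95,
    "dialog_management": 0.95,
    "text_to_speech": 0.95,
}


def end_to_end_accuracy(stages):
    """Multiply per-stage accuracies, assuming stage errors are independent."""
    return prod(stages.values())


# A voicebot passes through all four stages.
voice = end_to_end_accuracy(stage_accuracy)

# A text chatbot skips the two speech stages.
text_only = end_to_end_accuracy(
    {
        name: acc
        for name, acc in stage_accuracy.items()
        if name in ("intent_recognition", "dialog_management")
    }
)

print(f"voicebot: {voice:.3f}, text chatbot: {text_only:.3f}")
```

The point is only directional: every extra stage multiplies in another chance of error, so each stage has to be better for the whole conversation to feel the same.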
Conversational AI — Category 1
Category 1 is dominated by open source, more technical NLP tooling and chatbot development frameworks. It needs to be mentioned that NVIDIA Riva consists of TTS, STT and NLP/U but does not have a homegrown dialog state management system. The Riva demos leverage Google Dialogflow and Rasa Machine Learning stories. spaCy is an NLP tool, but can be extremely helpful in chatbot development. Cisco MindMeld is a complete chatbot development environment, and quite a few new features have been added to its offering. Lastly, just by considering the methods employed by DeepPavlov, many helpful principles can be gleaned.
In general, category 1 offerings can be installed anywhere and have an open architecture. They are open source, have no or limited GUI, and are configuration-file and pro-code focused. They have a higher barrier to entry and scale well. They demand astute technical planning for installation and operational management. New features can be developed and the platform enhanced.
“The means of learning are abundant, the desire to learn is scarce.”
~ Naval Ravikant
Conversational AI — Category 2
Category 2 used to be the safe bets, the de facto standard in terms of chatbot development, and the market leaders. This has changed, as is clearly illustrated in the Gartner Magic Quadrant report for Conversational AI. The category 3 platforms have largely taken the lead in innovation, especially in the areas of intent structure and management, building structure into intents, dialog flow development, etc.
Some of these frameworks are self-contained in terms of voicebot development, like Microsoft, Watson Assistant and Nuance Mix.
It needs to be mentioned that Microsoft has done stellar work in terms of their language technologies and speech enablement.
Hallmarks of category 2 platforms are that they are often used by large-scale commercial offerings, are cloud based and come from big tech companies; in some instances specific geographic regions can be selected. They are seen as safe bets for large organisations. Solutions range from pro-code to low-code and no-code, with a lower barrier to entry. There is little to no insight or control as to what happens under the hood, and little to no user influence on the product roadmap.
Conversational AI — Category 3
In category 3, it really feels as if the list of products and platforms is ever expanding. But it needs to be noted that there are six platforms in the top quadrant of Gartner: Kore AI, Cognigy, Omilia, Amelia, OneReach AI and IBM. I’ll put five of these platforms into category 3, and it seems that this is where the innovation is focussed.
The platforms here are mainly independent alternatives to the establishment, each providing an encapsulated platform. The enabling technology under the hood is often not made known. They often take innovative, low-code to no-code approaches to the challenges of dialog state design, development and management. There is a possibility of these companies being acquired, price is often more negotiable, and feature requests are more likely to be accommodated, with a lower barrier to entry to get going.
Conversational AI — Category 4
I wrote previously on the fragmentation of Conversational AI implementations due to the advent of voicebots, and on implementations becoming more complex, with single-vendor platforms unable to accommodate all the niche requirements. The emerging vectors in Conversational AI open up opportunities for these service providers within the conversational AI landscape.
These products are often Natural Language Processing and Understanding tools where text or conversations can be analysed for intent, named entities, custom defined entities, etc.
Data annotation and training data improvement GUI tools are available in some cases, including tools for managing training data. Features not included are typical dialog state management, chatbot response management, etc.; the focus is on wider language processing implementations and not just conversational agents. These tools are often used for non-real-time, off-line processing of conversational text.
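As a rough illustration of the kind of off-line analysis described above, here is a toy sketch in plain Python. It is not how any of the listed products work: real NLU tools use trained models, whereas this uses keyword matching and a regular expression. The intent names, keywords and the order-number format are all hypothetical.

```python
import re

# Hypothetical intents and trigger keywords; real NLU tools learn these from training data.
INTENT_KEYWORDS = {
    "check_balance": {"balance", "funds"},
    "cancel_order": {"cancel", "refund"},
}

# Hypothetical custom-defined entity: order numbers such as "AB-1234".
ORDER_ID = re.compile(r"\b[A-Z]{2}-\d{4}\b")


def analyse(utterance):
    """Return the best-matching intent and any order IDs found in the text."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    # Score each intent by how many of its keywords appear in the utterance.
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    intent = best if scores[best] > 0 else "unknown"
    return {"intent": intent, "entities": {"order_id": ORDER_ID.findall(utterance)}}


# Off-line, batch-style processing of stored conversation text.
transcripts = [
    "Please cancel order AB-1234 and give me a refund",
    "What is my account balance?",
    "Hello there",
]
results = [analyse(t) for t in transcripts]
for text, result in zip(transcripts, results):
    print(text, "->", result)
```

Note that there is no dialog state or response handling here; the output is only a label and extracted values per utterance, which is the separation of concerns this category is about.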
Conversational AI — Category 5
Large Language Models (LLMs)
The LLMs listed here are the main commercial offerings. They are alike in their approach, with very similar-looking playgrounds. From the playground environment, code can be exported for the next step, which is usually a notebook.
A number of LLMs are being open-sourced. And in the case of Goose AI, much more affordable alternatives are being offered. OpenAI is following suit, dropping its prices considerably on 1 September 2022.
Making open-sourced LLMs available to the public not only democratises access to LLMs, but also undermines the idea of only a few holding the keys to LLMs.
Conclusion
In a follow-up post I would love to segment LLMs further into five sub-categories of implementation.
Technologies which I can add in a follow-up post are messaging and CX platforms that also bring their own bot builder or allow for integration of bots, like Glia and LivePerson, as well as SAP Conversational AI and Salesforce.