DeepPavlov Is An Open Source Conversational AI Framework

Here Are A Few Key Concepts & How To Start Using DeepPavlov

Cobus Greyling
8 min read · Nov 16, 2021

Introduction

In general, the aim of most chatbot development frameworks is to create an environment that gives people with mid-level technical skills easy onboarding.

Performing only NLP tasks allows for a simple data-in, data-out environment.

As a conversational agent grows and evolves, more complexity is introduced through elements like dialog management and maintaining context, along with other conversational elements like:

  • Disambiguation
  • Digression
  • Managing Fallback-Proliferation
  • Re-Establishing Context
  • Auto Learning
  • Domain & Irrelevance
  • Handling Compound Intents
  • Mixed Modality & Conversational Components
  • Contextual Entities
  • Variation

As chatbot development frameworks move from a no-code environment all the way up to native code (pro-code), the ability to fine-tune increases. In most cases the barrier to entry increases as well.

Hence there needs to be flexibility, but also an interface for developing and managing dialog state. The challenge is to have a natural and adaptive dialog which is also predictable and manageable.

The more no-code or low-code the solution becomes, the more the fine-tuning options diminish. The more fine-tuning, the more complexity.

There has been much talk about the low-code approach to software development, how it acts as a catalyst for rapid development, and how it acts as a vehicle for delivering solutions with minimal bespoke hand-coding.

Low-code interfaces are made available via a single tool or a collection of tools which are very graphic in nature and initially intuitive to use, thus giving the guise of rapid onboarding and of speeding up the delivery of solutions to production.

As with many approaches of this nature, initially it seems like a very good idea. However, as functionality, complexity and scaling start playing a role, huge impediments are encountered.

When someone refers to the ability to fine-tune, or the extent to which fine-tuning can be performed, what exactly are they referring to? Here are a few common elements which constitute fine-tuning:

  • Forms & Slots
  • Intents
  • Entities
  • Natural Language Generation (NLG)
  • Dialog Management
  • Digression
  • Disambiguation

Where Does DeepPavlov Fit In?

DeepPavlov definitely finds itself at the higher end of the spectrum, being a native/pro-code framework with a machine learning approach.

DeepPavlov refers to a Semantic Frame. This includes Natural Language Understanding, encompassing Domain Detection, Intent and Entities.
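As a rough illustration, a semantic frame can be thought of as a small record holding the detected domain, the intent and the extracted entities. The sketch below is plain Python; the class name `SemanticFrame`, its field names and the example values are hypothetical and are not taken from the DeepPavlov API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SemanticFrame:
    """Hypothetical container for the result of an NLU step."""
    domain: str                                              # e.g. "restaurants"
    intent: str                                              # e.g. "find_restaurant"
    entities: Dict[str, str] = field(default_factory=dict)   # slot name -> value


# Hand-written frame for the utterance "cheap restaurant in the south".
frame = SemanticFrame(
    domain="restaurants",
    intent="find_restaurant",
    entities={"pricerange": "cheap", "area": "south"},
)
```

Keeping the three parts in one record makes it easy to pass the NLU result on to whatever manages the dialog.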

In the DeepPavlov world, a digital agent consists of a collection of skills, which is managed by a Skill Manager.

A skill is made up of different Components.

  • A skill fulfills the user goal in a particular domain.
  • A Model is any NLP model that doesn’t necessarily communicate with the user in natural language.
  • Components are reusable functional parts of a model or skill.
  • There are rule-based models and ML Models.
  • ML Models can be trained independently and in an end-to-end mode being joined in a chain.
  • The Skill Manager performs selection of the correct skill to generate the response.
  • A chainer builds a model pipeline from heterogeneous components (Rule-based/ML/DL). It allows one to train and infer
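To make these concepts more tangible, here is a minimal sketch of how components, a chained pipeline, skills and a skill manager could relate to each other. It is a toy in plain Python and not DeepPavlov's implementation; the names `chain`, `Skill` and `SkillManager`, and the keyword-based confidence score, are assumptions made for illustration.

```python
from typing import Callable, List

# A component is any callable that transforms its input (rule-based or ML).
Component = Callable[[str], str]


def chain(components: List[Component]) -> Component:
    """Join components into one pipeline, applied left to right."""
    def pipeline(text: str) -> str:
        for component in components:
            text = component(text)
        return text
    return pipeline


class Skill:
    """A skill answers requests in one domain (hypothetical interface)."""

    def __init__(self, name: str, keywords: List[str], respond: Component):
        self.name = name
        self.keywords = keywords
        self.respond = respond

    def confidence(self, text: str) -> float:
        # Toy rule-based score: fraction of keywords found in the text.
        hits = sum(1 for keyword in self.keywords if keyword in text)
        return hits / len(self.keywords)


class SkillManager:
    """Selects the skill with the highest confidence for each utterance."""

    def __init__(self, skills: List[Skill]):
        self.skills = skills

    def __call__(self, text: str) -> str:
        best = max(self.skills, key=lambda skill: skill.confidence(text))
        return best.respond(text)


# Two rule-based components chained into a preprocessing pipeline.
normalize = chain([str.strip, str.lower])

restaurant = Skill("restaurant", ["restaurant", "food"],
                   lambda text: "What kind of food would you like?")
weather = Skill("weather", ["weather", "rain"],
                lambda text: "I cannot see the forecast yet.")
manager = SkillManager([restaurant, weather])

reply = manager(normalize("  Cheap RESTAURANT please "))
```

In a real agent the confidence would come from a trained model rather than keyword matching, but the selection step stays the same.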

Goal-Oriented Bot In DeepPavlov

The framework needs to be provided with a dataset (in RASA or DSTC2 format). You then train the model, download it, and use it either by calling it natively from Python or by raising it as a microservice and calling it via the standard DeepPavlov REST API.

Currently, DeepPavlov supports two ways to define the domain model and behavior of a given goal-oriented skill: the RASA format (domain.yml, nlu.md, stories.md) or the DSTC2 format.

The training/validation/test data are stored in JSON files. Below is an example of training data:
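When the model runs as a microservice, calling it comes down to an HTTP POST with a JSON body. The sketch below only builds such a request with the Python standard library and does not send it. The URL, port, path and the argument name `x` are placeholders I assume for illustration; the real values depend on how the service was started and on the model configuration.

```python
import json
from urllib import request

# Assumed values; adjust them to match the service that is actually running.
SERVICE_URL = "http://localhost:5000/model"   # hypothetical host, port and path
ARGUMENT_NAME = "x"                           # hypothetical input name of the model


def build_request(utterances):
    """Build (but do not send) a POST request carrying a batch of utterances."""
    payload = json.dumps({ARGUMENT_NAME: list(utterances)}).encode("utf-8")
    return request.Request(
        SERVICE_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request(["cheap restaurant in the south"])
# To send it against a running service: request.urlopen(req).read()
```

Sending a list rather than a single string keeps the call shape the same for one utterance or a batch.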

[
  [
    {
      "speaker": 2,
      "text": "Hello, welcome to the Cambridge restaurant system. You can ask for restaurants by area, price range or food type. How may I help you?",
      "slots": [],
      "act": "welcomemsg"
    },
    {
      "speaker": 1,
      "text": "cheap restaurant",
      "slots": [["pricerange", "cheap"]]
    },
    {
      "speaker": 2,
      "text": "What kind of food would you like?",
      "slots": [],
      "act": "request_food"
    },
    {
      "speaker": 1,
      "text": "any",
      "slots": [["this", "dontcare"]]
    },
    {
      "speaker": 2,
      "text": "What part of town do you have in mind?",
      "slots": [],
      "act": "request_area"
    },
    {
      "speaker": 1,
      "text": "south",
      "slots": [["area", "south"]]
    },
    {
      "speaker": 2,
      "text": "api_call area=\"south\" food=\"dontcare\" pricerange=\"cheap\"",
      "db_result": {
        "food": "chinese",
        "pricerange": "cheap",
        "area": "south",
        "addr": "cambridge leisure park clifton way cherry hinton",
        "phone": "01223 244277",
        "postcode": "c.b 1, 7 d.y",
        "name": "the lucky star"
      },
      "slots": [["area", "south"], ["pricerange", "cheap"], ["food", "dontcare"]],
      "act": "api_call"
    },
    {
      "speaker": 2,
      "text": "The lucky star is a nice place in the south of town serving tasty chinese food.",
      "slots": [["area", "south"], ["pricerange", "cheap"], ["name", "the lucky star"], ["food", "chinese"]],
      "act": "inform_area+inform_food+offer_name"
    },
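To get a feel for this structure, the turns of a dialog can be walked with plain Python. The sketch below is my own and not part of DeepPavlov: it loads a shortened copy of the dialog above, accumulates the slots the user has mentioned, and lists the system's dialog acts. The helper names `track_state` and `system_acts` are hypothetical.

```python
import json

# Shortened copy of the dialog shown above (one dialog = a list of turns).
RAW = """
[
  {"speaker": 2, "text": "How may I help you?", "slots": [], "act": "welcomemsg"},
  {"speaker": 1, "text": "cheap restaurant", "slots": [["pricerange", "cheap"]]},
  {"speaker": 2, "text": "What part of town do you have in mind?", "slots": [], "act": "request_area"},
  {"speaker": 1, "text": "south", "slots": [["area", "south"]]}
]
"""

USER, SYSTEM = 1, 2  # speaker ids as they appear in the sample


def track_state(turns):
    """Accumulate the slots mentioned by the user across the dialog."""
    state = {}
    for turn in turns:
        if turn["speaker"] == USER:
            for name, value in turn.get("slots", []):
                state[name] = value
    return state


def system_acts(turns):
    """List the dialog acts of the system turns, in order."""
    return [turn["act"] for turn in turns if turn["speaker"] == SYSTEM]


dialog = json.loads(RAW)
state = track_state(dialog)
acts = system_acts(dialog)
```

Here `state` ends up holding the price range and area, which is the information the bot needs before it can make its `api_call`.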

You can now iterate over batches of preprocessed DSTC-2 dialogs:

User utterances:
----------------
[   {'prev_resp_act': None, 'text': ''},
    {'prev_resp_act': 'welcomemsg', 'text': 'id like to find a restaurant'},
    {'prev_resp_act': 'request_pricerange', 'text': 'in appleton wiscon'},
    {   'db_result': {   'addr': '88 mill road city centre',
                         'area': 'centre',
                         'food': 'chinese',
                         'name': 'rice house',
                         'phone': '01223 367755',
                         'pricerange': 'cheap'},
        'prev_resp_act': 'api_call',
        'text': 'in appleton wiscon'},
    {'prev_resp_act': 'offer_name', 'text': 'restaurant'},
    {   'prev_resp_act': 'offer_name',
        'slots': [['food', 'irish']],
        'text': 'irish food'},
    {   'db_result': {},
        'prev_resp_act': 'api_call',
        'slots': [['food', 'irish']],
        'text': 'irish food'},
    {'prev_resp_act': 'canthelp_food', 'text': 'appleton wisconsin'},
    {'prev_resp_act': 'canthelp_food', 'text': 'thank you'},
    {'prev_resp_act': 'canthelp_food', 'text': 'good bye'}]

System responses:
-----------------
[   {   'act': 'welcomemsg',
        'text': 'Hello, welcome to the Cambridge restaurant system. You can '
                'ask for restaurants by area, price range or food type. How '
                'may I help you?'},
    {   'act': 'request_pricerange',
        'text': 'Would you like something in the cheap, moderate, or expensive '
                'price range?'},
    {   'act': 'api_call',
        'text': 'api_call area="dontcare" food="dontcare" '
                'pricerange="dontcare"'},
    {   'act': 'offer_name',
        'slots': [['name', 'rice house']],
        'text': 'Rice house is a great restaurant.'},
    {   'act': 'offer_name',
        'slots': [['name', 'rice house']],
        'text': 'Rice house is a great restaurant.'},
    {   'act': 'api_call',
        'slots': [['food', 'irish']],
        'text': 'api_call area="dontcare" food="irish" pricerange="dontcare"'},
    {   'act': 'canthelp_food',
        'slots': [['food', 'irish']],
        'text': 'I am sorry but there is no irish restaurant that matches your '
                'request.'},
    {   'act': 'canthelp_food',
        'slots': [['food', 'irish']],
        'text': 'I am sorry but there is no irish restaurant that matches your '
                'request.'},
    {   'act': 'canthelp_food',
        'slots': [['food', 'irish']],
        'text': 'I am sorry but there is no irish restaurant that matches your '
                'request.'},
    {'act': 'bye', 'text': 'You are welcome!'}]
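Looking at the output above, every system response is paired with a user utterance that carries the previous system act, and with the database result if the previous act was an `api_call`. The sketch below reproduces that pairing from raw turns. It is a simplified reconstruction based only on the printed output, not DeepPavlov's actual reader; the function name `to_pairs` and the sample turns are assumptions.

```python
def to_pairs(turns):
    """Pair each system response with the user utterance that precedes it.

    If the system speaks without a new user turn, the last user utterance
    is reused (an empty one at the very start of the dialog).
    """
    pairs = []
    user = {"text": ""}
    previous = {"prev_resp_act": None}
    for turn in turns:
        if turn["speaker"] == 1:                      # user turn: remember it
            user = {"text": turn["text"]}
            if turn.get("slots"):
                user["slots"] = turn["slots"]
        else:                                         # system turn: emit a pair
            response = {"act": turn["act"], "text": turn["text"]}
            pairs.append(({**user, **previous}, response))
            previous = {"prev_resp_act": turn["act"]}
            if "db_result" in turn:                   # carried to the next pair
                previous["db_result"] = turn["db_result"]
    return pairs


# Hand-written sample turns in the raw format shown earlier.
turns = [
    {"speaker": 2, "text": "How may I help you?", "act": "welcomemsg"},
    {"speaker": 1, "text": "cheap restaurant", "slots": [["pricerange", "cheap"]]},
    {"speaker": 2, "text": "api_call pricerange=\"cheap\"", "act": "api_call",
     "db_result": {"name": "the lucky star"}},
    {"speaker": 2, "text": "The lucky star is a nice place.", "act": "offer_name"},
]
pairs = to_pairs(turns)
```

The repeated user utterance after an `api_call` is what lets the model condition its next response on the database result.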

Testing the bot after training in a Colab Notebook. The link to the notebook is below.

Interacting with the chatbot via a Google Colab Notebook.

More on Dialog Development Approaches

Read more detail here

Design Canvas

Advantages of this approach are:

  • Ease of collaboration
  • Panning and viewing of the design
  • Zoom in and out to see more or less detail
  • Combining the design and development processes.
  • Suitable for quick prototyping & co-creation.

Disadvantages of this approach are:

  • Complexity of large implementations
  • Change management and impact assessments
  • Troubleshooting and identifying conversation break points.
  • Multiple conditions per dialog node which are impacted when parameters change.

Dialog Configuration

Advantages of this approach are:

  • Slightly more condensed presentation of the conversation
  • Restrictive nature prohibits impulsive changes.
  • More technical in nature with varying levels of configuration.
  • Suitable for quick prototyping.

Disadvantages of this approach are:

  • Difficult to present and perform walk-through
  • For larger conversations there is mounting complexity and cross-referencing.
  • Requires mindfulness of how parameter and setting changes will cascade.
  • Not suited as a conversation design tool.

Native Code

Advantages of this approach are:

  • Non-proprietary in terms of development environment and language.
  • Flexible and accommodating to change in scaling (in principle)
  • No dedicated or specific skills or knowledge required.
  • Porting of code, or even re-use.

Disadvantages of this approach are:

  • Design and implementation are far removed from each other.
  • Design interpretation might be a challenge.
  • Another, most probably dedicated, design tool will be required.
  • The complexity of managing different permutations in the dialog still needs to exist; within the code.
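To illustrate that last point, here is a minimal sketch of a hand-coded dialog flow in native code. It is a toy of my own, not taken from any framework: each state names the slot it asks for, the prompt, and the next state. The state names and the `run` helper are hypothetical; the prompts and slot names are borrowed from the restaurant example earlier in this article.

```python
# Each state: (slot to fill, prompt to show, next state).
FLOW = {
    "ask_price": ("pricerange",
                  "Would you like something in the cheap, moderate, or expensive price range?",
                  "ask_food"),
    "ask_food": ("food", "What kind of food would you like?", "ask_area"),
    "ask_area": ("area", "What part of town do you have in mind?", "done"),
}


def run(answers):
    """Walk the fixed flow, consuming one user answer per state."""
    state, slots, transcript = "ask_price", {}, []
    answers = iter(answers)
    while state != "done":
        slot, prompt, next_state = FLOW[state]
        transcript.append(prompt)       # what the bot says
        slots[slot] = next(answers)     # what the user replies
        state = next_state
    return slots, transcript


slots, transcript = run(["cheap", "any", "south"])
```

The flow only works if the user answers in exactly this order. Every digression, correction or out-of-order answer needs another state or branch, which is where the permutations start to pile up in the code.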

ML Stories

Advantages of this approach are:

  • Everyone knows the state machine needs to be deprecated; this achieves that.
  • Training time is reasonable.
  • No dedicated or specific hardware required.
  • No dedicated ML experts and data scientists required…AI for the masses.
  • Complexity is hidden and presented in a simple way.

Disadvantages of this approach are:

  • This approach may seem abstract and intangible to some.
  • Apprehension in instances where mandatory data needs to be collected, or where legislation dictates conditions. However, this is where Form Policies come into play.
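The idea behind a form can be sketched in a few lines: whatever a learned policy might prefer, the form keeps requesting data until every mandatory slot is filled. This is a plain-Python illustration and not the implementation of any framework; the `REQUIRED` list and the `next_action` helper are hypothetical, and the action names follow the dialog acts in the sample data earlier in this article.

```python
REQUIRED = ["pricerange", "food", "area"]  # hypothetical mandatory slots


def next_action(filled):
    """Request the first missing mandatory slot, otherwise proceed."""
    for slot in REQUIRED:
        if slot not in filled:
            return "request_" + slot
    return "api_call"


first = next_action({})
second = next_action({"pricerange": "cheap", "area": "south"})
final = next_action({"pricerange": "cheap", "food": "dontcare", "area": "south"})
```

Because the check runs on the filled slots rather than on the turn order, the user can supply the values in any sequence and the bot still collects everything that is mandatory.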

Conclusion

I found quite a few links on the DeepPavlov site broken, and some of the demos did not work. However, there is no doubt regarding the technical and architectural astuteness of the DeepPavlov conversational AI framework.

DSTC2 seems very similar to Rasa ML Stories.

Advantages:

  • DeepPavlov is open and extremely configurable.
  • Scalability & flexibility are paramount.
  • Comprehensive documentation.

Challenges:

  • Extremely steep learning curve.
  • Complex and challenging environment.
  • Performing NLP tasks will be much easier than building a conversational interface.
  • Building and productionizing a large-scale solution will demand remarkable technical astuteness.
