Rasa Introduced Incremental Training For Chatbots
See How This Feature Not Only Saves Time But Fuels CDD
Introduction
I am always surprised hearing people say Rasa is a difficult or technically challenging environment. From a prototyping perspective, it is in reality an easy and extremely intuitive platform with a very low barrier to entry.
I say this because you do not need:
- Specialized or specific hardware
- Specialized pre-installed software
- A complex install stack to manage and keep in sync
- A specific or single operating system
In general, the only real complaint I read regarding Rasa was training time.
Granted, that training is most probably being done on fairly low-grade hardware. However, when it comes to training performance, one needs to keep in mind that Rasa does not dictate or demand a very specific and complex hardware environment. This in turn yields massive accessibility; hence many users run the Rasa framework on their regular workstations or laptops.
This makes incremental training (fine-tuning) paramount. At some stage I wondered whether Rasa would ever look at a transfer-learning approach like NVIDIA Jarvis, but I subsequently realized that NVIDIA not only plays to specific hardware, but also to broader and more general domains; hence, non-domain-specific implementations. Think here of an interface in a car, for example.
Rasa's focus is perhaps on narrower, domain-specific implementations; instances where organizations want to use their own data, with the fewest possible demands on supporting hardware and software.
Looking at complexity in isolation, the NVIDIA Jarvis environment is much more challenging in my opinion.
A key focus area of Jarvis is transfer learning. There are significant cost savings when it comes to taking the advanced base models of Jarvis and repurposing them for specific uses.
The functionality currently available in Jarvis 1.0 Beta includes ASR, TTS and NLU. This introduces complexity and hardware demands.
There Are Conditions To Incremental Training
Incremental training does demand a few basic conditions. The first is that the configuration must remain the same; the only parameter you are allowed to adjust is the number of epochs (more about this later).
The second condition is that you cannot add new:
- Intents
- Actions
- Entities
- Slots
- ML Stories
In essence, the domain.yml file is a no-go for incremental training. What is convenient is that whenever changes are made to the model which cannot be accommodated by incremental training, you get an error upfront stating that the NLU model cannot be finetuned.
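To make this constraint concrete, here is a minimal domain.yml sketch, assuming a Rasa 2.x project; the intent, entity, slot and action names are purely illustrative. None of these sections may gain new entries between the last full training run and a fine-tuning run.

# domain.yml - structure must stay unchanged for incremental training
intents:
  - greet
  - check_balance          # no new intents may be added
entities:
  - account_type           # no new entities may be added
slots:
  account_type:
    type: text             # no new slots may be added
actions:
  - action_check_balance   # no new custom actions may be added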
But the good news is that you can add training data, such as new intent examples and entity examples, in different combinations, via incremental training.
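As an illustration, here is a minimal nlu.yml sketch, again assuming a Rasa 2.x project; the check_balance intent and its examples are hypothetical. Only extra examples are appended to an intent that already exists in the model; no new intents are introduced.

# nlu.yml - additional examples for an existing intent
version: "2.0"
nlu:
- intent: check_balance
  examples: |
    - what is my [savings](account_type) balance
    - how much money is in my [cheque](account_type) account
    - show me my balance please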
Existing ML stories can be edited and included in incremental training. I tested this extensively, and the training time is significantly shorter with incremental training than with retraining the whole model from scratch.
rasa train --finetune --epoch-fraction 0.2
Above is the command I used for training. As you can see, a fraction of the epochs can be set; the --epoch-fraction flag informs the training process to run for a fraction of the epochs specified in the model configuration.
For example, if a normal training cycle was set to 100 epochs, the command above would run a fine-tuning training cycle for 20 epochs.
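For reference, here is a minimal config.yml sketch, assuming a DIETClassifier-based pipeline and a TEDPolicy; the exact components are illustrative. The epochs values defined here are what --epoch-fraction is applied to, and nothing else in this file may change between the full training run and the fine-tuning run.

# config.yml - must remain identical apart from the effective epoch count
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: DIETClassifier
    epochs: 100            # with --epoch-fraction 0.2 this becomes 20 fine-tuning epochs
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    epochs: 100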
Fueling Conversation Driven Development
Collaboration is always important for larger teams and organizations; Rasa X provides this required collaboration environment. Rasa X turns relatively complex tasks into administrative endeavors.
Teams with product and process knowledge pertaining to the organization can edit and add NLU training data, and view, edit and compare ML Stories.
Incremental training comes into its own with Rasa X, as the model can be updated and enhanced with edits, and smaller incremental training runs can be performed, with a cadence of complete training when time and resources permit.
Again, you can only edit ML Stories and add or edit intent and entity training data. But imagine how spelling errors can be fixed and updates for new products and services added at a regular cadence.
What is lacking at this stage is managing incremental training from within Rasa X. Training data can be added, prepared and corrected within Rasa X, but executing and managing incremental training from within Rasa X is not currently possible.
The incremental training process can also assist with prototyping, and prototypes can be shared with an audience for testing and vetting via Rasa X's testing tool.
As training data grows, incremental training will become more crucial.
Bigger changes, like adding intents, entities and ML Stories, perhaps require some consideration and evaluation. I am not sure whether such bigger changes are a good fit for incremental improvements.
Conclusion
What does the future hold? And what might be practical to add to the Incremental Training feature?
Integration between Rasa X and incremental training is important: being able to train incrementally from within Rasa X. This will most probably come once incremental training becomes a fixture in Rasa.
Rasa does make it clear that one should not persist with incremental training indefinitely, but should train the whole model from time to time.
There is still no clarity on what continuous incremental training would do to the model's performance. A cadence of incremental (fine-tuning) training combined with full model training is essential. The frequency of mandatory full model training will most probably be established by tests in the near future…and by feedback from the community.