NVIDIA Riva & Google Dialogflow

And What I Have Learnt By Building A Prototype

Cobus Greyling
7 min read · Nov 25, 2021

Introduction

The NVIDIA Riva demo application illustrates how a chatbot and voicebot configuration can make use of different technologies in different scenarios for NLU/P and dialog state management.

This demo application incorporates two very different approaches to dialog state management, making use of both Google Dialogflow and Rasa Machine Learning stories. In the case of Rasa, the NLU portion can reside in either Riva or Rasa.

This illustrates how best-of-breed components can be combined into a conversational AI solution. It also highlights the flexibility of the Riva framework.

NVIDIA Riva, Rasa & Google Dialogflow

NVIDIA is busy developing Riva Studio which will help in building applications such as chatbots, virtual assistants and multimodal virtual assistants that leverage Riva skills.

This alludes to the fact that NVIDIA will follow a skills approach, like Microsoft, IBM and others. Looking at the demo examples, it could stand NVIDIA in good stead to stay open to different approaches to dialog state management.

I have written in the past how a Riva dialog state management environment might look.

The Four Pillars Of Traditional Chatbot Architecture

In general, Chatbots have four pillars:

  1. Intents (NLU)
  2. Entities (NLU)
  3. Dialog State Management
  4. Script (NLG) Management

Points 1 & 2, and 3 & 4 are usually tightly coupled. In general these four points are often seen as necessarily combined in one framework, from one vendor.

For a Voicebot two additional technologies need to be added:

  1. Speech To Text (Automatic Speech Recognition, ASR/STT)
  2. Text To Speech (Speech Synthesis, TTS).

As it stands, NVIDIA Riva has the following technologies from a Conversational AI perspective: TTS, STT, NLP & NLU. The element which is currently not part of the Riva offering is dialog state development and management.

The NVIDIA Riva & Google Dialogflow demo architecture.

The NVIDIA Riva demo application with Google Dialogflow makes use of Dialogflow’s NLU and Dialog State Management. This approach is in stark contrast to the Rasa demo, where the dialog is developed vastly differently.

For a detailed article on how to install and run NVIDIA Riva, read more here.
For the NVIDIA Riva demo making use of Rasa, read more here.
Or, for the NVIDIA Riva Jupyter Notebook examples, go here.

The assistant making use of Rasa demonstrates the integration of Rasa and the Riva Speech Service in the form of a weather chatbot web application.

This demo application is available in two configurations:

Configuration 1: Riva ASR + Riva TTS + Riva NLP + Rasa dialog manager

This shows how NVIDIA Riva can be used to extend a Rasa chatbot into a voicebot, making use of Riva TTS & STT.

In addition, the full functionality of Riva can be employed, with Rasa leveraged only for the Dialog Management.

Configuration 2: Riva ASR + Riva TTS + Rasa NLU + Rasa dialog manager

Hence it is clear that different components can be combined for the best suited solution. Existing implementations can also be leveraged by Riva.
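The two configurations can be summarised as a small data structure. The following is a minimal Python sketch and not code from the NVIDIA demo; the `VoicebotConfig` class, its field names and the `differing_components` helper are hypothetical, chosen only to show which framework supplies each component.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VoicebotConfig:
    """Hypothetical record of which framework supplies each component."""
    asr: str              # speech to text
    tts: str              # text to speech
    nlu: str              # intent and entity extraction
    dialog_manager: str   # dialog state management


# Configuration 1: Riva handles speech and NLU, Rasa only manages the dialog.
CONFIG_1 = VoicebotConfig(asr="Riva", tts="Riva", nlu="Riva", dialog_manager="Rasa")

# Configuration 2: Riva handles speech only, Rasa handles NLU and the dialog.
CONFIG_2 = VoicebotConfig(asr="Riva", tts="Riva", nlu="Rasa", dialog_manager="Rasa")


def differing_components(a: VoicebotConfig, b: VoicebotConfig) -> list[str]:
    """Return the component names that the two configurations assign differently."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


print(differing_components(CONFIG_1, CONFIG_2))  # prints ['nlu']
```

Seen this way, the only thing that changes between the two configurations is where the NLU lives; the speech components and the dialog manager stay the same.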

Riva & Google Dialogflow

This Virtual Assistant, making use of Google Dialogflow, shows the integration of Google Dialogflow and Riva Speech Services in the form of a weather chatbot web application.

This demo shows Riva being used for ASR and TTS and Google Dialogflow for NLP and Dialog State Management (DM).

The integration uses the native API support of Google Dialogflow and gRPC support in Riva.

The Weatherbot Client coordinates the workflow with Riva Services and Dialogflow, then interacts with the end-user via a web UI.

There are three primary parts to this solution: Riva AI Services, the Dialogflow Weatherbot, and the Weatherbot Client application.
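The way the client coordinates a single user turn can be sketched roughly as follows. This is an illustration under stated assumptions, not the demo's actual code: `handle_turn` and the three `fake_*` stages are hypothetical stand-ins, so the example runs without Riva or Dialogflow installed. In the demo, the first and last stages would be calls to the Riva speech services and the middle stage a call to the Dialogflow API.

```python
from __future__ import annotations

from typing import Callable

# Each stage is modelled as a plain callable so real clients can be swapped in.
Transcriber = Callable[[bytes], str]   # audio in, transcript out (Riva ASR in the demo)
DialogAgent = Callable[[str], str]     # user text in, reply text out (Dialogflow in the demo)
Synthesizer = Callable[[str], bytes]   # reply text in, audio out (Riva TTS in the demo)


def handle_turn(
    audio: bytes,
    transcribe: Transcriber,
    converse: DialogAgent,
    synthesize: Synthesizer,
) -> tuple[str, str, bytes]:
    """Run one user turn: speech to text, dialog, then text to speech."""
    transcript = transcribe(audio)
    reply = converse(transcript)
    return transcript, reply, synthesize(reply)


# Stand-in stages (hypothetical): they only mimic the shape of the real services.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_dialogflow(text: str) -> str:
    if "weather" in text.lower():
        return "It is sunny."
    return "Sorry, I did not get that."


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


transcript, reply, speech = handle_turn(
    b"What is the weather in Paris?", fake_asr, fake_dialogflow, fake_tts
)
print(transcript, "->", reply)
```

Keeping each stage behind a simple callable mirrors the point of the demo: the dialog stage can be Dialogflow or Rasa without the speech stages changing.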

After installing Riva, some work is required on the Google side.

Create a Google Cloud Project

Enable the Dialogflow API and save the connection credentials.

Follow the Set up Authentication instructions. When done, run the "use the service account key file in your environment" step inside the Riva Samples container.
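Before starting the client, it can help to confirm that the key file is visible inside the container. The sketch below assumes the authentication step sets the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which is the variable Google's client libraries read; the `check_service_account_key` helper itself is hypothetical and not part of the demo.

```python
import json
import os
from pathlib import Path


def check_service_account_key(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> str:
    """Return the project_id from the key file named by env_var, or raise ValueError."""
    key_path = os.environ.get(env_var)
    if not key_path:
        raise ValueError(f"{env_var} is not set")

    path = Path(key_path)
    if not path.is_file():
        raise ValueError(f"{key_path} does not exist")

    # Service account key files are JSON documents with a "type" field.
    key = json.loads(path.read_text(encoding="utf-8"))
    if key.get("type") != "service_account":
        raise ValueError("key file is not a service account key")

    return key.get("project_id", "")
```

Calling `check_service_account_key()` in the Riva Samples container should return the Google Cloud project ID if the step succeeded, and raise a `ValueError` describing what is missing otherwise.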

Click the Settings button next to the agent name in the Dialogflow console.

Under the Export and Import tab, choose Restore From ZIP and upload the zipped folder from your host.

<Path to riva-dialogflow-va-temp>/dialogflow-weatherbot/dialogflow-weatherbot.zip

This will load the Dialogflow application within the Dialogflow Essentials workspace.

A key next step is to add the integration under Fulfillment. After opening the Fulfillment section, enable the Inline Editor in the Dialogflow console. Then paste the code supplied in the fulfillment folder.

To enable the Inline Editor in Dialogflow Fulfillment, billing is required.

According to the NVIDIA documentation, billing is not required. However, without billing activated, Fulfillment cannot be used.

After activating the environment, navigate to the chatbot client folder and start the chatbot web server.

Open the web UI at:

https://0.0.0.0:6006/rivaWeather

A screenshot of the chat interface with speech and text input and output.

The interface returns a valid error stating that the Dialogflow API could not be called, which is expected when Fulfillment is not configured.

The easiest way to access the web interface is to make use of SSH tunneling. When the chatbot web server is started in the SSH window, the port is displayed. Use this port to set up an SSH tunnel and access the URL remotely via a browser.
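As a rough illustration, the tunnel command can be assembled as shown below. The host name `riva-server.example.com` and the user `ubuntu` are placeholders, and `ssh_tunnel_command` is a hypothetical helper; only the standard OpenSSH options `-N` and `-L` are used, with port 6006 taken from the URL above.

```python
from __future__ import annotations

import shlex


def ssh_tunnel_command(remote_host: str, user: str, port: int = 6006) -> list[str]:
    """Build an ssh command that forwards a local port to the same port on the remote host."""
    return [
        "ssh",
        "-N",                               # no remote command, only forward ports
        "-L", f"{port}:localhost:{port}",   # local port -> same port on the remote machine
        f"{user}@{remote_host}",
    ]


cmd = ssh_tunnel_command("riva-server.example.com", "ubuntu")
print(shlex.join(cmd))
# prints: ssh -N -L 6006:localhost:6006 ubuntu@riva-server.example.com
```

With the tunnel open, the same `/rivaWeather` path should be reachable in a local browser on `localhost` and the forwarded port.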

Conclusion

The positives are overwhelming…

  • Use can be made of any compatible approach to Dialog State Management which is well illustrated by the demo applications. NVIDIA is not tied, currently, to a specific methodology.
  • Implementations can be cloud, or local/edge.
  • Riva speaks to mission critical, industrial strength cognitive services & Conversational AI.
  • A new framework for high-performance ASR, TTS and NLU.
  • Developers have access to transfer learning and can leverage the investment made by NVIDIA.
  • The NVIDIA GPU environment addresses mission critical requirements, where latency can be negated.
  • Clear roadmap for Riva in terms of the near future and imminent features.
  • Riva addresses requirements for ambient ubiquitous interfaces.
  • Riva demo applications from NVIDIA illustrate the flexibility of Riva by accommodating different dialog state approaches. This can become their strong point: flexibility on an element where there are disparate approaches.

Considerations

  • Access, development and deployment seem daunting and the framework appears complicated. In this article I wanted to debunk access apprehensions. However, production deployment will most certainly be complex.
  • Most probably for a production environment specific hardware considerations will be paramount; especially where cloud/connectivity latency cannot be tolerated.

The services available now via Riva are:

  • Speech recognition trained on thousands of hours of speech data with stream or batch mode.
  • Speech synthesis available in batch and streaming mode.
  • NLU APIs with a host of services.

As stated before, the advent of Riva will surely be a jolt to the current marketplace, especially with embedded conversational AI solutions.

The freedom of installation and the open architecture will stand NVIDIA in good stead. As noted, production architecture and deployment will demand careful consideration.
