If you are deeply and truly curious, you are someone who likes to understand how things work, and who likes to get something working by creating your own environment and installing it yourself…
Reading documentation, guides and tutorials does lend some degree of insight.
But mastery starts with creating an environment, whether on virtual or physical hardware, then setting up the development and runtime environment and running at least a demonstration application.
Stopping at the point of reading creates a false sense of mastery and understanding. It is only once you start building the environment and getting demo applications to run that you learn by doing.
When I got access to the NVIDIA Riva framework, I really wanted to get to grips with the environment. It is easy to be overwhelmed when getting started with an environment like Riva. There are often mental barriers when confronted with deep learning and a sophisticated environment like NVIDIA's.
Riva brings deep learning to the masses, and this article is not a tutorial, but rather a guide on how to:
- Start as small and simple as possible.
- Become familiar with the environment, to some extent at least.
- And spiral outwards from an initial prototype in measured iterations, with increasing functionality and complexity.
The multimodal aspect of Riva is best understood in the context of where NVIDIA wants to take Riva in terms of functionality.
What is exciting about this collection of functionality is that Riva is poised to become a true Conversational Agent.
As humans we communicate not only by voice, but also by detecting the gaze of the speaker, lip activity etc.
Another key focus area of Riva is transfer learning. There is a significant cost saving in taking the advanced base models of Riva and repurposing them for specific uses.
The functionality currently available in Riva includes ASR, TTS and NLU.
Virtual Voice Assistant
This Virtual Assistant sample application demonstrates how to use Riva AI Services, specifically ASR, NLP, and TTS, to build a simple but complete conversational AI application.
It demonstrates receiving input via speech from the user, interpreting the query via an intent recognition and slot filling approach, compiling a response, and speaking this back to the user in a natural voice.
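The flow described above can be sketched in a few lines of Python. This is only an illustration with stand-in functions, not the Riva client API: `transcribe`, `understand`, `compose_reply` and `synthesize` are hypothetical names, and the keyword matching merely stands in for Riva's intent recognition and slot filling models.

```python
from dataclasses import dataclass, field


@dataclass
class Interpretation:
    """Result of the NLU step: one intent plus named slots."""
    intent: str
    slots: dict = field(default_factory=dict)


def transcribe(audio: bytes) -> str:
    # Stand-in for ASR: pretend the audio bytes are already UTF-8 text.
    return audio.decode("utf-8")


def understand(text: str) -> Interpretation:
    # Stand-in for intent recognition and slot filling.
    # Toy rule: "weather" signals the intent, the words after "in" fill the location slot.
    words = text.lower().rstrip("?.!").split()
    if "weather" in words:
        slots = {}
        if "in" in words and words.index("in") + 1 < len(words):
            slots["location"] = " ".join(words[words.index("in") + 1:])
        return Interpretation("weather", slots)
    return Interpretation("unknown")


def compose_reply(result: Interpretation) -> str:
    if result.intent == "weather" and "location" in result.slots:
        # A real assistant would query a weather service at this point.
        return f"Looking up the weather in {result.slots['location'].title()}."
    if result.intent == "weather":
        return "Which location do you mean?"
    return "Sorry, I did not understand that."


def synthesize(text: str) -> bytes:
    # Stand-in for TTS: return the reply as bytes instead of audio.
    return text.encode("utf-8")


def handle_turn(audio: bytes) -> bytes:
    """One conversational turn: speech in, speech out."""
    return synthesize(compose_reply(understand(transcribe(audio))))


if __name__ == "__main__":
    print(handle_turn(b"What is the weather in Cape Town?").decode("utf-8"))
```

Keeping each stage behind its own function means a stand-in can later be swapped for a real service call without touching the rest of the turn logic.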
Read more about the installation process here.
To install and run the Riva Voicebot demo, start your Riva services:
Download the samples image from NGC.
docker pull nvcr.io/nvidia/riva/riva-speech-client:1.4.0-beta-samples
Run the service within a Docker container.
docker run -it --rm -p 8009:8009 nvcr.io/nvidia/riva/riva-speech-client:1.4.0-beta-samples /bin/bash
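The `-p 8009:8009` flag publishes port 8009 of the container on the host. Once a server is listening inside the container, a small check like the one below can confirm that the port is reachable from the host. It is a minimal sketch using only the Python standard library; `is_port_open` is a hypothetical helper name, and a successful connection only shows that something accepts connections on that port, not that the demo itself is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


if __name__ == "__main__":
    print("port 8009 reachable:", is_port_open("localhost", 8009))
```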
Within this directory…
Update config.py with the right Riva IP, hosting port and your weatherstack API access key (from https://weatherstack.com/). Then, start the server.
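Before starting the server it can help to confirm that none of these values were left as placeholders. The sketch below assumes a dictionary-style configuration; the key names (`RIVA_SPEECH_API_URL`, `WEATHERSTACK_ACCESS_KEY`, `PORT`) and the helper `unfilled_settings` are illustrative assumptions and may not match the sample's actual config.py.

```python
# Hypothetical settings; the key names are illustrative and may differ from
# the ones used in the sample's actual config.py.
example_config = {
    "RIVA_SPEECH_API_URL": "<IP>:<PORT>",           # Riva server address as host:port
    "WEATHERSTACK_ACCESS_KEY": "<API_ACCESS_KEY>",  # key from weatherstack.com
    "PORT": 8009,                                   # port published by the docker run command
}


def unfilled_settings(config: dict) -> list:
    """Return the names of string settings that are empty or still hold a <PLACEHOLDER>."""
    missing = []
    for name, value in config.items():
        if isinstance(value, str) and (not value.strip() or "<" in value):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = unfilled_settings(example_config)
    if problems:
        print("Still to fill in:", ", ".join(problems))
    else:
        print("Configuration looks complete.")
```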
Getting your weatherstack API, on the free tier…
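Once you have a key, you can build a request URL and open it in a browser to confirm that the key works. This is a minimal sketch: the endpoint and the `access_key` and `query` parameters reflect my understanding of weatherstack's current-weather API, the plain `http` scheme is an assumption about the free tier, and `build_weather_url` is a hypothetical helper, so check weatherstack's own documentation before relying on it.

```python
from urllib.parse import urlencode

# Assumption: weatherstack's "current weather" endpoint.
WEATHERSTACK_CURRENT_URL = "http://api.weatherstack.com/current"


def build_weather_url(access_key: str, location: str) -> str:
    """Build the request URL for the current weather at a location."""
    if not access_key:
        raise ValueError("a weatherstack access key is required")
    # urlencode escapes spaces and special characters in the location name.
    query = urlencode({"access_key": access_key, "query": location})
    return f"{WEATHERSTACK_CURRENT_URL}?{query}"


if __name__ == "__main__":
    # "YOUR_ACCESS_KEY" is a placeholder, not a working key.
    print(build_weather_url("YOUR_ACCESS_KEY", "Cape Town"))
```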
For a closer look at example code for ASR, TTS and NLU, see the Jupyter Notebook examples…
My first thought was that getting past the point of my own installation and running the demos would be very daunting…seeing that this is NVIDIA and deep learning etc.
But on the contrary, getting to grips with Riva on a demo application level was straightforward when following the documentation. After running this basic demo voicebot, what are the next steps?
Cobus Greyling - Medium