If you are deeply and truly curious, you are someone who likes to understand, and likes to get something working by creating your own environment and self-installing…
Reading documentation, guides and tutorials does lend some degree of insight.
But mastery starts with creating an environment, whether on virtual or physical hardware, then setting up the development and runtime environment and running at least a demonstration application.
Stopping at the point of reading creates a false sense of mastery and understanding. It is only once you start building the environment and getting demo applications to run that you learn by doing.
When I got access to the NVIDIA Jarvis framework, I really wanted to get to grips with the environment. It is easy to be overwhelmed when getting started with an environment like Jarvis. There are often mental barriers when confronted with deep learning and a sophisticated environment like NVIDIA's.
Jarvis brings deep learning to the masses. This article is not a tutorial, but rather a guide on how to:
- Start as small and simple as possible.
- Become familiar with the environment, to some extent at least.
- And spiral outwards from an initial prototype in measured iterations, increasing functionality and complexity as you go.
The multimodal aspect of Jarvis is best understood in the context of where NVIDIA wants to take Jarvis in terms of functionality.
What is exciting about this collection of functionality is that Jarvis is poised to become a true Conversational Agent. As humans, we communicate not only by voice, but also by detecting the gaze of the speaker, lip activity etc.
Another key focus area of Jarvis is transfer learning. There is a significant cost saving in taking the advanced base models of Jarvis and repurposing them for specific uses.
The functionality currently available in Jarvis 1.0 Beta includes ASR, TTS and NLU.
Virtual Voice Assistant
This Virtual Assistant sample application demonstrates how to use Jarvis AI Services, specifically ASR, NLP, and TTS, to build a simple but complete conversational AI application.
It demonstrates receiving input via speech from the user, interpreting the query via an intent recognition and slot filling approach, compiling a response, and speaking this back to the user in a natural voice.
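To make the intent and slot idea a little more concrete, here is a minimal Python sketch of the "compiling a response" step. It does not use any Jarvis API. The `NLUResult` class, the `weather.query` intent name, the `location` slot and the `fake_lookup` helper are all hypothetical names chosen for illustration, and the real sample application may structure this quite differently.

```python
from dataclasses import dataclass, field


@dataclass
class NLUResult:
    """Hypothetical container for what an NLU step might hand back."""
    intent: str
    slots: dict = field(default_factory=dict)


def compose_response(result: NLUResult, lookup_weather) -> str:
    """Turn an intent plus its slots into the text that would be passed to TTS."""
    if result.intent != "weather.query":
        return "Sorry, I can only answer questions about the weather."
    location = result.slots.get("location")
    if not location:
        # A required slot is missing, so ask a follow-up question instead.
        return "Which location would you like the weather for?"
    report = lookup_weather(location)
    return (
        f"It is {report['temperature']} degrees and "
        f"{report['description']} in {location}."
    )


def fake_lookup(location: str) -> dict:
    # Stand-in for a real weather API call; the values are made up.
    return {"temperature": 21, "description": "sunny"}


reply = compose_response(
    NLUResult("weather.query", {"location": "Cape Town"}), fake_lookup
)
print(reply)
```

The point of the sketch is the shape of the flow: the intent decides which branch runs, the slots supply the details, and a missing slot leads to a clarifying question rather than a failure.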
Read more about the installation process here.
To install and run the Jarvis Voicebot demo, start your Jarvis services:
Download the samples image from NGC.
docker pull nvcr.io/nvidia/jarvis/jarvis-speech-client:1.0.0-b.1-samples
Run the service within a Docker container.
docker run -it --rm -p 8009:8009 nvcr.io/nvidia/jarvis/jarvis-speech-client:1.0.0-b.1-samples /bin/bash
Within this directory…
Update config.py with the right Jarvis IP, hosting port and your weatherstack API access key (from https://weatherstack.com/). Then, start the server.
Getting your weatherstack API, on the free tier…
Getting your API Access Key…
Get Your API Access key and update config.py
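As a rough illustration of what the access key is used for, the sketch below builds a request URL for the weatherstack current-weather endpoint and checks that placeholder values have been replaced. The setting names, the `YOUR_` placeholder convention and both helper functions are assumptions made for this example. The actual config.py in the samples image may use different names and a different layout.

```python
from urllib.parse import urlencode

# Hypothetical settings, not the real contents of config.py.
settings = {
    "JARVIS_SPEECH_API_URL": "YOUR_JARVIS_IP:PORT",   # placeholder
    "WEATHERSTACK_ACCESS_KEY": "YOUR_ACCESS_KEY",     # placeholder
}


def unset_settings(values: dict) -> list:
    """Return the names of settings that are empty or still placeholders."""
    return sorted(
        name for name, value in values.items()
        if not value or value.startswith("YOUR_")
    )


def build_weather_url(city: str, access_key: str) -> str:
    """Build a current-weather request URL; no network call is made here."""
    query = urlencode({"access_key": access_key, "query": city})
    return f"http://api.weatherstack.com/current?{query}"


print(unset_settings(settings))
print(build_weather_url("Cape Town", "abc123"))
```

Running a check like `unset_settings` before starting the server is one way to catch a forgotten key early, rather than discovering it when the first weather query fails.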
Start the service…
Below you can see the NVIDIA Jarvis weather bot, accessible at the URL https://127.0.0.1:8009/jarvisWeather/. Again, you will have to set up SSH tunneling from your virtual machine. Read about that here.
For a closer look at example code for ASR, TTS and NLU, see the Jupyter Notebook examples…
My first thought was that getting past the point of my own installation and running the demos would be very daunting…seeing this is NVIDIA and deep learning etc.
But on the contrary, getting to grips with Jarvis on a demo application level was straightforward when following the documentation. After running this basic demo voicebot, what are the next steps?