IBM Virtual Voice Creator

Cobus Greyling
3 min read · Jul 3, 2019

The IBM Virtual Voice Creator takes Text-To-Speech (TTS) synthesis technology to the next level.

The Virtual Voice Creator (VVC) lets enterprises and individual users create unique voices on demand, quickly and easily.

With the advent of Conversational Interfaces and Ambient Computing, there is a growing need for people to speak to devices, and the expectation is often that the devices speak back.

VVC allows for the creation of an entirely virtual voice that can be changed on the fly as the situation demands. Exaggerated voices can also be created for settings that call for them.

According to IBM, “This TTS technology employs the unit selection synthesis approach, that, as of today, provides the most natural sound and intonation achievable with modern commercial TTS systems. Watson TTS comes with a set of rich and meticulously cleaned standard voice datasets.

The IBM Virtual Voice Creator adds unique voice transformation capabilities to Watson TTS. We use a sophisticated offline analysis process to prepare the standard voice datasets for transformations that alter voice qualities and perceived speaker identity at synthesis time. The transformations modify various aspects of the voice components associated with the key organs of human speech production mechanism: the vocal folds and vocal tract.

The following speech samples demonstrate the effects of individual voice modifications.”

The web-based GUI lets you configure voices and experiment in a fun, interactive way. You can store your configurations and reuse them later to synthesize any text.

This is all delivered in the cloud.

From the left pane you can select an Original Voice. Clicking a voice populates the work area with the elements you can manipulate: Vocal Tract, Pitch, Phonation, Speed, Effects and Mood. Effects include Hoarse, Growling and Trembling. Mood includes Excited and Sad. Phonation includes Soft, Crisp, Tense and Breathy.
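To make the set of controls concrete, here is a minimal Python sketch of how such a voice configuration could be modeled and stored for reuse. It is purely illustrative: the control names come from the GUI described above, but the `VoiceConfig` class, the field types, the -1.0 to 1.0 value range and the JSON layout are all assumptions, not IBM's actual storage format.

```python
import json
from dataclasses import dataclass, field, asdict

# Control names as listed in the GUI; everything else here is hypothetical.
PHONATION = {"Soft", "Crisp", "Tense", "Breathy"}
EFFECTS = {"Hoarse", "Growling", "Trembling"}
MOODS = {"Excited", "Sad"}


@dataclass
class VoiceConfig:
    """Hypothetical model of one configured virtual voice."""
    original_voice: str
    vocal_tract: float = 0.0  # assumed range -1.0..1.0, 0.0 = unchanged
    pitch: float = 0.0        # assumed range -1.0..1.0
    speed: float = 0.0        # assumed range -1.0..1.0
    phonation: dict = field(default_factory=dict)  # e.g. {"Breathy": 0.4}
    effects: dict = field(default_factory=dict)    # e.g. {"Hoarse": 0.5}
    mood: dict = field(default_factory=dict)       # e.g. {"Sad": 0.2}

    def validate(self):
        """Reject unknown control names and out-of-range slider values."""
        groups = (
            ("phonation", PHONATION, self.phonation),
            ("effects", EFFECTS, self.effects),
            ("mood", MOODS, self.mood),
        )
        for name, allowed, chosen in groups:
            unknown = set(chosen) - allowed
            if unknown:
                raise ValueError(f"unknown {name} setting(s): {sorted(unknown)}")
        for value in (self.vocal_tract, self.pitch, self.speed):
            if not -1.0 <= value <= 1.0:
                raise ValueError(f"slider value out of range: {value}")
        return self

    def to_json(self):
        """Serialize the configuration so it can be stored and reused."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        """Rebuild and validate a configuration from its JSON form."""
        return cls(**json.loads(text)).validate()
```

A stored configuration can then be reloaded with `VoiceConfig.from_json(saved_text)` and applied to any new text, which mirrors the store-and-reuse workflow the GUI offers.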

There are Gallery Voices and User Voices.

Another helpful feature is the Script-to-Audio page. Here you can paste a script, or upload a file containing one, with its different characters. Once you have added your script, click the Add Voices button.

This allows you to select virtual voices and assign them to the different roles in your script. You can choose different colors associated with a role to have a better visual representation of the script.
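The role-to-voice assignment can be pictured with a small Python sketch. It assumes a script in which every line has the form `ROLE: spoken text`; the actual script format VVC accepts is not described here, and the function names and voice names below are hypothetical.

```python
def parse_script(text):
    """Split a script into (role, speech) pairs.

    Assumes one 'ROLE: spoken text' entry per non-empty line.
    """
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so speech may contain colons.
        role, sep, speech = line.partition(":")
        if not sep or not role.strip() or not speech.strip():
            raise ValueError(f"cannot parse line: {raw!r}")
        sections.append((role.strip(), speech.strip()))
    return sections


def assign_voices(sections, voices):
    """Attach a voice name to every section, failing on unassigned roles."""
    missing = sorted({role for role, _ in sections} - set(voices))
    if missing:
        raise KeyError(f"no voice assigned to role(s): {missing}")
    return [(role, voices[role], speech) for role, speech in sections]


script = """
NARRATOR: It was a quiet night.
ROBOT: System check at 10:30 complete.
"""
cast = {"NARRATOR": "Calm Storyteller", "ROBOT": "Metallic Growl"}
for role, voice, speech in assign_voices(parse_script(script), cast):
    print(f"{role} [{voice}]: {speech}")
```

Failing early on a role without a voice is a deliberate choice in this sketch: it avoids producing audio for only part of a script.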

Once completed, click the Audio download button in the upper-right corner to save all the audio locally to your machine as a ZIP archive containing one audio file per section.
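Once the archive is on disk, unpacking it can be scripted. The following is a minimal sketch using Python's standard `zipfile` module; the archive name, the output folder and the set of audio file extensions are assumptions, since the exact contents of the download are not documented here.

```python
import zipfile
from pathlib import Path

# Assumption: the section-wise files use one of these common audio extensions.
AUDIO_SUFFIXES = {".wav", ".mp3", ".ogg"}


def extract_audio(zip_path, out_dir):
    """Extract the archive and return the paths of its audio files, sorted."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Only report archive members that look like audio files.
        names = [
            name for name in archive.namelist()
            if Path(name).suffix.lower() in AUDIO_SUFFIXES
        ]
        archive.extractall(out)
    return sorted(out / name for name in names)
```

Calling `extract_audio("script_audio.zip", "audio")` (both names hypothetical) would return the extracted files in sorted order, ready to be played or stitched together.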

Before closing the session, click the Save button in the upper-right corner to save your project locally as an ivvc-file. The next time you open the Script-to-Audio page, you can Open the ivvc-file and continue working on the script from where you stopped.
