Using Wolfram Natural Language Processing For Data Visualization
And Making Your Chatbot Visually Compelling
Introduction
Data visualization can bring data to life and turn your chatbot into a seasoned storyteller. It helps users build an understanding quickly and effectively, based on the subject of the conversation.
Imagine someone asks a chatbot about the FIFA World Cup, and a map is returned marking every country that has historically taken part in the World Cup, weighted by frequency.
Or, on a topic like Mars, the chatbot returns a word cloud with words sized according to frequency.
This is a twist on traditional Natural Language Processing (NLP), where data is represented graphically to the user.
Presenting data in this way lowers the cognitive load on the user and makes the information more digestible. The beauty, of course, is that the graphic component is generated on the fly during the conversation, based entirely on the data retrieved. The NLP examples discussed in this story are only a few of the functions available when using the Wolfram Language within a Wolfram Notebook.
But what is Wolfram?
Wolfram Language is a new kind of general computer language, which redefines what’s practical to do with computers.
Wolfram Notebooks present an interactive sequence of inputs and outputs. They are ideal for learning, exploring and writing programs in the Wolfram Language. The Wolfram Language can also operate without its own interactive interface, in a whole variety of software-engineering configurations.
Version 12 of the Wolfram Language takes advantage of recent advances in deep learning to bring state-of-the-art capabilities in natural language understanding.
A collection of pre-trained neural net models is available to be used as is or fine-tuned to a specific language task. The best way to form an understanding of this is to look at a few simple, yet impressive, demos.
Data Extraction & Visual Representation
These are some of the simplest, yet most impressive, examples.
Find Locations in Text
In this example a Wikipedia entry is extracted on the subject of rice. A snippet or head of the extract can be displayed to ensure the data is extracted correctly.
Entities, in this case cities, countries and so on, can be located on a map. TextCases allows you to find all these entities at once. The code block shows how simple the commands are.
text = WikipediaData["Rice"];
Snippet[text, 4]
locations = TextCases[text, "LocationEntity" -> "Interpretation", VerifyInterpretation -> True];
GeoBubbleChart[Counts[locations]]
ReverseSort@Counts[Cases[locations, Entity["Country", _]]]
The Wolfram cloud notebook is a familiar environment if you are used to something like Jupyter Notebooks. After the data is extracted, the locations can be displayed on a world map according to location and frequency.
The countries are also sorted according to the counts of each.
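A sorted association of counts can be long, so for a chatbot reply it may help to show only the most frequent countries. The following is a minimal sketch, assuming the `locations` variable from the code above; the names `countryCounts` and `top` and the cutoff of 10 are illustrative choices, not part of the original example.

```wolfram
(* Count country entities and sort by descending frequency *)
countryCounts = ReverseSort@Counts[Cases[locations, Entity["Country", _]]];

(* Keep at most the 10 most frequent countries (cutoff is an arbitrary choice) *)
top = Take[countryCounts, UpTo[10]];

(* Horizontal bar chart labeled with each country's common name *)
BarChart[top, ChartLabels -> (CommonName /@ Keys[top]), BarOrigin -> Left]
```

A compact chart like this may suit a narrow chat window better than a full world map.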
Harvest Entities
Here we take the Wikipedia entry on the Moon, and extract all the entities. The output is a neat table with string, type, position and more.
moon = WikipediaData["Moon"];
Snippet[moon, 5]
contents = TextContents[moon, VerifyInterpretation -> True]
The entities can be counted with a single line of code, and a word cloud can be formed. I see this as an immensely graphic and intuitive design affordance for disambiguation.
In an open-domain chatbot, a user might ask about a vast subject like Mars. The same applies to a domain-specific chatbot when the intent is very general. The chatbot can respond with a word cloud, with the size of the text nudging the user towards the most relevant intent.
counts = ReverseSort@CountsBy[contents, #Type &]
WordCloud[counts]
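To return such a word cloud from a chatbot, the graphic has to be reachable outside the notebook. One possible approach, sketched here under assumptions, is to wrap the word cloud in an `APIFunction` and deploy it to the Wolfram Cloud. The name `wordCloudAPI` and the `topic` parameter are hypothetical, and for brevity the sketch builds the cloud from the article's words rather than from entity types.

```wolfram
(* Hypothetical API: takes a topic string and returns a word cloud as a PNG *)
wordCloudAPI = APIFunction[
  {"topic" -> "String"},
  WordCloud[DeleteStopwords[WikipediaData[#topic]]] &,
  "PNG"
];

(* Try it locally in the notebook before deploying *)
wordCloudAPI[<|"topic" -> "Mars"|>]

(* Deploying requires a Wolfram Cloud account; public access is an assumption *)
CloudDeploy[wordCloudAPI, Permissions -> "Public"]
```

`CloudDeploy` returns a cloud object whose URL a chatbot backend could call with a `topic` query parameter and relay the resulting image to the user.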
Find Dates in Text
This example demonstrates how to find dates or date elements (e.g. day, month, year, century) in text using TextCases.
For this example we take the subject of World War from Wikipedia.
text = WikipediaData["World War"];
Snippet[text, 4]
sentences = TextCases[text, "Date" -> {"HighlightedSnippet", "Interpretation"}, VerifyInterpretation -> True];
RandomSample[sentences, 3]
TimelinePlot[sentences[[All, 2]]]
TimelinePlot[
  Association[Rule @@@ sentences],
  PlotRange -> {"1900", "1950"},
  PlotLayout -> "Vertical",
  Background -> LightBlue
]
Imagine the practical implementation of this, where data like usage, sequence of events etc. needs to be displayed to the user. With a simple graphic, a mental picture can instantly be delivered to the user.
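As a small extension of the timeline example, the extracted dates can also be aggregated to show which years are mentioned most often. This is a minimal sketch, assuming the `sentences` variable from the code above; the name `dates` is illustrative.

```wolfram
(* Keep only interpretations that are date objects *)
dates = Cases[sentences[[All, 2]], _DateObject];

(* Histogram of date mentions, binned by year *)
DateHistogram[dates, "Year"]
```

Where a timeline shows individual events, a histogram summarizes where the text concentrates its attention.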
Conclusion
Chatbots are very much dependent on the design affordances available within each of the host mediums.
Being able to graphically represent data is powerful in any conversation.
Obviously this is not the exclusive domain of Wolfram, and there are other avenues to achieve the same result. However, looking at Wolfram’s NLP approach, it is clear that they are approaching the problem from a unique perspective.
Other considerations will be cost, latency in rendering and complexity of implementation.