
Conversation & Document Summarisation Have Been Added To Microsoft Cognitive Services

Microsoft Cognitive Services has been known for STT, TTS and the NLU prowess of LUIS. Of late it seems like Microsoft is expanding into the area of Large Language Models. One example of this is the recently announced summarisation feature.

Cobus Greyling
8 min read · Aug 16, 2022


Introduction

On 24 May 2022 Microsoft introduced summarisation for conversations and documents.

Summarisation is now one of the features offered by Azure Cognitive Service for Language.

Both document and conversation summarisation can be applied to chat logs, speech transcripts, documents or any snippets of text which require summarisation. Microsoft sees this summarisation tool as a ready-for-use interface enabling end-to-end analysis of speech conversations, from audio to transcript to insights and more. Considering LLM functionality at large, one could argue that summarisation is part and parcel of the current offering.

Certain functionality has become synonymous with Large Language Models (LLMs). The same five implementation or functionality groups are represented across the commonly known LLM providers such as OpenAI, Cohere and AI21labs.

These LLMs have playgrounds and APIs. What has been missing is a studio or no-code interface to leverage them. HumanFirst is leading the way with its POC integration with Cohere, and NVIDIA has also alluded to a similar approach for easing access to LLMs.

Generation and Summarisation

Most LLM providers follow the same basic approach, where Generation is one of a handful of LLM functions. These functions or groupings might be few, but they are exceptionally powerful and are only found in the domain of LLMs.

By making use of casting, the Generation API can be leveraged to generate text based on the cast. A cast is the combination of training instructions and the text to be used as reference; the training instructions can be seen as a type of few-shot training.

Below is a summarisation example from OpenAI’s playground, where the instruction (cast) is given: “Summarise the following:”, followed by the text to perform the summarisation on.
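For readers who want to try the same cast outside the playground, it can be sent to OpenAI's Completions API with cURL. This is a minimal sketch, not the playground's exact call: the model name (text-davinci-002) and parameter values are illustrative, and $OPENAI_API_KEY is a placeholder environment variable.

curl https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "text-davinci-002",
    "prompt": "Summarise the following:\n\n<paste the reference text here>",
    "max_tokens": 120,
    "temperature": 0.3
  }'

The generated summary is returned in the choices[0].text field of the JSON response.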

Back To Microsoft

Microsoft has two specific APIs for summarisation: one aimed at conversations between customers and service representatives, and one to summarise documents.

According to Microsoft, these two features can operate on both chat logs and speech transcripts, allowing seamless integration with Microsoft’s Cognitive Service for Speech. An example of a ready-for-use solution is Ingestion Client, which enables end-to-end analysis of speech conversations, from audio to transcript to insights.

This first release is trained with GPT-3, focussing on the needs of customer support and call centres.

What makes Microsoft's approach different is that it has specific APIs and specific use cases in mind. The summarisation API will form part of Microsoft's CCAI (Contact Centre AI) strategy as an agent-assist feature.

According to Microsoft:

Customer support agents typically spend 5–15 minutes writing notes when wrapping up each call or text chat, or when they transfer a case to the next level of support. This considerable time and effort significantly slow down the time to resolution. Our new feature automatically generates a summary of issues and resolutions from a two-party conversation, especially between a customer and an agent, which can greatly reduce case handling time, increase agents’ job satisfaction, sustain high customer engagement, improve customer experiences, as a result boost customer loyalty. Built with this API offering in Azure Cognitive Services, Dynamics 365 Customer Service now enables this capability out-of-box for their customers.

Demo Time

Document Summarisation

Starting with the document summarisation…

To make use of the API, you will need to create an Azure Language resource. This can be done using free credits during a trial period before moving on to pay-as-you-go, but you will need to enter your credit card details.
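If you prefer the command line over the portal, the resource can also be created with the Azure CLI. A minimal sketch, assuming the TextAnalytics resource kind used by the Language service and the free F0 tier; the names and region are placeholders.

# Create a resource group and a Language resource (names/region are placeholders)
az group create --name my-language-rg --location westeurope
az cognitiveservices account create \
  --name my-language-resource \
  --resource-group my-language-rg \
  --kind TextAnalytics \
  --sku F0 \
  --location westeurope \
  --yes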

Once you have created the Azure resource, click on Keys and Endpoint to view the endpoint and access keys you will use to call the API.
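The same values can be fetched from the CLI, again as a sketch using the placeholder names from above:

# Endpoint of the resource
az cognitiveservices account show \
  --name my-language-resource \
  --resource-group my-language-rg \
  --query "properties.endpoint" --output tsv

# Access keys
az cognitiveservices account keys list \
  --name my-language-resource \
  --resource-group my-language-rg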

Microsoft supplies the CURL commands which can be edited with your resource endpoint and access key.

As seen below, once the command is sent, an apim-request-id code is returned.

The subsequent API call retrieves the document summarisation using the ID returned by the previous request; hence this is seemingly an asynchronous, batch-style process.

The input API code:
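In outline, the submission call looks like the sketch below. The api-version string (2022-05-15-preview here) and the task parameters are assumptions based on the public preview documentation at the time of writing and may have changed; $LANGUAGE_ENDPOINT and $LANGUAGE_KEY are placeholders for your resource endpoint and access key.

# Submit an extractive summarisation job; -i shows the response headers
curl -i -X POST "$LANGUAGE_ENDPOINT/language/analyze-text/jobs?api-version=2022-05-15-preview" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: $LANGUAGE_KEY" \
  -d '{
    "displayName": "Document summarisation example",
    "analysisInput": {
      "documents": [
        { "id": "1", "language": "en", "text": "<text to summarise>" }
      ]
    },
    "tasks": [
      {
        "kind": "ExtractiveSummarization",
        "taskName": "Document summarisation task",
        "parameters": { "sentenceCount": 3 }
      }
    ]
  }'
# The response body is empty; the job URL (containing the job ID) is returned
# in the operation-location response header, alongside an apim-request-id header.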

And requesting the results…
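Again as a sketch, retrieving the result is a GET against the jobs endpoint, using the job ID taken from the operation-location header of the submission call:

# <job-id> comes from the operation-location header returned above
curl -X GET "$LANGUAGE_ENDPOINT/language/analyze-text/jobs/<job-id>?api-version=2022-05-15-preview" \
  -H "Ocp-Apim-Subscription-Key: $LANGUAGE_KEY"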

The full JSON response…
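Once the job status is succeeded, the extracted sentences sit under tasks.items[].results.documents[].sentences in that response. Assuming those documented field names (treat the path as an assumption to verify against your own output), the summary text can be pulled out with jq:

# Print only the extracted summary sentences (field path is an assumption)
curl -s "$LANGUAGE_ENDPOINT/language/analyze-text/jobs/<job-id>?api-version=2022-05-15-preview" \
  -H "Ocp-Apim-Subscription-Key: $LANGUAGE_KEY" \
  | jq -r '.tasks.items[0].results.documents[0].sentences[].text'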

Conversation Summarisation

Below is the test conversational data supplied by Microsoft…

And the result view from the Microsoft documentation:

Unfortunately, conversation summarisation is currently a gated public-preview feature for which you need to apply, and there is a waiting period of 10 days.
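For completeness, once access is granted the conversation API follows the same asynchronous pattern as the document API. The sketch below shows an assumed request shape based on the preview documentation, with placeholder dialogue rather than Microsoft's sample data; the api-version, task kind and summaryAspects values should be checked against the current documentation.

# Assumed shape of a conversation summarisation job submission (preview)
curl -i -X POST "$LANGUAGE_ENDPOINT/language/analyze-conversations/jobs?api-version=2022-05-15-preview" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: $LANGUAGE_KEY" \
  -d '{
    "displayName": "Conversation summarisation example",
    "analysisInput": {
      "conversations": [
        {
          "id": "1",
          "language": "en",
          "modality": "text",
          "conversationItems": [
            { "id": "1", "participantId": "Customer", "text": "<customer utterance>" },
            { "id": "2", "participantId": "Agent", "text": "<agent utterance>" }
          ]
        }
      ]
    },
    "tasks": [
      {
        "kind": "ConversationalSummarizationTask",
        "taskName": "Issue and resolution summary",
        "parameters": { "summaryAspects": ["Issue", "Resolution"] }
      }
    ]
  }'
# Results are then fetched from /language/analyze-conversations/jobs/<job-id>,
# as with the document summarisation job above.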

Conclusion

With regard to LLMs there are a few things happening, three of note. The first is that a handful of companies are specialising in Large Language Models to varying degrees; as mentioned at the outset of this article, these include OpenAI, Cohere and AI21labs.

Secondly, models are being open-sourced, which is democratising access to large language models.

Thirdly, the traditional cloud platforms are looking at integrating LLMs into their products, making it easier for organisations to include documents in their search and knowledge bases. This approach makes the integration of LLMs seamless; the LLM acts as a supporting feature, disappearing into the background.

The question is often asked: what is the real-world production value of LLMs? Seemingly Microsoft is trying to change this, with specific use-case-based APIs and by positioning these LLM implementations as supporting technology in orchestrating initiatives like CCAI.

https://www.linkedin.com/in/cobusgreyling/

https://techcommunity.microsoft.com/t5/ai-cognitive-services-blog/announcing-preview-of-two-conversation-features-in-cognitive/ba-p/3402905
