Four Levels of RAG — Research from Microsoft

Improving Retrieval-Augmented Generation (RAG) involves classifying queries based on user intent and focusing on context. It also involves using small language models (SLMs) and fine-tuning to deliver more accurate and relevant results.

5 min read · Nov 13, 2024

In Short

Selecting the right RAG (Retrieval-Augmented Generation) architecture depends primarily on the specific use case and implementation requirements, ensuring the system aligns with task demands.

Agentic RAG is set to grow in importance, aligning with the concept of Agentic X, where agentic abilities are embedded within personal assistants, workflows, and processes.

Here, the “X” represents the boundless adaptability of agentic systems, enabling seamless task automation and informed decision-making across diverse contexts for enhanced organisational efficiency and autonomy.

Synthesising diverse document sources is crucial for addressing complex, multi-part queries effectively.

Introduction

The challenge of delivering an accurate RAG implementation includes retrieving relevant data, interpreting user intent accurately, and leveraging LLMs’ reasoning abilities for complex tasks.

Reasoning can be enhanced via an agentic approach to RAG such as ReAct, in which the model works through an interleaved sequence of reasoning and acting steps.

Something I found interesting in this study is its claim that no single solution fits all data-augmented LLM applications.

Context refers to the information surrounding a conversation that helps the AI understand the user’s intent and provide relevant, coherent responses.
This includes factors such as the user’s previous inputs, the current task, the environment, and any external data that might influence the conversation.
Effective context handling enables the AI to maintain a consistent and personalised dialogue, adjusting responses based on the ongoing interaction and ensuring that the conversation feels natural and meaningful.

User Intent Detection

In many instances, system underperformance stems from either failing to pinpoint the main focus of a task or from tasks that require a combination of skills, which must be carefully separated for optimal results.

Intent refers to the underlying purpose or goal behind a user’s input, representing what the user wants to achieve or communicate through their query.
Recognising intent allows the AI system to respond appropriately.
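
As a rough illustration of intent recognition, the sketch below scores a user message against keyword sets. The intent names and keywords are hypothetical and not taken from the study; a production system would more likely use a trained classifier or an LLM, so treat this as a minimal stand-in.

```python
import re

# Hypothetical intents and trigger keywords, chosen only for illustration.
INTENT_KEYWORDS = {
    "check_balance": {"balance", "funds"},
    "reset_password": {"password", "reset", "login"},
}


def detect_intent(text: str) -> str:
    """Return the intent whose keywords overlap most with the message."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {
        intent: len(tokens & keywords)
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to "unknown" when no keyword matched at all.
    return best if scores[best] > 0 else "unknown"


print(detect_intent("I forgot my password and need to reset it"))
```

Returning "unknown" rather than the closest guess lets the system ask a clarifying question instead of acting on a weak match.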

RAG Data Classification

Level 1: Explicit Fact Queries

Directly request specific, known facts.

Queries are about explicit facts directly present in the given data without requiring any additional reasoning.

This is the simplest form of query, where the model’s task is primarily to locate and extract the relevant information. When a user asks a question, the RAG implementation targets a fact contained in the chunked data.
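
A Level 1 lookup can be sketched as picking the chunk that best matches the question. The example below assumes a tiny hand-written chunk list and uses plain word overlap as the similarity measure; real implementations typically use embeddings and a vector store, so this is only a simplified stand-in.

```python
import re

# Hypothetical chunked data, standing in for an indexed document store.
CHUNKS = [
    "Canberra is the capital of Australia.",
    "The Eiffel Tower is in Paris.",
    "Python is a programming language.",
]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list) -> str:
    """Return the chunk sharing the most words with the query."""
    query_tokens = tokenize(query)
    return max(chunks, key=lambda chunk: len(query_tokens & tokenize(chunk)))


print(retrieve("What is the capital of Australia?", CHUNKS))
```

No reasoning is involved here: the answer is already stated in a single chunk, and the system only has to find it.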

Level 2: Implicit Fact Queries

Seek facts indirectly, needing interpretation to identify the answer.

Queries are about implicit facts in the data, which are not immediately obvious and may require some level of common sense reasoning or basic logical deductions.

The necessary information might be spread across multiple segments or require simple inferencing.

For instance, the question “What is the majority party now in the country where Canberra is located?” can be answered by combining the fact that Canberra is in Australia with the information about the current majority party in Australia.

At Level 2, reasoning and action elements start to appear, which calls for a more agentic approach to RAG.
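
The Canberra question can be sketched as a small reason-and-act loop in the spirit of ReAct. In this minimal example the "policy" is a scripted function standing in for the LLM, the fact store is a hypothetical dictionary, and "Party A" is a placeholder value rather than a real-world claim.

```python
# Hypothetical fact store standing in for retrieved chunks.
FACTS = {
    "country of Canberra": "Australia",
    "majority party of Australia": "Party A",  # placeholder, not a real claim
}


def search(query: str) -> str:
    """Toy retrieval tool: exact-match lookup in the fact store."""
    return FACTS.get(query, "unknown")


def scripted_policy(observations: list) -> tuple:
    """Stand-in for the LLM: returns (thought, action, argument)."""
    if len(observations) == 0:
        return ("First find the country Canberra is in.",
                "search", "country of Canberra")
    if len(observations) == 1:
        country = observations[0]
        return (f"Now find the majority party of {country}.",
                "search", f"majority party of {country}")
    return ("Both facts are known.", "finish", observations[1])


def run(policy, max_steps: int = 5) -> tuple:
    """Alternate thought, action and observation until the policy finishes."""
    observations, trace = [], []
    for _ in range(max_steps):
        thought, action, argument = policy(observations)
        trace.append((thought, action, argument))
        if action == "finish":
            return argument, trace
        observations.append(search(argument))
    return None, trace


answer, trace = run(scripted_policy)
for thought, action, argument in trace:
    print(f"Thought: {thought} | Action: {action}({argument})")
```

The second search depends on the result of the first, which is what separates this from the single lookup of a Level 1 query.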

Level 3: Interpretable Rationale Queries

Focus on understanding reasoning behind facts and require data that supports logical explanation.

These queries require both factual knowledge and the ability to interpret and apply specific domain-based guidelines that are essential to the context of the data.

Such rationales are often provided in external resources but are rarely encountered in the initial pre-training of a general language model.

For example, in financial auditing, an LLM may need to follow regulatory compliance guidelines to assess if a company’s financial statements meet standards.

Similarly, in technical support, it may need to follow troubleshooting workflows to assist users, ensuring responses are precise and align with established protocols.
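
One simple way to give the model an explicit rationale is to retrieve the relevant guideline and place it in the prompt. The sketch below assumes a hypothetical troubleshooting workflow keyed by issue type; the steps and names are invented for illustration, and the resulting prompt would be sent to whichever LLM is in use.

```python
# Hypothetical troubleshooting workflow, standing in for documented guidelines.
GUIDELINES = {
    "no_connection": [
        "Confirm the router is powered on.",
        "Ask the user to restart the router.",
        "Escalate to a technician if the issue persists.",
    ],
}


def build_prompt(issue: str, user_message: str) -> str:
    """Embed the documented workflow for this issue into the LLM prompt."""
    steps = GUIDELINES[issue]
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        "Follow these troubleshooting steps in order and do not skip any:\n"
        f"{numbered}\n\n"
        f"User message: {user_message}\n"
    )


print(build_prompt("no_connection", "My internet is down"))
```

Because the workflow is stated in the prompt, the model's answer can be checked against a known, documented procedure.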

Level 4: Hidden Rationale Queries

Seek deeper insights, often requiring context-based reasoning to uncover underlying meanings or implications.

This category of queries requires the AI to infer complex rationales that aren’t explicitly documented, relying on patterns and outcomes observed within the data.

These hidden rationales involve implicit reasoning and logical connections that are challenging to pinpoint and extract.

For instance, in IT operations, a language model might analyse patterns from past incident resolutions to identify successful strategies.

Similarly, in software development, the AI could draw on past debugging cases to infer effective problem-solving methods. By synthesising these implicit insights, the model can deliver responses that reflect nuanced, experience-based decision-making.
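
The idea of learning from past incident resolutions can be illustrated with a simple frequency count. This is a deliberately simplified stand-in for what a language model would infer from historical data: the incident records are hypothetical, and the "rationale" is reduced to the resolution rate of each action.

```python
from collections import defaultdict

# Hypothetical incident history; fields and values are invented for illustration.
INCIDENTS = [
    {"symptom": "high latency", "action": "restart service", "resolved": True},
    {"symptom": "high latency", "action": "restart service", "resolved": False},
    {"symptom": "high latency", "action": "scale out", "resolved": True},
    {"symptom": "high latency", "action": "scale out", "resolved": True},
    {"symptom": "disk full", "action": "rotate logs", "resolved": True},
]


def best_action(symptom: str, incidents: list):
    """Return the action with the highest resolution rate for a symptom."""
    stats = defaultdict(lambda: [0, 0])  # action -> [resolved count, total count]
    for incident in incidents:
        if incident["symptom"] == symptom:
            stats[incident["action"]][1] += 1
            if incident["resolved"]:
                stats[incident["action"]][0] += 1
    if not stats:
        return None  # no history for this symptom
    return max(stats, key=lambda action: stats[action][0] / stats[action][1])


print(best_action("high latency", INCIDENTS))
```

Nothing in the records says why one action works better; the pattern only emerges from the outcomes, which is the sense in which the rationale is hidden.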

Agentic Discovery

Interpretable and Hidden Rationales shift the focus to a RAG system’s ability to understand and apply the reasoning behind the data.

These levels require deeper cognitive processes, where the Agentic Framework aligns with expert knowledge or extracts insights from unstructured historical data.

According to the study and considering the image above, there is a distinction between queries requiring explicit facts and those dependent on implicit reasoning.

For example, a query about visa eligibility requires clear facts from the consulate’s guidelines (Level 3), while a question about the economic impact on a company’s future development demands an analysis of financial reports and economic trends (Level 4).

The data dependency in both cases underscores the importance of external sources — whether official documentation or expert analysis.

In both cases, providing rationales helps contextualise responses, offering not just answers but informed reasoning behind them.

Chief Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.

Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs…

Large language models (LLMs) augmented with external data have demonstrated remarkable capabilities in completing…

arxiv.org


COBUS GREYLING

Where AI Meets Language | Language Models, AI Agents, Agentic Applications, Development Frameworks & Data-Centric…

cobusgreyling.com