Assertions Are Like Guardrails for LLM Apps

DSPy Assertions offer a different approach to guardrails, asserting computational constraints on foundation models.

5 min read · Jun 3, 2024


Introduction

In a previous post I gave some background on the basic architecture of DSPy and some of its possible use-cases.

Exploration & Optimisation

DSPy is well suited as an interface where you describe your needs, share a very small amount of data, and have DSPy generate the optimal prompts, prompt templates and prompting strategies.
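As a rough sketch of what that interface looks like in code, a program can be compiled against a handful of labelled examples with one of DSPy's optimisers. The program, metric and training example below are hypothetical placeholders, and an LM is assumed to already be configured via dspy.settings:

import dspy
from dspy.teleprompt import BootstrapFewShot

# Placeholder single-step program: answer a question directly
class MyProgram(dspy.Module):
    def __init__(self):
        super().__init__()
        self.answer = dspy.Predict("question -> answer")

    def forward(self, question):
        return self.answer(question=question)

# Placeholder metric and a very small labelled training set
def exact_match(example, pred, trace=None):
    return example.answer.lower() == pred.answer.lower()

trainset = [dspy.Example(question="What is 2 + 2?", answer="4").with_inputs("question")]

# DSPy compiles the program, bootstrapping demonstrations and prompts from the examples
teleprompter = BootstrapFewShot(metric=exact_match)
compiled_program = teleprompter.compile(MyProgram(), trainset=trainset)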

To get better results without spending a fortune, you should try different approaches like breaking tasks into smaller parts, refining prompts, increasing data, fine-tuning, and opting for smaller models. The real magic happens when these methods work together, but tweaking one can impact the others.

GUI

DSPy is a programmatic approach, and I can imagine how DSPy would benefit from a GUI for more basic implementations. Consider a user who can upload sample data, describe in natural language what they want to achieve, and then have a prompting strategy, with templates and so on, generated via the GUI.

Use-Case

When deciding whether DSPy is the right fit for your implementation, the use-case needs to be considered. This goes for all implementations: the use-case needs to lead.

In essence, DSPy is designed for scenarios where you require a lightweight, self-optimising programming model rather than relying on pre-defined prompts and integrations.

Brief Assessment of Assertions

I believe much can be gleaned from this implementation of guardrails…

  1. The guardrails can be described in natural language and the LLM can be leveraged to self-check its responses.
  2. More complicated checks can be created in Python, where values are passed to validation functions (a minimal sketch follows below this list).
  3. The freedom in how guardrails are described lends a high degree of flexibility to what can be set for specific implementations.
  4. The division between assertions and suggestions is beneficial, as it allows for a clearer delineation of checks.
  5. Additionally, the ability to define recourse adds another layer of flexibility and control to the process.
  6. The study’s language primarily revolves around constraining the LLM and defining runtime retry semantics.
  7. This approach also serves as an abstraction layer, folding self-refinement methods into arbitrary pipeline steps.
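As a small illustration of points 1 and 2, here is a minimal sketch of a custom Python check attached to a module's output via dspy.Suggest. The word_count_ok helper, the ConciseQA module and the 150-word limit are hypothetical placeholders, not taken from the DSPy paper or notebooks:

import dspy

def word_count_ok(text: str, limit: int = 150) -> bool:
    # Hypothetical guardrail: keep generated answers under a word limit
    return len(text.split()) <= limit

class ConciseQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.answer = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        pred = self.answer(question=question)
        # Soft constraint: if it fails, DSPy backtracks and retries with this feedback
        dspy.Suggest(
            word_count_ok(pred.answer),
            "Keep the answer under 150 words.",
        )
        return pred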

Hard & Soft Assertions

There are two types of assertions: hard and soft.

Hard Assertions represent critical conditions that, when violated after a maximum number of retries, cause the LM pipeline to halt, if so defined, signalling a non-negotiable breach of requirements.

On the other hand, suggestions denote desirable but non-essential properties; their violation triggers the self-refinement process, but exceeding a maximum number of retries does not halt the pipeline. Instead, the pipeline continues to execute the next module.

dspy.Assert(your_validation_fn(model_outputs), "your feedback message", target_module=YourDSPyModuleSignature)

dspy.Suggest(your_validation_fn(model_outputs), "your feedback message", target_module=YourDSPyModuleSignature)

Assertions make use of DSPy as a foundational framework.

Practical Example

Consider the code snippet below from a DSPy notebook:

# Signatures (GenerateSearchQuery, GenerateCitedParagraph) and helpers (deduplicate,
# citations_check, citation_faithfulness) are defined earlier in the notebook.
class LongFormQAWithAssertions(dspy.Module):
    def __init__(self, passages_per_hop=3, max_hops=2):
        super().__init__()
        self.generate_query = [dspy.ChainOfThought(GenerateSearchQuery) for _ in range(max_hops)]
        self.retrieve = dspy.Retrieve(k=passages_per_hop)
        self.generate_cited_paragraph = dspy.ChainOfThought(GenerateCitedParagraph)
        self.max_hops = max_hops

    def forward(self, question):
        context = []
        # Multi-hop retrieval: generate a query per hop and accumulate deduplicated passages
        for hop in range(self.max_hops):
            query = self.generate_query[hop](context=context, question=question).query
            passages = self.retrieve(query).passages
            context = deduplicate(context + passages)
        pred = self.generate_cited_paragraph(context=context, question=question)
        pred = dspy.Prediction(context=context, paragraph=pred.paragraph)
        # Soft constraint 1: the paragraph should carry a citation every 1-2 sentences
        dspy.Suggest(citations_check(pred.paragraph), "Make sure every 1-2 sentences has citations. If any 1-2 sentences lack citations, add them in 'text... [x].' format.", target_module=GenerateCitedParagraph)
        # Soft constraint 2: every cited segment should be faithful to its retrieved context
        _, unfaithful_outputs = citation_faithfulness(None, pred, None)
        if unfaithful_outputs:
            unfaithful_pairs = [(output['text'], output['context']) for output in unfaithful_outputs]
            for _, context in unfaithful_pairs:
                dspy.Suggest(len(unfaithful_pairs) == 0, f"Make sure your output is based on the following context: '{context}'.", target_module=GenerateCitedParagraph)
        else:
            return pred
        return pred

The assertions included aim to enforce the defined computational constraints, allowing the LongFormQA program to operate within these guidelines automatically.

In the first assertion, the program validates the output paragraph to ensure citations appear every 1–2 sentences. If this validation fails, the assertion backtracking logic activates with the feedback: “Ensure each 1–2 sentences include citations in ‘text… [x].’ format.”
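The citations_check helper itself is defined earlier in the notebook; purely as a hypothetical sketch of what such a validation could look like (not the notebook's actual implementation), a regex-based version might test each window of two sentences for a '[x]' marker:

import re

def has_citations(paragraph: str) -> bool:
    # Hypothetical check: every window of 1-2 sentences should contain a '[x]' style citation
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    for i in range(0, len(sentences), 2):
        window = " ".join(sentences[i:i + 2])
        if not re.search(r"\[\d+\]", window):
            return False
    return True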

In the second assertion, the CheckCitationFaithfulness program is used to verify the accuracy of each cited reference, examining text segments in the generated paragraph.

For unfaithful citations, it provides feedback with the context: “Ensure your output aligns with this context: ‘{context}’.”

This ensures the assertion backtracking has the necessary information and context.
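To actually trigger this backtracking behaviour at runtime, the program has to be activated with DSPy's assertion handling. A minimal sketch, assuming the assertion helpers shipped with DSPy 2.x and an LM plus retrieval model already configured via dspy.settings; the exact import path and the example question are illustrative and may differ between versions:

import dspy
from dspy.primitives.assertions import assert_transform_module, backtrack_handler

# Wrap the program so violated Assert/Suggest statements trigger backtracking and retries
longformqa = assert_transform_module(LongFormQAWithAssertions(), backtrack_handler)

# Equivalent shorthand on the module itself
longformqa = LongFormQAWithAssertions().activate_assertions()

response = longformqa(question="What castle did David Gregory inherit?")
print(response.paragraph)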

⭐️ Follow me on LinkedIn for updates on Large Language Models ⭐️

I’m currently the Chief Evangelist @ Kore AI. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.

