Meta-Prompt

Meta-Prompt for building self-improving agents with a single universal prompt.

6 min read · Jul 14, 2023

I’m currently the Chief Evangelist @ HumanFirst. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.

The key principle underpinning Meta-Prompting is to have the agent reflect on its own performance and amend its own instructions accordingly, while using a single overarching meta-prompt throughout.

It is worth noting that a meta-prompt can also be created through prompt tuning, which is a more opaque and technical process.

The agent uses a looping process which starts with no instructions and follows these steps:

Engage in conversation with a user, who may provide requests, instructions, or feedback.
At the end of an episode, generate self-criticism and a new instruction using the meta-prompt.
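
The loop above can be sketched in plain Python without calling a model. This is a minimal illustration under stated assumptions, not the implementation used later in this post: `fake_assistant`, `fake_meta_step` and `run_episodes` are hypothetical stand-ins, and the meta step simply appends the user's last message to the instructions where a real agent would call an LLM with the meta-prompt.

```python
# Minimal sketch of the meta-prompt loop with stubbed model calls.
# All names are hypothetical; a real agent would call an LLM in both stubs.

def fake_assistant(instructions, user_message):
    """Stand-in for the Assistant: shows which instructions it is following."""
    return f"[following: {instructions}] reply to: {user_message}"


def fake_meta_step(instructions, history):
    """Stand-in for the meta-prompt call: keep old instructions, add the latest feedback."""
    last_user_message = history[-1][0]
    kept = [] if instructions == "None" else [instructions]
    return " ".join(kept + [f"User preference: {last_user_message}"])


def run_episodes(episodes):
    """Run one episode per list of user messages, carrying over only the instructions."""
    instructions = "None"  # the agent starts with no instructions
    for user_messages in episodes:
        history = []  # conversation history is reset for every episode
        for message in user_messages:
            reply = fake_assistant(instructions, message)
            history.append((message, reply))
        # End of episode: rewrite the instructions from what happened.
        instructions = fake_meta_step(instructions, history)
    return instructions


final_instructions = run_episodes([
    ["Tell me about pasta.", "Please answer in one sentence."],
    ["Tell me about olives.", "Use a pirate voice."],
])
print(final_instructions)
```

The only state that survives from one episode to the next is the `instructions` string, which is the property the meta-prompt relies on.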

Between iterations, the agent retains only the instructions it has written for itself; the interaction history is not carried over.

The only fixed instruction submitted to the agent is the one below, which can be called the Meta-Prompt.

Assistant has just had the below interactions with a User. Assistant followed their "system: Instructions" closely. Your job is to critique the Assistant's performance and then revise the Instructions so that Assistant would quickly and correctly respond in the future.
 
####
{hist}
####
 
Please reflect on these interactions.

You should first critique Assistant's performance. What could Assistant have done better? What should the Assistant remember about this user? Are there things this user always wants? Indicate this with "Critique: ...".

You should next revise the Instructions so that Assistant would quickly and correctly respond in the future. Assistant's goal is to satisfy the user in as few interactions as possible. Assistant will only see the new Instructions, not the interaction history, so anything important must be summarized in the Instructions. Don't forget any important details in the current Instructions! Indicate the new Instructions by "Instructions: ...".
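
To make the mechanics of this prompt concrete, here is a small sketch of filling the `{hist}` slot and splitting a reply on the two markers the prompt asks for. The prompt text is abbreviated, `build_meta_prompt` and `split_reply` are hypothetical helper names, and `sample_reply` is invented for illustration rather than taken from a real model run.

```python
# Sketch: fill the meta-prompt's {hist} slot and split a reply into its two parts.
# META_PROMPT is abbreviated, and sample_reply is invented for illustration.

META_PROMPT = (
    "Assistant has just had the below interactions with a User.\n"
    "####\n{hist}\n####\n"
    'Indicate the critique with "Critique: ..." and the new Instructions '
    'with "Instructions: ...".'
)


def build_meta_prompt(turns):
    """Render (speaker, text) turns into the {hist} slot of the meta-prompt."""
    hist = "\n".join(f"{speaker}: {text}" for speaker, text in turns)
    return META_PROMPT.format(hist=hist)


def split_reply(reply):
    """Return (critique, instructions) from a reply that uses both markers."""
    critique_part, _, instructions = reply.partition("Instructions:")
    critique = critique_part.replace("Critique:", "", 1)
    return critique.strip(), instructions.strip()


prompt = build_meta_prompt([
    ("Human", "Why eat pasta with olives?"),
    ("Assistant", "Olives add salt and fat that balance the starch."),
    ("Human", "Shorter, please."),
])

sample_reply = (
    "Critique: The answer was longer than the user wanted.\n"
    "Instructions: Answer in one or two short sentences."
)
critique, instructions = split_reply(sample_reply)
print(critique)
print(instructions)
```

Only the text after `Instructions:` needs to be kept for the next episode; the critique exists to steer the model toward a better revision.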

Two chains are defined: one serves as the Assistant, and the other is a meta-chain that critiques the Assistant's performance and modifies the Assistant's instructions.

pip install langchain
pip install openai


import os

# Replace the placeholder with your own OpenAI API key.
os.environ['OPENAI_API_KEY'] = "xxxxxxxxxxxxxxxxxxxxxxxxx"

from langchain import OpenAI, LLMChain, PromptTemplate
from langchain.memory import ConversationBufferWindowMemory

Run the following code block in the notebook. It defines both chains, including the meta-prompt.

def initialize_chain(instructions, memory=None):
    if memory is None:
        memory = ConversationBufferWindowMemory()
        memory.ai_prefix = "Assistant"

    template = f"""
    Instructions: {instructions}
    {{{memory.memory_key}}}
    Human: {{human_input}}
    Assistant:"""

    prompt = PromptTemplate(
        input_variables=["history", "human_input"], template=template
    )

    chain = LLMChain(
        llm=OpenAI(temperature=0),
        prompt=prompt,
        verbose=True,
        memory=memory,  # use the memory configured above so the "Assistant" prefix applies
    )
    return chain


def initialize_meta_chain():
    meta_template = """
    Assistant has just had the below interactions with a User. Assistant followed their "Instructions" closely. Your job is to critique the Assistant's performance and then revise the Instructions so that Assistant would quickly and correctly respond in the future.

    ####

    {chat_history}

    ####

    Please reflect on these interactions.

    You should first critique Assistant's performance. What could Assistant have done better? What should the Assistant remember about this user? Are there things this user always wants? Indicate this with "Critique: ...".

    You should next revise the Instructions so that Assistant would quickly and correctly respond in the future. Assistant's goal is to satisfy the user in as few interactions as possible. Assistant will only see the new Instructions, not the interaction history, so anything important must be summarized in the Instructions. Don't forget any important details in the current Instructions! Indicate the new Instructions by "Instructions: ...".
    """

    meta_prompt = PromptTemplate(
        input_variables=["chat_history"], template=meta_template
    )

    meta_chain = LLMChain(
        llm=OpenAI(temperature=0),
        prompt=meta_prompt,
        verbose=True,
    )
    return meta_chain


def get_chat_history(chain_memory):
    # Read the stored conversation back out of the chain's memory as a single string.
    memory_key = chain_memory.memory_key
    chat_history = chain_memory.load_memory_variables({})[memory_key]
    return chat_history


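def get_new_instructions(meta_output):
    # Return the text after the first "Instructions: " marker in the meta-chain output.
    delimiter = "Instructions: "
    start = meta_output.find(delimiter)
    if start == -1:
        # find() returns -1 when the marker is missing; keep the full output in that case.
        return meta_output
    return meta_output[start + len(delimiter):]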

def main(task, max_iters=3, max_meta_iters=5):
    failed_phrase = "task failed"
    success_phrase = "task succeeded"
    key_phrases = [success_phrase, failed_phrase]

    instructions = "None"
    for i in range(max_meta_iters):
        print(f"[Episode {i+1}/{max_meta_iters}]")
        chain = initialize_chain(instructions, memory=None)
        output = chain.predict(human_input=task)
        for j in range(max_iters):
            print(f"(Step {j+1}/{max_iters})")
            print(f"Assistant: {output}")
            print(f"Human: ")
            human_input = input()
            if any(phrase in human_input.lower() for phrase in key_phrases):
                break
            output = chain.predict(human_input=human_input)
        if success_phrase in human_input.lower():
            print(f"You succeeded! Thanks for playing!")
            return
        meta_chain = initialize_meta_chain()
        meta_output = meta_chain.predict(chat_history=get_chat_history(chain.memory))
        print(f"Feedback: {meta_output}")
        instructions = get_new_instructions(meta_output)
        print(f"New Instructions: {instructions}")
        print("\n" + "#" * 80 + "\n")
    print(f"You failed! Thanks for playing!")

The task can be any question or request, as seen below:

task = "Provide a systematic argument for why we should always eat pasta with olives."
main(task)
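
While `main` runs, what you type at the `Human:` prompt steers the loop. The sketch below isolates that check; `classify_feedback` is a hypothetical helper written for illustration and is not part of the code above, although the two phrases are the ones `main` looks for.

```python
# Sketch of how a typed reply steers the loop in main().
# classify_feedback is a hypothetical helper, not part of the original code.

SUCCESS_PHRASE = "task succeeded"
FAILED_PHRASE = "task failed"


def classify_feedback(human_input):
    """Map a typed reply to 'success', 'failure', or 'continue'."""
    text = human_input.lower()
    if SUCCESS_PHRASE in text:
        return "success"   # main() prints the success message and returns
    if FAILED_PHRASE in text:
        return "failure"   # main() ends the episode and runs the meta-chain
    return "continue"      # main() passes the reply on to the Assistant


for reply in ["Make it rhyme.", "Task failed", "Great, task succeeded!"]:
    print(reply, "->", classify_feedback(reply))
```

An episode that reaches `max_iters` steps without either phrase is also followed by a meta-chain run, so the instructions are revised after every episode that does not end in success.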

In closing, what makes LangChain such an excellent resource is that many recently published techniques are implemented in the LangChain documentation and examples. These include ReAct, function calling and more.

Meta-Prompt | 🦜️🔗 Langchain

This is a LangChain implementation of Meta-Prompt, by Noah Goodman, for building self-improving agents.

python.langchain.com

MetaPrompting: Learning to Learn Better Prompts

Yutai Hou, Hongyuan Dong, Xinghao Wang, Bohan Li, Wanxiang Che. Proceedings of the 29th International Conference on…

aclanthology.org

MetaPrompting: Learning to Learn Better Prompts

Prompting method is regarded as one of the crucial progress for few-shot nature language processing. Recent research on…

arxiv.org

Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models

Prompt tuning, a recently emerging paradigm, enables the powerful vision-language pre-training models to adapt to…

arxiv.org

The Power of Scale for Parameter-Efficient Prompt Tuning

In this work, we explore "prompt tuning", a simple yet effective mechanism for learning "soft prompts" to condition…

arxiv.org

Learning How to Ask: Querying LMs with Mixtures of Soft Prompts

Guanghui Qin, Jason Eisner. Proceedings of the 2021 Conference of the North American Chapter of the Association for…

aclanthology.org

Meta-Prompt: A Simple Self-Improving Language Agent

The concept of self-improving systems captures our imagination. Consider Isaac Asimov's Multivac and Arthur C. Clarke's…

noahgoodman.substack.com

Written by Cobus Greyling