Agent Networks
LLM Agents rely on natural language for input and output & creating a network of agents is becoming increasingly feasible.
A while back I read an interesting article by Boost AI on how they created a virtual agent network in partnership with the Finnish government.
This is the concept of creating super agents or meta agents which act as aggregators of bot services.
With the advent of Google Home and Amazon Alexa, these devices were seen as possible meta bots becoming a single point of user contact. And the idea was that in turn these meta bots will call on skills, actions, etc. to fulfil the user intent.
As we all know by now, the anticipated ascent of smart speakers largely failed.
For me, this brings to mind the demise of VoiceXML which was touted as a standard to voice-enable websites and create a universal speech interface leveraging the World Wide Web.
Something the internet has taught us is that networks work best if there are open protocols in place with a decentralised approach.
⭐️ Please follow me on LinkedIn for updates on LLMs ⭐️
Agents make use of natural language for input and output.
The natural language input can be convoluted, ambiguous and cryptic, yet the LLM based Agent has the ability to decompose the question into a chain-of-thought and answer the question in a piecemeal fashion.
Added to this, the Agents have a very natural and conversational style output of data; as seen below in the output of a LangChain based Agent:
Natural Langauge Generation is a Meta Capability Of Large Language Models & Prompt Engineering is the key component to accessing LLMs.
So How Would A LLM Agent Network Look?
Firstly, there are a few considerations when making an LLM Agent available, and especially publicly.
Below are listed a few considerations…
- There is always the danger of malicious use and nefarious attacks on the agent. Especially in the form of prompt injection attacks.
ChatML gives an opportunity to mitigate and eventually solve injections, as the model can tell which instructions come from the developer, the user, or its own input.
- One or more LLMs create the foundation of Agents. Considerations with LLMs are latency, regional availability, cost and data privacy considerations.
- Cost in general. An agent is underpinned by tools the agent orchestrates to reach a conclusion. Most of these APIs come at a cost, and if an agent potentially leverages multiple agents to reach a conclusion, the cost can be vast.
- A prudent approach would be to create a network of agents first within an organisation or enterprise.
- It would also make sense to start with a prompt pipeline approach which leverages an open-source stack like Haystack and a document store.
Semantic Search As A Meta Agent Enabler
A controlled and measured point of departure to creating meta or an Agent Network is scaling on the Tools level.
A meta agent can be created which have access to a host of Tools. Each of these Tools can have a description associated with it. Based on Semantic Search, the most appropriate tool or tools can be selected to service the user request.
The tools are accessed dynamically at run time and the required number of tools are selected to cycle through.
Even though an Agent can access other agents, it seems more contained and manageable to rather scale on a Tools level; to start with at least.
In the notebook example below, there is one applicable tool and 10 other dummy tools. There is a step in the prompt template that takes the user input and retrieve tools relevant to the query.
This working code example makes use of a vector store to create embeddings for each tool description.
For an incoming query embeddings are created for the query and the Agent does a similarity search for relevant tools.
pip install langchain
pip install google-search-results
pip install openai
pip install tiktoken
pip install faiss-cpu
from langchain.agents import Tool, AgentExecutor, LLMSingleActionAgent, AgentOutputParser
from langchain.prompts import StringPromptTemplate
from langchain import OpenAI, SerpAPIWrapper, LLMChain
from langchain.chat_models import ChatOpenAI
from typing import List, Union
from langchain.schema import AgentAction, AgentFinish
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
import re
llm = ChatOpenAI(temperature=0.0)
math_llm = OpenAI(temperature=0.0)
# Define which tools the agent can use to answer user queries
search = SerpAPIWrapper()
tools = [
Tool(
name = "Search",
func=search.run,
description="useful for when you need to answer questions about current events"
)
]
import os
import openai
os.environ['OPENAI_API_KEY'] = str("xxxxxxxxx")
os.environ["SERPAPI_API_KEY"] = str("xxxxxxxxx")
llm = OpenAI(temperature=0,model_name='gpt-4-0314')
# Define which tools the agent can use to answer user queries
search = SerpAPIWrapper()
search_tool = Tool(
name = "Search",
func=search.run,
description="useful for when you need to answer questions about current events"
)
def fake_func(inp: str) -> str:
return "foo"
fake_tools = [
Tool(
name=f"foo-{i}",
func=fake_func,
description=f"a silly function that you can use to get more information about the number {i}"
)
for i in range(10)
]
ALL_TOOLS = [search_tool] + fake_tools
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.schema import Document
docs = [Document(page_content=t.description, metadata={"index": i}) for i, t in enumerate(ALL_TOOLS)]
vector_store = FAISS.from_documents(docs, OpenAIEmbeddings())
retriever = vector_store.as_retriever()
def get_tools(query):
docs = retriever.get_relevant_documents(query)
return [ALL_TOOLS[d.metadata["index"]] for d in docs]
get_tools("whats the weather?")
get_tools("whats the number 13?")
Set up the base template
template = """Answer the following questions as best you can, but speaking as a politian might speak. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin! Remember to speak as a pirate when giving your final answer. Use lots of "Arg"s
Question: {input}
{agent_scratchpad}"""
from typing import Callable
# Set up a prompt template
class CustomPromptTemplate(StringPromptTemplate):
# The template to use
template: str
############## NEW ######################
# The list of tools available
tools_getter: Callable
def format(self, **kwargs) -> str:
# Get the intermediate steps (AgentAction, Observation tuples)
# Format them in a particular way
intermediate_steps = kwargs.pop("intermediate_steps")
thoughts = ""
for action, observation in intermediate_steps:
thoughts += action.log
thoughts += f"\nObservation: {observation}\nThought: "
# Set the agent_scratchpad variable to that value
kwargs["agent_scratchpad"] = thoughts
############## NEW ######################
tools = self.tools_getter(kwargs["input"])
# Create a tools variable from the list of tools provided
kwargs["tools"] = "\n".join([f"{tool.name}: {tool.description}" for tool in tools])
# Create a list of tool names for the tools provided
kwargs["tool_names"] = ", ".join([tool.name for tool in tools])
return self.template.format(**kwargs)
prompt = CustomPromptTemplate(
template=template,
tools_getter=get_tools,
# This omits the `agent_scratchpad`, `tools`, and `tool_names` variables because those are generated dynamically
# This includes the `intermediate_steps` variable because that is needed
input_variables=["input", "intermediate_steps"]
)
class CustomOutputParser(AgentOutputParser):
def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:
# Check if agent should finish
if "Final Answer:" in llm_output:
return AgentFinish(
# Return values is generally always a dictionary with a single `output` key
# It is not recommended to try anything else at the moment :)
return_values={"output": llm_output.split("Final Answer:")[-1].strip()},
log=llm_output,
)
# Parse out the action and action input
regex = r"Action\s*\d*\s*:(.*?)\nAction\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)"
match = re.search(regex, llm_output, re.DOTALL)
if not match:
raise ValueError(f"Could not parse LLM output: `{llm_output}`")
action = match.group(1).strip()
action_input = match.group(2)
# Return the action and action input
return AgentAction(tool=action, tool_input=action_input.strip(" ").strip('"'), log=llm_output)
output_parser = CustomOutputParser()
llm = OpenAI(temperature=0)
# LLM chain consisting of the LLM and a prompt
llm_chain = LLMChain(llm=llm, prompt=prompt)
tool_names = [tool.name for tool in tools]
agent = LLMSingleActionAgent(
llm_chain=llm_chain,
output_parser=output_parser,
stop=["\nObservation:"],
allowed_tools=tool_names
)
agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True)
agent_executor.run("What's the weather in SF?")
And the output below:
> Entering new AgentExecutor chain...
Thought: I should find out what the current weather is in SF.
Action: Search
Action Input: Weather in SF
Observation:Partly cloudy in the evening. Increasing clouds with periods of showers after midnight. Low 49F. Winds SW at 10 to 20 mph. Chance of rain 50%.
I now know the final answer.
Final Answer: Arg, 'tis partly cloudy with a chance of showers after midnight. Winds be blowin' at 10 to 20 mph.
> Finished chain.
Arg, 'tis partly cloudy with a chance of showers after midnight. Winds be blowin' at 10 to 20 mph.
In Closing
The future production implementation methods of LLMs are taking shape and pro-code frameworks like LangChain Agents are sure to be followed by an increasing number of graphic development interfaces.
The focus on prompt engineering is immense, but as I have mentioned in numerous previous posts, templating will become a default implementation avenue. With Prompt Chaining and Agents taking centre stage in terms of LLM Applications.
⭐️ Please follow me on LinkedIn for updates on Conversational AI ⭐️
I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more.