Anthropic’s Research on GenAI Building Blocks
Anthropic’s Latest Research On Building Effective AI Agents
Introduction
Anthropic recently shared insights from their experience helping customers build AI Agents, offering practical advice for creating effective agents.
Their piece serves as a sobering reminder amidst the current hype.
Their key observation is that the most successful implementations often avoid complex frameworks or specialised libraries, instead favouring simple, composable patterns.
When building with LLMs, they suggest starting with the simplest solution and increasing complexity only when necessary.
Sometimes, this means not building agentic systems at all, as these systems often trade latency and cost for better task performance, a tradeoff that should be carefully considered.
Over the past year, we’ve worked with dozens of teams building large language model (LLM) agents across industries. Consistently, the most successful implementations weren’t using complex frameworks or specialised libraries. Instead, they were building with simple, composable patterns. ~ Anthropic
Anthropic highlights that workflows provide predictability and consistency for well-defined tasks, while agents excel when flexibility and model-driven decision-making at scale are needed.
For many applications, optimising single LLM calls with retrieval and in-context examples suffices.
They caution that if frameworks are used, a deep understanding of the underlying code is crucial to avoid errors due to incorrect assumptions.
Agents vs Workflows
The term Agent can be interpreted in multiple ways.
Some view AI Agents as fully autonomous systems that operate independently for extended periods, utilising various tools to handle complex tasks.
Others define AI Agents as more prescriptive implementations that adhere to predefined workflows.
Anthropic groups these variations under the umbrella of Agentic Systems, but makes a key architectural distinction between workflows and agents:
Workflows involve systems where LLMs and tools are orchestrated through predefined code paths.
Agents, by contrast, are systems where LLMs dynamically manage their own processes and tool usage, maintaining control over how tasks are accomplished.
If you do use a framework, ensure you understand the underlying code. Incorrect assumptions about what’s under the hood are a common source of customer error. ~ Anthropic
Suggested Approach
Simplify First
Start with the simplest solution when building applications with LLMs.
Avoid Unnecessary Complexity
Increase complexity only when necessary; often, Agentic systems may not be required.
Agentic Systems Trade-offs
Agentic systems enhance task performance but can increase latency and cost.
Workflows or Agents?
Workflows: Suitable for predictability and consistency in well-defined tasks.
Agents: Ideal for flexibility and model-driven decision-making at scale.
Optimisation Strategy
In many cases, optimising single LLM calls with retrieval and in-context examples is sufficient.
Building Blocks, Workflows & Agents
Augmented LLM
The core of agentic systems is an LLM enhanced with retrieval, tools and memory. Current models can generate search queries, choose tools, and decide what information to retain.
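A minimal sketch of the idea under a few assumptions of mine: `llm` stands in for any chat-completion call, `retrieve` for a vector-store lookup, and the plain-text "TOOL:" convention is purely illustrative; none of these names come from Anthropic's piece.

```python
from typing import Callable

def augmented_llm(
    llm: Callable[[str], str],               # any chat-completion call: prompt in, text out
    retrieve: Callable[[str], list[str]],    # e.g. a vector-store similarity search
    tools: dict[str, Callable[[str], str]],  # tool name -> tool function
    memory: list[str],                       # running conversation / scratchpad
    user_input: str,
) -> str:
    # 1. Retrieval: ground the model with relevant context.
    context = "\n".join(retrieve(user_input))

    # 2. Memory: carry forward what was previously said or decided.
    history = "\n".join(memory)

    # 3. A single augmented call; the model may name a tool to invoke.
    prompt = (
        f"Context:\n{context}\n\nHistory:\n{history}\n\n"
        f"Available tools: {', '.join(tools)}\n"
        f"User: {user_input}\n"
        "If a tool is needed, answer exactly 'TOOL:<name>:<input>'."
    )
    answer = llm(prompt)

    # 4. Tool use: execute the requested tool and feed the result back.
    if answer.startswith("TOOL:"):
        _, name, tool_input = answer.split(":", 2)
        answer = llm(prompt + f"\nTool result: {tools[name](tool_input)}")

    memory.append(f"user: {user_input}\nassistant: {answer}")
    return answer
```

In production the plain-text protocol would be replaced by a provider's structured tool-calling API, but the three augmentations (retrieval, tools, memory) combine in exactly this way.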
Prompt Chaining
In prompt chaining, a builder decomposes a task into a fixed sequence of steps or sub-tasks, where each LLM call processes the output of the previous one.
Builders can add intermediate steps and checks to ensure that the process is still on track.
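A small sketch of such a chain with a programmatic "gate" between steps; the `llm` callable and the outline/summary steps are illustrative assumptions, not Anthropic's code.

```python
from typing import Callable

def prompt_chain(llm: Callable[[str], str], document: str) -> str:
    # Step 1: the first LLM call produces an intermediate artefact (an outline).
    outline = llm(f"Write a short outline for a summary of:\n{document}")

    # Gate: a programmatic check between steps keeps the chain on track.
    if len(outline.splitlines()) < 3:
        raise ValueError("Outline too thin; stop or retry before continuing.")

    # Step 2: the next call consumes the previous call's output.
    summary = llm(f"Using this outline:\n{outline}\n\nSummarise:\n{document}")

    # Step 3: a final call refines the result.
    return llm(f"Polish this summary for clarity and tone:\n{summary}")
```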
Routing Workflows
Routing classifies input and directs it to specialised tasks, enabling separation of concerns and more tailored prompts.
Without routing, optimising for one input type may degrade performance on others.
Routing is ideal for complex tasks with distinct categories, where the classification step can be handled accurately by an LLM or by a traditional classification model.
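A possible shape for a router, assuming a hypothetical `llm` callable and made-up route categories; in practice the classifier could just as well be a traditional model, and each route could use a different prompt, model or tool set.

```python
from typing import Callable

# Hypothetical specialised prompts; each route gets a prompt tuned to its category.
ROUTES = {
    "billing": "You are a billing specialist. Resolve this billing question:\n",
    "technical": "You are a support engineer. Diagnose this technical issue:\n",
    "general": "You are a friendly assistant. Answer this general question:\n",
}

def route(llm: Callable[[str], str], user_input: str) -> str:
    # 1. Classify the input (an LLM here, but a traditional classifier also works).
    category = llm(
        "Classify the request as one of: billing, technical, general. "
        f"Reply with the label only.\n\nRequest: {user_input}"
    ).strip().lower()
    if category not in ROUTES:
        category = "general"  # fall back rather than fail

    # 2. Direct the input to the specialised downstream prompt.
    return llm(ROUTES[category] + user_input)
```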
Parallelisation
In certain instances, LLMs can work simultaneously on a task and have their outputs aggregated programmatically.
Parallelisation is effective when subtasks can run independently for speed, or when multiple perspectives or attempts are needed for higher-confidence results.
For complex tasks with multiple considerations, LLMs generally perform better when each consideration is handled by a separate LLM call, allowing focused attention on each specific aspect.
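One way this could look in code, assuming a hypothetical `llm` callable; the code-review aspects are invented for illustration, and the aggregation step could equally be a final LLM call rather than string concatenation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def parallel_review(llm: Callable[[str], str], code: str) -> str:
    # Sectioning: each consideration gets its own focused LLM call.
    aspects = ["security vulnerabilities", "performance issues", "readability problems"]
    prompts = [f"Review this code only for {aspect}:\n{code}" for aspect in aspects]

    # Run the independent calls concurrently for speed.
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(llm, prompts))

    # Aggregate the outputs programmatically.
    return "\n\n".join(f"{aspect}:\n{finding}" for aspect, finding in zip(aspects, findings))
```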
Orchestrator-Workers
In the orchestrator-workers workflow, a central LLM dynamically breaks down tasks, delegates them to worker LLMs and synthesizes their results.
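A rough sketch of the pattern, assuming a hypothetical `llm` callable and a JSON planning convention of my own; the key difference from parallelisation is that the subtasks are not predefined but chosen by the orchestrator at run time.

```python
import json
from typing import Callable

def orchestrate(llm: Callable[[str], str], task: str) -> str:
    # 1. The orchestrator decides, at run time, how to split the task.
    plan = llm(
        "Break the task below into independent subtasks. "
        'Reply as a JSON list of strings, e.g. ["subtask 1", "subtask 2"].\n'
        f"Task: {task}"
    )
    subtasks = json.loads(plan)  # production code would validate this output

    # 2. Worker LLM calls handle each subtask (these could also run in parallel).
    results = [llm(f"Complete this subtask:\n{subtask}") for subtask in subtasks]

    # 3. The orchestrator synthesises the workers' outputs into one answer.
    combined = "\n\n".join(results)
    return llm(f"Combine these partial results into a single answer for '{task}':\n{combined}")
```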
Evaluator-Optimiser
One LLM generates a response, while another evaluates and provides feedback in a loop.
This workflow is best for tasks with clear evaluation criteria, where iterative refinement adds measurable value and the LLM can provide feedback comparable to a human reviewer's, much like a writer refining a draft.
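A compact sketch of the loop, assuming a hypothetical `llm` callable and an "ACCEPT" convention that is purely illustrative.

```python
from typing import Callable

def evaluate_and_optimise(llm: Callable[[str], str], task: str, max_rounds: int = 3) -> str:
    draft = llm(f"Write a first draft for:\n{task}")
    for _ in range(max_rounds):
        # One LLM call evaluates the draft against the task's criteria...
        feedback = llm(
            "Critique the draft below for accuracy, clarity and completeness. "
            f"Reply 'ACCEPT' if no further changes are needed.\n\nTask: {task}\n\nDraft:\n{draft}"
        )
        if feedback.strip().upper().startswith("ACCEPT"):
            break
        # ...and the generator revises the draft using that feedback.
        draft = llm(
            f"Task: {task}\n\nDraft:\n{draft}\n\nFeedback:\n{feedback}\n\nRewrite the draft."
        )
    return draft
```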
AI Agents
As LLMs advance in understanding, reasoning, planning, and tool use, AI Agents are becoming more common in production.
AI Agents begin with a command from, or an interactive discussion with, a user, then plan and operate independently, returning to the user as needed.
They rely on “ground truth” (e.g., tool results or code execution) at each step to assess progress and may pause for human feedback at checkpoints or when blocked. Tasks end upon completion or when stopping conditions, like iteration limits, are met.
AI Agents are simply LLMs using tools in a feedback loop. Clear tool design and documentation are crucial.
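A bare-bones sketch of such a loop, assuming a hypothetical `llm` callable, a tool registry, and plain-text "TOOL:"/"DONE:" conventions invented for illustration; real agents would use structured tool-calling APIs, but the control flow is the same: act, observe the environment's ground truth, decide, and stop on completion or an iteration limit.

```python
from typing import Callable

def agent_loop(
    llm: Callable[[str], str],
    tools: dict[str, Callable[[str], str]],  # tool results act as "ground truth"
    goal: str,
    max_steps: int = 10,
) -> str:
    transcript = f"Goal: {goal}\nAvailable tools: {', '.join(tools)}"
    for _ in range(max_steps):  # stopping condition: iteration limit
        decision = llm(
            transcript + "\nEither reply 'TOOL:<name>:<input>' to act, "
            "or 'DONE:<final answer>' when the goal is met."
        )
        if decision.startswith("DONE:"):
            return decision[len("DONE:"):].strip()
        if decision.startswith("TOOL:"):
            _, name, tool_input = decision.split(":", 2)
            # Environment feedback from the tool call guides the next step.
            observation = tools[name](tool_input)
            transcript += f"\nAction: {name}({tool_input})\nObservation: {observation}"
        else:
            transcript += f"\nNote: {decision}"
    return "Stopped: iteration limit reached without completing the goal."
```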
In Conclusion
Anthropic’s research offers a refreshing and rational perspective, emphasising practical, use-case-focused solutions that are as simple as possible, with explainability, observability and inspectability.
It warns against adopting frameworks without understanding their inner workings, as this can lead to unexpected behaviour.
While AI agents are useful, they are not always the best solution; workflows can often be more appropriate.
Companies may push specific frameworks to sell their stacks without keeping best practice in mind, or compromise on the most optimised and appropriate route to reaching business objectives.