(Left) Traditional agents view the environment as partially observable, using complex tools and LM pipelines to gather information for tasks. (Right) The study uses advanced long-context LMs for state-in-context agents, removing complex scaffolding. These agents keep the full environment state in the LM’s context, turning open-ended tasks into straightforward ones that LMs handle well.

Conversational UIs Demand Context Like Never Before

A recent study introduces state-in-context AI Agents, which harness the expansive context windows of large language models (LLMs) to streamline software engineering.

--

In Brief

State-in-context AI Agents use LLMs to simplify software engineering tasks, excelling in vertical, task-specific applications like code repository interactions.

They are less effective for agents that must navigate multiple sources, web interfaces, or operating systems.

The study highlights a key trade-off: offloading functionality to LLMs increases dependency on their behaviour, while scaffolding offers granular control.

Per Anthropic, the most effective implementations favour simple, composable patterns over complex frameworks or specialised libraries.

A Lean Approach to Large Codebases

State-in-context AI Agents tackle sprawling code repositories efficiently:

  1. They compress codebases by ranking files for relevance.
  2. A large-context LLM (LCLM) processes key files to draft solutions.
  3. A short-context LLM (SCLM) refines the output into polished, usable code.

This high-level method — reading select files and generating solutions — integrates seamlessly with tools like Git, avoiding tedious line-by-line edits.
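The three-stage pipeline above can be sketched in a few lines of Python. The function names and the keyword-overlap ranking are illustrative stand-ins, not the study's actual implementation; `call_lclm` and `call_sclm` are stubbed so the control flow is runnable without a model API.

```python
def rank_files(repo: dict[str, str], issue: str) -> list[str]:
    """Rank files by naive keyword overlap with the issue text
    (a stand-in for the study's relevance-based compression)."""
    issue_terms = set(issue.lower().split())
    def score(item: tuple[str, str]) -> int:
        _, text = item
        return len(issue_terms & set(text.lower().split()))
    return [path for path, _ in sorted(repo.items(), key=score, reverse=True)]

def compress(repo: dict[str, str], issue: str, budget_files: int = 2) -> dict[str, str]:
    # Keep only the most relevant files within the context budget.
    keep = rank_files(repo, issue)[:budget_files]
    return {p: repo[p] for p in keep}

def call_lclm(context: dict[str, str], issue: str) -> str:
    # Stub: a real system would send the compressed repo to a long-context model.
    return f"draft patch for {issue!r} using {sorted(context)}"

def call_sclm(draft: str) -> str:
    # Stub: a real system would ask a short-context model to refine the draft.
    return draft.replace("draft", "final")

def solve(repo: dict[str, str], issue: str) -> str:
    # Compress -> LCLM drafts -> SCLM refines.
    return call_sclm(call_lclm(compress(repo, issue), issue))
```

The orchestration is deliberately thin: each stage is a plain function, so any step can be swapped or tested in isolation.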

The study outlines how state-in-context agents are applied to software engineering. When a code repository exceeds the context limit of large-context language models (LCLMs), a compression step ranks files by relevance and includes only the most relevant ones within the limit. The study implements state-in-context agents in two ways: in DIRECTSOLVE, an LCLM processes the compressed repository and generates a solution directly; in the second, the LCLM’s draft is passed to a short-context language model (SCLM) to produce the final output, taking advantage of SCLMs’ strong problem-solving skills.

Workflows: Predefined code paths orchestrate LLMs and tools for predictable outcomes.

Agents: LLMs dynamically control processes and tool usage, offering flexibility but less predictability.
~ Anthropic
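The distinction comes down to who controls the flow: the code (workflow) or the model (agent). A minimal sketch, with `llm` stubbed as a hypothetical model call that returns the next tool name:

```python
def llm(transcript: str) -> str:
    # Stub: returns the next tool to use, or "done" when finished.
    # A real system would call a model API here.
    return "done" if "results" in transcript else "search"

TOOLS = {"search": lambda: "results"}

def workflow(task: str) -> str:
    # Workflow: the code fixes the path; predictable by construction.
    return f"{task}: {TOOLS['search']()}"

def agent(task: str, max_steps: int = 5) -> str:
    # Agent: the model picks each step; flexible but less predictable,
    # so a step cap bounds the loop.
    transcript = task
    for _ in range(max_steps):
        action = llm(transcript)
        if action == "done":
            break
        transcript += " " + TOOLS[action]()
    return transcript
```

The workflow's output is knowable from the code alone; the agent's depends on what the model decides at each turn.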

Practical Takeaways

Start lean

Use LLM APIs directly for tasks like file ranking or solution generation. A few lines of code can suffice, avoiding bloated frameworks.
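As a sketch of how little code direct API use requires: the function below ranks files with one model call. `client` is any callable mapping a prompt to a completion (stubbed here for illustration; in practice it would wrap a real chat-completion API).

```python
def rank_files_with_llm(client, issue: str, paths: list[str]) -> list[str]:
    """Ask the model to order file paths by relevance to an issue."""
    prompt = (
        "Order these file paths from most to least relevant to the issue.\n"
        f"Issue: {issue}\nFiles: {', '.join(paths)}\n"
        "Answer with a comma-separated list."
    )
    reply = client(prompt)
    # Keep only paths the model actually returned, in its order...
    ranked = [p.strip() for p in reply.split(",") if p.strip() in paths]
    # ...then fall back to the original order for anything it dropped.
    return ranked + [p for p in paths if p not in ranked]

# Stub client standing in for a real API call:
fake_client = lambda prompt: "auth.py, ui.py"
```

Note the defensive parsing: model output is filtered against the known path list, so a malformed reply degrades gracefully instead of crashing.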

Business benefits

Faster prototyping, lower maintenance costs.

Framework caution

If frameworks are used, understand their internals to avoid errors from hidden assumptions, as Anthropic advises.

Balancing LLM Reliance & Flexibility

While AI Agents lean on LLMs for code comprehension and problem-solving, over-dependence risks errors if models falter or become outdated.

Anthropic’s success with minimal scaffolding shows LLMs can handle complex tasks, but the LCLM-SCLM two-stage approach adds adaptability.

To stay LLM-agnostic, standardise inputs (compressed codebases) and outputs (code formats) so that models can be swapped without reworking the pipeline.
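One way to realise this is to fix the input/output contract behind a small interface, so each model is a drop-in replacement. The class and method names below are illustrative, not from the study:

```python
from typing import Protocol

class Solver(Protocol):
    """Fixed contract: compressed repo + issue in, patch text out."""
    def solve(self, compressed_repo: dict[str, str], issue: str) -> str: ...

class ExpensiveLCLM:
    def solve(self, compressed_repo: dict[str, str], issue: str) -> str:
        # Stand-in for a pricey long-context model call.
        return f"patch({issue})"

class CheapLCLM:
    def solve(self, compressed_repo: dict[str, str], issue: str) -> str:
        # Same contract, so it drops in without pipeline changes.
        return f"patch({issue})"

def run(solver: Solver, repo: dict[str, str], issue: str) -> str:
    # Callers depend only on the contract, never on a vendor SDK.
    return solver.solve(repo, issue)
```

Because `run` sees only the `Solver` contract, swapping the pricey model for the cheap one is a one-line change at the call site.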

Practical tip

Pair LLM outputs with validation tools or tests to catch errors, reducing reliance on model quirks.
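A cheap first validation gate, as one hypothetical example: check that generated Python at least parses before accepting it (a stand-in for running a real test suite downstream).

```python
import ast

def validate_patch(code: str) -> bool:
    """Return True only if the generated code is syntactically valid Python."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False
```

Gates like this catch a whole class of model slips deterministically, before any expensive human or CI review.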

This enables cost savings and future-proofs workflows by allowing swaps, e.g. from a pricey LCLM to a cheaper alternative.

Where Complexity Lives

The study shifts complexity to LLMs, keeping scaffolding light with basic compression and workflows.

This taps into LLMs’ code-comprehension strengths, but Anthropic warns against over-engineered frameworks.

Simple workflows (predefined paths) are favoured over dynamic agents, as the latter’s exploration of environments like codebases can be hard to control.

Key insight: Complexity is inevitable — place it wisely to balance simplicity, cost, efficiency, and speed.

This approach aligns with Anthropic’s push for composable, adaptable systems that evolve with AI advancements.

Chief Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.

Written by Cobus Greyling

I’m passionate about exploring the intersection of AI & language. www.cobusgreyling.com
