Fully Autonomous AI Agents Should Not Be Developed
Research from HuggingFace…including a newly coined definition of what an AI Agent is.
Once in a while a piece of research comes along that is such a breath of fresh air and restores some sense of sanity. This paper from HuggingFace is one such piece.
In general, commentary from HuggingFace is well balanced and a voice of reason amidst all of the hype. This research on AI Agents breaks the concept down to its basics, coins a definition for AI Agents, and considers agency as opposed to autonomy.
Introduction
This research from HuggingFace considers the dangers of AI Agents…and in all honesty, I could find only a small number of studies on the dangers and risks of AI Agents.
One area that is especially lacking in consideration is nefarious attacks on AI Agents; AI Agents navigating the web, in particular, are susceptible to attacks via pop-ups, misleading links, and the like.
There has also been a fundamental shift towards systems that are capable of creating context-specific plans in non-deterministic environments.
The tail end of 2024 saw “AI Agents”, autonomous goal-directed systems, begin to be marketed and deployed as the next big advancement in AI technology.
What Is An AI Agent?
Understanding the potential benefits and risks of AI Agents requires clarity on what an AI Agent is, but definitions vary widely.
In AI, the term agent can refer to anything from simple prompt-and-response systems to complex, multi-step customer support tools.
To better grasp what an AI Agent is, the study reviewed existing AI Agents, platforms, and the historical literature on their potential, with a focus on what is central (the person, system, or workflow), the clarity of language, and the types of systems described (e.g., distinguishing autonomous from automatic systems).
Definition of an AI Agent:
Computer software systems capable of creating context-specific plans in non-deterministic environments.
AI Agents can act with some level of autonomy. Given a goal, they can decompose it into subtasks and execute each of them without direct human intervention, as sketched below.
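To make this concrete, here is a minimal sketch of such a decompose-and-execute loop. The `plan` and `execute` functions are illustrative stubs of my own, not code from the paper; in a real agent they would be backed by a language model and tool calls.

```python
# Minimal sketch of the decompose-and-execute pattern described above.
# `plan` and `execute` are stand-in stubs; a real agent would back them
# with a language model and tools (browser, code interpreter, APIs).

def plan(goal: str) -> list[str]:
    # Stub: a model would produce context-specific subtasks here.
    return [f"gather sources for: {goal}",
            f"draft an answer for: {goal}",
            f"review the draft for: {goal}"]

def execute(subtask: str) -> str:
    # Stub: a model plus its tools would carry the subtask out here.
    return f"completed '{subtask}'"

def run_agent(goal: str) -> list[str]:
    # Given a goal, decompose it and execute every subtask
    # without direct human intervention.
    return [execute(subtask) for subtask in plan(goal)]

for result in run_agent("summarise the paper"):
    print(result)
```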
AI Agents are defined as systems using machine-learned models, and they can have different levels of agency.
They can also be combined in multi-agent systems, where one AI Agent workflow triggers another, or multiple agents work collectively toward a goal.
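As a rough illustration of those levels (my own sketch, not the paper's exact taxonomy), notice how each step hands more control from human-written code to model output:

```python
# Rough illustration of increasing agency: at each level, more control
# shifts from human-written code to the model. `llm` is a hypothetical
# stand-in for any language-model call, stubbed here so the sketch runs.

def llm(prompt: str) -> str:
    return "a"  # imagine a real model call here

def path_a(): return "took path A"
def path_b(): return "took path B"

def level_0(user_input):  # no agency: model output is simply displayed
    print(llm(user_input))

def level_1(user_input):  # router: the model picks between human-written branches
    return path_a() if llm(user_input) == "a" else path_b()

def level_2(user_input):  # tool use: the model chooses which function runs
    tools = {"a": path_a, "b": path_b}
    return tools[llm(user_input)]()

def level_3(user_input):  # multi-step agent: the model controls the loop itself
    steps = []
    while llm("continue?") == "a" and len(steps) < 3:  # hard cap added for safety
        steps.append(level_2(user_input))
    return steps
```

The higher the level, the fewer decisions remain in human-written control flow, which is exactly where the paper's risk argument begins.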
Agency
The philosophical basis for attributing agency to AI remains uncertain, raising questions about whether AI can truly possess it.
This has two key implications:
- The risks associated with higher levels of agency are not counterbalanced by a strong philosophical justification for its benefits.
- Instead of debating whether AI possesses true agency — a concept rooted in philosophy and often associated with intentionality, free will, and moral responsibility — it may be more practical to assess AI in terms of autonomy.
Autonomy refers to the ability of an AI system to operate independently within predefined constraints, make decisions based on its programming and learning processes, and adapt to new inputs without direct human intervention.
Looking Forward
This suggests several critical directions for the development of AI Agents:
Adoption of Agent Levels
There should be widespread adoption of clear distinctions between levels of agent autonomy. This would enable developers and users to better understand system capabilities and the associated risks.
Human Control Mechanisms
It is essential to develop robust frameworks, both technical and policy-based, that ensure meaningful human oversight while preserving beneficial semi-autonomous functionality. This includes creating reliable override systems and establishing clear operational boundaries for AI Agents.
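One possible shape for such an override system is sketched below; the action names and approval channel are my own hypothetical choices, not a mechanism from the paper. High-stakes actions are routed past a human before they run, while routine ones proceed.

```python
# Illustrative sketch of a human override gate (not a framework from the
# paper): the agent proposes actions, but anything high-stakes requires
# explicit human approval before it executes. Action names are hypothetical.

HIGH_STAKES = {"send_email", "transfer_funds", "delete_records"}

def human_approves(action: str) -> bool:
    # Stub for a real oversight channel (review queue, UI prompt, pager).
    return input(f"Agent wants to '{action}'. Allow? [y/N] ").strip().lower() == "y"

def guarded_execute(action: str, run) -> str:
    if action in HIGH_STAKES and not human_approves(action):
        return f"blocked by human override: {action}"
    return run()

# A routine action runs immediately; a high-stakes one waits for a human.
print(guarded_execute("fetch_weather", lambda: "sunny"))
print(guarded_execute("transfer_funds", lambda: "funds sent"))
```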
Safety Verification
New methods must be created to verify that AI Agents remain within intended operating parameters and cannot override human-specified constraints.
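A tiny sketch of one way this could look, under my own assumptions rather than anything prescribed in the paper: keep the constraints in a read-only structure outside the agent's reach and re-verify every proposed action against them.

```python
# Sketch of externally enforced constraints (an illustrative assumption,
# not the paper's method): limits live outside the agent's control and
# every proposed action is re-checked against them before it runs.

from types import MappingProxyType

# The agent only ever sees this read-only view; it cannot rewrite the limits.
LIMITS = MappingProxyType({
    "max_spend_usd": 100,
    "allowed_domains": ("example.com",),
})

def within_parameters(action: dict) -> bool:
    return (action.get("spend_usd", 0) <= LIMITS["max_spend_usd"]
            and action.get("domain") in LIMITS["allowed_domains"])

# A compliant action passes; one exceeding the spend limit is refused.
assert within_parameters({"spend_usd": 20, "domain": "example.com"})
assert not within_parameters({"spend_usd": 500, "domain": "example.com"})
```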
The development of AI Agents represents a critical inflection point in artificial intelligence. History has shown that even well-engineered autonomous systems can make catastrophic errors due to trivial causes.
While increased autonomy can provide significant benefits in specific contexts, human judgment and contextual understanding remain indispensable, particularly for high-stakes decisions. Access to the environments in which an AI Agent operates is crucial, empowering humans to intervene and say “no” when a system’s autonomy diverges from human values and goals.