AI Agents With Planning & Execution Supervision
Balancing Autonomy & Control in a Rapidly Evolving Agentic Landscape
AI Agent Technology Market Segmentation
First, consider the current AI Agent market segmentation and how three distinct market segments are emerging…
Here’s a breakdown of key AI Agent technology approaches and how to address the market.
1. OS Providers
Operating System (OS) providers, such as Microsoft or Apple, can integrate AI Agent capabilities directly and natively into their platforms.
This integration allows for seamless and optimised performance within native environments.
Because these AI Agents are embedded at the OS level, they benefit from system compatibility and enhanced security.
This approach ensures smooth interaction with the OS's features and applications, and it has the added advantage of being able to leverage the existing user context on the device or PC.
2. Traditional Cloud Providers
Cloud service providers like AWS, Azure or Google Cloud offer AI Agent solutions that can be deployed within existing enterprise cloud infrastructure.
Enterprises adopting these solutions often do so to leverage existing vendor relationships (accepting a degree of lock-in) and to achieve cost efficiency.
While these cloud-based AI Agents might not always be the most innovative or specialised, they provide convenient scalability and integration with existing enterprise software already in use.
This approach suits businesses seeking streamlined deployment over best-in-class solutions.
3. Third-Party Specialised Integration
Specialised third-party providers focus on delivering best-in-class AI Agent solutions tailored to specific industries or use cases.
These solutions often prioritise innovation and customisation, providing advanced features and flexibility that general-purpose offerings may lack.
However, integrating these specialised AI Agents with existing enterprise systems can be complex and may require additional resources for seamless operation.
This approach is ideal for organisations seeking cutting-edge capabilities tailored to their unique needs.
AI Agents With GUI Capabilities
The rise of AI Agents capable of interacting with Graphical User Interfaces (GUIs) is transforming how tasks are automated and performed in digital environments.
These AI Agents simulate human-like interactions, such as clicking, typing, and navigating software, offering a new level of automation.
However, not all AI Agent solutions are created equal. Depending on the underlying technology, the approach to AI Agent GUI navigation can differ significantly.
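To make the idea of simulated human-like interaction concrete, one way a GUI-capable agent might represent its actions is as a small set of typed commands that can be logged and audited before or after execution. This is a minimal sketch; the class names (`Click`, `Type`, `Navigate`) and the audit-log format are illustrative assumptions, not any vendor's actual API:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Click:
    """Click on a UI element, identified by a selector or accessibility label."""
    target: str

@dataclass
class Type:
    """Type text into the currently focused element."""
    text: str

@dataclass
class Navigate:
    """Open a URL or application screen."""
    destination: str

# A GUI action is any one of the above commands.
GuiAction = Union[Click, Type, Navigate]

def describe(action: GuiAction) -> str:
    """Render an action as a human-readable audit-log line."""
    if isinstance(action, Click):
        return f"click: {action.target}"
    if isinstance(action, Type):
        return f"type: {action.text!r}"
    return f"navigate: {action.destination}"
```

Representing actions as structured data rather than free-form text is what makes the supervision discussed later possible: every click or keystroke the agent proposes can be inspected as a discrete, reviewable unit.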
The Security Challenges of High-Agency AI Agents
As AI Agents become more advanced, their ability to operate autonomously introduces significant security risks.
Agents with a high level of agency — the ability to make independent decisions and take actions — can be vulnerable to various threats.
One notable risk arises when agents have vision capabilities and interact with dynamic environments, such as navigating websites.
Recent research highlights the threat of adversarial attacks via nefarious pop-ups.
For instance, while an AI Agent browses the web, malicious pop-ups designed to mislead the AI Agent can appear. These deceptive elements can manipulate the AI Agent into taking harmful actions, such as downloading malware or sharing sensitive information.
This mirrors the early challenges humans faced in understanding and recognising phishing attacks. Just as people had to learn to avoid these traps, AI Agents must develop similar safeguards.
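One simple safeguard along these lines is an origin check: before acting on any on-screen element, the agent verifies that the element was served by the page it intended to visit, or by an explicitly trusted origin, and ignores injected overlays and pop-ups from anywhere else. This is a hedged sketch only; the allowlist contents and the function name `is_actionable` are hypothetical, and a production guardrail would need far more than hostname matching:

```python
from urllib.parse import urlparse

# Hypothetical allowlist of origins the agent is permitted to interact with.
TRUSTED_ORIGINS = {"calendar.example.com", "tasks.example.com"}

def is_actionable(element_origin: str, current_page_origin: str) -> bool:
    """Return True only if the element comes from the page the agent
    intended to visit, or from an explicitly trusted origin.
    Pop-ups injected from other origins are ignored rather than clicked."""
    host = urlparse(element_origin).hostname or ""
    page_host = urlparse(current_page_origin).hostname or ""
    return host == page_host or host in TRUSTED_ORIGINS
```

The analogy to phishing training holds here: just as users learned to check a link's true destination before clicking, an agent can be made to check an element's provenance before interacting with it.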
A Safer Path to Agentic Capabilities
Planning & Execution Supervision
To mitigate these risks, one promising approach is introducing human supervision for AI Agent planning and execution.
This is a process in which AI Agents operate under human oversight: the AI Agent assists with task decomposition, planning, and sequencing the steps, but relies on user input to finalise decisions, ensuring safety and accuracy.
Example of Human Supervision in Action
The image below shows a scenario where a user makes the following request to an AI Agent:
I will be on PTO tomorrow. Please reschedule my meeting with George to the day after tomorrow. Also, list the tasks assigned to me that are due tomorrow, summarise them, and email the summary to Alex.
Using an Agentic UI, the AI Agent can break down the request into a sequence of actionable steps and present them to the user for approval:
- Reschedule your meeting with George to the day after tomorrow.
- Search for tasks assigned to you that are due tomorrow.
- Summarise the tasks identified.
- Send an email to Alex with the task summary.
In this approach, the user can review, approve, edit, or remove any steps before execution. This method ensures that the AI Agent remains under control, reducing the risk of errors or unintended actions.
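The review-approve-edit-remove loop described above can be sketched as a small data structure that separates plan review from plan execution. This is a minimal illustration of the pattern, assuming hypothetical names (`PlanStep`, `SupervisedPlan`) and a plain `input()` prompt standing in for a real Agentic UI:

```python
from dataclasses import dataclass, field

@dataclass
class PlanStep:
    """A single proposed action in the agent's plan."""
    description: str
    approved: bool = False

@dataclass
class SupervisedPlan:
    """Holds the agent's proposed steps and the user's decisions on them."""
    steps: list[PlanStep] = field(default_factory=list)

    def review(self) -> None:
        """Present each proposed step and record the user's decision."""
        remaining = []
        for step in self.steps:
            answer = input(f"Execute '{step.description}'? [y/n/edit] ").strip().lower()
            if answer == "y":
                step.approved = True
                remaining.append(step)
            elif answer == "edit":
                step.description = input("Revised step: ").strip()
                step.approved = True
                remaining.append(step)
            # "n" drops the step entirely
        self.steps = remaining

    def execute(self, run_step) -> None:
        """Hand only approved steps to the executor callback."""
        for step in self.steps:
            if step.approved:
                run_step(step.description)
```

The design point is the hard boundary between planning and execution: `execute` will never run a step that was not explicitly approved, so the agent's autonomy ends exactly where the user's review begins.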
Conclusion
As AI Agents continue to advance, their ability to autonomously interact with digital environments brings both opportunities and challenges.
Supplier segmentation highlights different approaches — from OS providers to cloud services and specialised third parties — each with distinct advantages.
However, the rise of highly autonomous AI Agents also introduces security threats that require careful consideration.
Human supervision of planning and execution offers a balanced solution, allowing AI Agents to remain productive while human oversight ensures safety.
Chief Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.