AI Agents
AI Agents, also referred to as Agentic Applications or Agents, represent a significant leap in AI Development by enabling autonomous decision-making, task execution, and exploration.
The image below shows a selection of particularly impactful studies…
These agents are capable of complex operations such as web exploration and system navigation, leveraging multi-modal models that combine text, speech, and visual data for a comprehensive understanding of their environment. In the context of agentic exploration, AI agents can independently navigate digital environments, such as browser-based interfaces or mobile operating systems, to fulfill user-defined objectives.
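To make the idea of agentic exploration more concrete, here is a minimal sketch of an observe, decide, act loop. Everything in it is hypothetical and only for illustration: `SITE` is a toy stand-in for a website, `ToyBrowser` is not a real browser API, and `choose_link` is a simple rule where a real agent would call a multi-modal model.

```python
# Toy "website": each page maps to the pages it links to (hypothetical data).
SITE = {
    "home": ["products", "about"],
    "products": ["home", "checkout"],
    "about": ["home"],
    "checkout": [],
}


class ToyBrowser:
    """Hypothetical stand-in for a browser-based environment."""

    def __init__(self, site, start="home"):
        self.site = site
        self.current = start

    def observe(self):
        # Return the current page and the links visible on it.
        return self.current, list(self.site[self.current])

    def click(self, target):
        # Only links present on the current page can be followed.
        if target not in self.site[self.current]:
            raise ValueError(f"no link to {target!r} on {self.current!r}")
        self.current = target


def choose_link(links, goal, visited):
    """Placeholder policy: go to the goal if visible, else try an unvisited link."""
    if goal in links:
        return goal
    for link in links:
        if link not in visited:
            return link
    return links[0] if links else None


def run_agent(env, goal, max_steps=10):
    """Observe, decide, act until the goal page is reached or the budget runs out."""
    path = [env.current]
    for _ in range(max_steps):
        page, links = env.observe()
        if page == goal:
            return True, path
        target = choose_link(links, goal, set(path))
        if target is None:
            break  # dead end: nothing left to click
        env.click(target)
        path.append(target)
    return env.current == goal, path


if __name__ == "__main__":
    reached, path = run_agent(ToyBrowser(SITE), goal="checkout")
    print(reached, path)
```

The step budget (`max_steps`) matters in practice: without it, an agent exploring an unfamiliar interface could loop indefinitely between pages it has already seen.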
As AI Agents evolve, they bridge the gap between purely reactive tools and proactive, adaptive systems that can learn, explore, and improve autonomously.
Considering recent research from Apple, IBM, Microsoft and others, the AI Agent landscape is rapidly evolving with significant advancements in development interfaces, enabling seamless integration of multi-modal models for more dynamic interactions.
Tools and frameworks are being optimised for testing and benchmarking, allowing developers to gauge agent performance across different tasks and environments.
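As a rough illustration of what gauging agent performance across tasks and environments can look like, the sketch below aggregates task outcomes into a success rate per environment. The environment names, task names and outcomes are made-up placeholders, not results from any study or benchmark, and `success_rates` is a hypothetical helper rather than part of any framework.

```python
from collections import defaultdict


def success_rates(results):
    """Compute the fraction of successful tasks per environment.

    `results` is an iterable of (environment, task, succeeded) tuples.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for env, _task, succeeded in results:
        totals[env] += 1
        wins[env] += int(succeeded)
    return {env: wins[env] / totals[env] for env in totals}


# Made-up outcomes, for illustration only.
RESULTS = [
    ("browser", "find_price", True),
    ("browser", "submit_form", False),
    ("phone_os", "set_alarm", True),
    ("phone_os", "send_message", True),
]

if __name__ == "__main__":
    for env, rate in success_rates(RESULTS).items():
        print(f"{env}: {rate:.0%} of tasks completed")
```

Reporting one number per environment, rather than a single overall score, keeps it visible when an agent does well in one setting and poorly in another.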
AI Agents are now being designed to operate within two distinct ecosystems: browser-based systems, which enable web exploration, and phone OS-based environments, providing a mobile-centric interface for task execution.
As the technology progresses, multi-modal capabilities combined with effective benchmarking strategies are shaping the future of AI-driven task automation.
I’m currently the Chief Evangelist @ Kore.ai. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.