Adversarial 𝗔𝘁𝘁𝗮𝗰𝗸𝘀 On 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗜𝗻𝘁𝗲𝗿𝗳𝗮𝗰𝗲𝘀 (𝗔𝗖𝗜) 𝘃𝗶𝗮 𝗣𝗼𝗽-𝗨𝗽𝘀

𝘈𝘨𝘦𝘯𝘵𝘪𝘤 𝘈𝘱𝘱𝘭𝘪𝘤𝘢𝘵𝘪𝘰𝘯𝘴 𝘪𝘯𝘵𝘦𝘳𝘢𝘤𝘵𝘪𝘯𝘨 𝘸𝘪𝘵𝘩 𝘰𝘱𝘦𝘳𝘢𝘵𝘪𝘯𝘨 𝘴𝘺𝘴𝘵𝘦𝘮𝘴 𝘷𝘪𝘢 𝘵𝘩𝘦 𝘎𝘜𝘐 𝘪𝘴 𝘴𝘶𝘤𝘩 𝘢 𝘯𝘦𝘸 𝘤𝘰𝘯𝘤𝘦𝘱𝘵 𝘢𝘯𝘥 𝘺𝘦𝘵 𝘢 𝘳𝘦𝘤𝘦𝘯𝘵 𝘴𝘵𝘶𝘥𝘺 𝘪𝘴 𝘤𝘰𝘯𝘴𝘪𝘥𝘦𝘳𝘪𝘯𝘨 𝘸𝘢𝘺𝘴 𝘪𝘯 𝘸𝘩𝘪𝘤𝘩 𝘴𝘶𝘤𝘩 𝘴𝘺𝘴𝘵𝘦𝘮𝘴 𝘤𝘢𝘯 𝘣𝘦 𝘢𝘵𝘵𝘢𝘤𝘬𝘦𝘥.

Cobus Greyling

4 min readNov 7, 2024

As developers increasingly use AI agents for automating tasks, understanding potential vulnerabilities becomes critical…

This study reveals that introducing adversarial pop-ups into Agentic Environments achieves an attack success rate of 86%, reducing task completion rates by 47%.

Basic defences, like instructing agents to ignore pop-ups, proved ineffective, underscoring the need for more robust protection mechanisms against such attacks.

In Short

Just like human users need to undergo training to recognise phishing emails, so Language Models with vision capabilities need to be trained to recognise adversarial attacks via the GUI.

Language Models and agentic applications will need to undergo a similar process to ignore environmental noises and recognise prioritise legitimate instructions.

This also applies to embodied agents since many distractors in the physical environment might also be absent from the training data.

There is also a case to be made for human users to oversee the automated agent workflow carefully to manage the potential risks from the environment.

The future might see work on effectively leveraging human supervision and intervention for necessary safety concerns.

There are also a case to be made for the user environment and safeguarding the digital world within which the AI Agent lives. If this digital world is a PC Operating System, then popups and undesired web content can be filtered on an OS level.

This, together with Language Model training can serve as a double check.

On average, 92.7% / 73.1% of all actions of attacked agents in OSWorld/VisualWebArena are clicking on the adversarial pop-ups.

Some Background

The paper demonstrates vulnerabilities in Language Models / Foundation Models and AI Agents that interact with graphical user interfaces, showing that these agents can be manipulated by adversarial pop-ups, leading them to click on misleading elements instead of completing intended tasks.

The researchers introduced carefully crafted pop-ups in environments like OSWorld and VisualWebArena. Basic defence strategies, such as instructing the agent to ignore pop-ups, were ineffective, revealing the need for robust protections against such attacks in VLM-based automation.

Below the attack sequence and construct of the pop-up:

Conclusion

For language model based AI Agents to carry out tasks on the web, effective interaction with graphical user interfaces (GUIs) is crucial.

These AI Agents must interpret and respond to interface elements on webpages or screenshots, executing actions like clicking, scrolling, or typing based on commands such as “find the cheapest product” or “set the default search engine.”

Although state-of-the-art Language Models / Foundation Models are improving in performing such tasks, they remain vulnerable to deceptive elements, such as pop-ups, fake download buttons, and misleading countdown timers.

Adversarial pop-up example showing the design space of the pop-up: (1) Attention Hook, (2) Instruction, (3) Info Banner, (4) ALT Descriptor.

While these visual distractions are typically intended to trick human users, they pose similar risks for AI Agents, which can be misled into dangerous actions, like downloading malware or navigating to fraudulent sites.

As AI Agents are perform tasks on behalf of users, addressing these risks becomes essential to protect both agents and end-users from unintended consequences.

Chief Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.

COBUS GREYLING

Where AI Meets Language | Language Models, AI Agents, Agentic Applications, Development Frameworks & Data-Centric…

cobusgreyling.com

Cobus Greyling - Medium

Read writing from Cobus Greyling on Medium. I'm passionate about exploring the intersection of AI & language…

cobusgreyling.medium.com

Attacking Vision-Language Computer Agents via Pop-ups

Autonomous agents powered by large vision and language models (VLM) have demonstrated significant potential in…

arxiv.org