Adversarial ๐—”๐˜๐˜๐—ฎ๐—ฐ๐—ธ๐˜€ On ๐—”๐—œ ๐—”๐—ด๐—ฒ๐—ป๐˜ ๐—–๐—ผ๐—บ๐—ฝ๐˜‚๐˜๐—ฒ๐—ฟ ๐—œ๐—ป๐˜๐—ฒ๐—ฟ๐—ณ๐—ฎ๐—ฐ๐—ฒ๐˜€ (๐—”๐—–๐—œ) ๐˜ƒ๐—ถ๐—ฎ ๐—ฃ๐—ผ๐—ฝ-๐—จ๐—ฝ๐˜€

๐˜ˆ๐˜จ๐˜ฆ๐˜ฏ๐˜ต๐˜ช๐˜ค ๐˜ˆ๐˜ฑ๐˜ฑ๐˜ญ๐˜ช๐˜ค๐˜ข๐˜ต๐˜ช๐˜ฐ๐˜ฏ๐˜ด ๐˜ช๐˜ฏ๐˜ต๐˜ฆ๐˜ณ๐˜ข๐˜ค๐˜ต๐˜ช๐˜ฏ๐˜จ ๐˜ธ๐˜ช๐˜ต๐˜ฉ ๐˜ฐ๐˜ฑ๐˜ฆ๐˜ณ๐˜ข๐˜ต๐˜ช๐˜ฏ๐˜จ ๐˜ด๐˜บ๐˜ด๐˜ต๐˜ฆ๐˜ฎ๐˜ด ๐˜ท๐˜ช๐˜ข ๐˜ต๐˜ฉ๐˜ฆ ๐˜Ž๐˜œ๐˜ ๐˜ช๐˜ด ๐˜ด๐˜ถ๐˜ค๐˜ฉ ๐˜ข ๐˜ฏ๐˜ฆ๐˜ธ ๐˜ค๐˜ฐ๐˜ฏ๐˜ค๐˜ฆ๐˜ฑ๐˜ต ๐˜ข๐˜ฏ๐˜ฅ ๐˜บ๐˜ฆ๐˜ต ๐˜ข ๐˜ณ๐˜ฆ๐˜ค๐˜ฆ๐˜ฏ๐˜ต ๐˜ด๐˜ต๐˜ถ๐˜ฅ๐˜บ ๐˜ช๐˜ด ๐˜ค๐˜ฐ๐˜ฏ๐˜ด๐˜ช๐˜ฅ๐˜ฆ๐˜ณ๐˜ช๐˜ฏ๐˜จ ๐˜ธ๐˜ข๐˜บ๐˜ด ๐˜ช๐˜ฏ ๐˜ธ๐˜ฉ๐˜ช๐˜ค๐˜ฉ ๐˜ด๐˜ถ๐˜ค๐˜ฉ ๐˜ด๐˜บ๐˜ด๐˜ต๐˜ฆ๐˜ฎ๐˜ด ๐˜ค๐˜ข๐˜ฏ ๐˜ฃ๐˜ฆ ๐˜ข๐˜ต๐˜ต๐˜ข๐˜ค๐˜ฌ๐˜ฆ๐˜ฅ.

4 min readNov 7, 2024

--

As developers increasingly use AI agents for automating tasks, understanding potential vulnerabilities becomes criticalโ€ฆ

This study reveals that introducing adversarial pop-ups into Agentic Environments achieves an attack success rate of 86%, reducing task completion rates by 47%.

Basic defences, like instructing agents to ignore pop-ups, proved ineffective, underscoring the need for more robust protection mechanisms against such attacks.

In Short

Just like human users need to undergo training to recognise phishing emails, so Language Models with vision capabilities need to be trained to recognise adversarial attacks via the GUI.

Language Models and agentic applications will need to undergo a similar process to ignore environmental noises and recognise prioritise legitimate instructions.

This also applies to embodied agents since many distractors in the physical environment might also be absent from the training data.

There is also a case to be made for human users to oversee the automated agent workflow carefully to manage the potential risks from the environment.

The future might see work on effectively leveraging human supervision and intervention for necessary safety concerns.

There are also a case to be made for the user environment and safeguarding the digital world within which the AI Agent lives. If this digital world is a PC Operating System, then popups and undesired web content can be filtered on an OS level.

This, together with Language Model training can serve as a double check.

On average, 92.7% / 73.1% of all actions of attacked agents in OSWorld/VisualWebArena are clicking on the adversarial pop-ups.

Some Background

The paper demonstrates vulnerabilities in Language Models / Foundation Models and AI Agents that interact with graphical user interfaces, showing that these agents can be manipulated by adversarial pop-ups, leading them to click on misleading elements instead of completing intended tasks.

The researchers introduced carefully crafted pop-ups in environments like OSWorld and VisualWebArena. Basic defence strategies, such as instructing the agent to ignore pop-ups, were ineffective, revealing the need for robust protections against such attacks in VLM-based automation.

Below the attack sequence and construct of the pop-up:

Conclusion

For language model based AI Agents to carry out tasks on the web, effective interaction with graphical user interfaces (GUIs) is crucial.

These AI Agents must interpret and respond to interface elements on webpages or screenshots, executing actions like clicking, scrolling, or typing based on commands such as โ€œfind the cheapest productโ€ or โ€œset the default search engine.โ€

Although state-of-the-art Language Models / Foundation Models are improving in performing such tasks, they remain vulnerable to deceptive elements, such as pop-ups, fake download buttons, and misleading countdown timers.

Adversarial pop-up example showing the design space of the pop-up: (1) Attention Hook, (2) Instruction, (3) Info Banner, (4) ALT Descriptor.

While these visual distractions are typically intended to trick human users, they pose similar risks for AI Agents, which can be misled into dangerous actions, like downloading malware or navigating to fraudulent sites.

As AI Agents are perform tasks on behalf of users, addressing these risks becomes essential to protect both agents and end-users from unintended consequences.

Chief Evangelist @ Kore.ai | Iโ€™m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.

--

--

Cobus Greyling
Cobus Greyling

Written by Cobus Greyling

Iโ€™m passionate about exploring the intersection of AI & language. www.cobusgreyling.com

Responses (1)