Comparing Human Computer Use With AI Agents

Claude AI Agent Computer Interface: Progress, Human Skill Levels & Potential

2 min read · Dec 10, 2024

The Claude AI Agent Computer Interface (ACI) currently scores about 80% lower than humans, in relative terms, at using computers through a graphical user interface (GUI).

Humans typically achieve a proficiency level of 70-75%, while the Claude ACI framework scored 14.9% on the OSWorld benchmark, a test specifically designed to evaluate models’ ability to interact with computers.
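To make the "about 80%" figure concrete, here is a minimal sketch of the arithmetic, assuming the gap is measured relative to the human score. The function name `relative_gap` and the constants are illustrative choices, not part of OSWorld or any Anthropic tooling.

```python
def relative_gap(human_score: float, model_score: float) -> float:
    """Return the fraction by which model_score falls short of human_score."""
    if human_score <= 0:
        raise ValueError("human_score must be positive")
    return (human_score - model_score) / human_score


# Scores quoted in this article, in percent (assumed to be directly comparable).
CLAUDE_OSWORLD = 14.9
HUMAN_RANGE = (70.0, 75.0)

for human in HUMAN_RANGE:
    gap = relative_gap(human, CLAUDE_OSWORLD)
    print(f"human {human:.0f}% vs Claude {CLAUDE_OSWORLD}%: {gap:.1%} lower")
```

Across the quoted human range, the shortfall works out to roughly 79-80%, which is where the headline number comes from. In absolute terms, the gap is 55 to 60 percentage points.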

ACI is still in its early stages. The Claude ACI framework trails human proficiency for now, but its rapid advancement points to a tipping point on the horizon where AI may, as it has in other domains, overtake human capabilities.

Despite this gap, Claude outpaces the next-best AI, which scored just 7.7%, making it the current state of the art in this emerging domain.

This trajectory is reminiscent of other AI milestones, such as automatic speech recognition (ASR), chess, and sentiment analysis, where AI initially lagged behind human performance but eventually surpassed it.

Chief Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.

Written by Cobus Greyling
