Comparing Human Computer Use With AI Agents
Claude AI Agent Computer Interface: Progress, Human Skill Levels & Potential
The ๐๐น๐ฎ๐๐ฑ๐ฒ ๐๐ ๐๐ด๐ฒ๐ป๐ ๐๐ผ๐บ๐ฝ๐๐๐ฒ๐ฟ ๐๐ป๐๐ฒ๐ฟ๐ณ๐ฎ๐ฐ๐ฒ (๐๐๐) currently performs about 80% ๐ธ๐ฐ๐ณ๐ด๐ฆ than humans at using computers through a graphical user interface (GUI).
๐๐๐บ๐ฎ๐ป๐ typically achieve a proficiency level of ๐ณ๐ฌ-๐ณ๐ฑ%, while the ๐๐น๐ฎ๐๐ฑ๐ฒ ๐๐๐ ๐ณ๐ฟ๐ฎ๐บ๐ฒ๐๐ผ๐ฟ๐ธ scored ๐ญ๐ฐ.๐ต% on the OSWorld benchmark, a test specifically designed to evaluate modelsโ ability to interact with computers.
๐๐๐ ๐ช๐ด ๐ด๐ต๐ช๐ญ๐ญ ๐ช๐ฏ ๐ช๐ต๐ด ๐ฆ๐ข๐ณ๐ญ๐บ ๐ด๐ต๐ข๐จ๐ฆ๐ด, ๐ข๐ฏ๐ฅ ๐ธ๐ฉ๐ช๐ญ๐ฆ ๐ต๐ฉ๐ฆ ๐๐ญ๐ข๐ถ๐ฅ๐ฆ ๐๐๐ ๐ง๐ณ๐ข๐ฎ๐ฆ๐ธ๐ฐ๐ณ๐ฌ ๐ต๐ณ๐ข๐ช๐ญ๐ด ๐ฉ๐ถ๐ฎ๐ข๐ฏ ๐ฑ๐ณ๐ฐ๐ง๐ช๐ค๐ช๐ฆ๐ฏ๐ค๐บ ๐ง๐ฐ๐ณ ๐ฏ๐ฐ๐ธ, ๐ช๐ต๐ด ๐ณ๐ข๐ฑ๐ช๐ฅ ๐ข๐ฅ๐ท๐ข๐ฏ๐ค๐ฆ๐ฎ๐ฆ๐ฏ๐ต ๐ด๐ช๐จ๐ฏ๐ข๐ญ๐ด ๐ข ๐ต๐ช๐ฑ๐ฑ๐ช๐ฏ๐จ ๐ฑ๐ฐ๐ช๐ฏ๐ต ๐ฐ๐ฏ ๐ต๐ฉ๐ฆ ๐ฉ๐ฐ๐ณ๐ช๐ป๐ฐ๐ฏ ๐ธ๐ฉ๐ฆ๐ณ๐ฆ ๐๐ ๐ฎ๐ข๐บ ๐ข๐จ๐ข๐ช๐ฏ ๐ฐ๐ท๐ฆ๐ณ๐ต๐ข๐ฌ๐ฆ ๐ฉ๐ถ๐ฎ๐ข๐ฏ ๐ค๐ข๐ฑ๐ข๐ฃ๐ช๐ญ๐ช๐ต๐ช๐ฆ๐ด ๐ช๐ฏ ๐ต๐ฉ๐ช๐ด ๐ฅ๐ฐ๐ฎ๐ข๐ช๐ฏ.
Despite this gap, Claude outpaces the next-best AI, which scored just 7.7%, solidifying its position as the state-of-the-art in this emerging domain.
This trajectory of progress is reminiscent of other AI milestones, such as Speech Recognition (ASR), Chess, and Sentiment Analysis, where AI initially lagged behind human performance but eventually surpassed it.
Chief Evangelist @ Kore.ai | Iโm passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.