Comparing Human Computer Use With AI Agents

Claude AI Agent Computer Interface: Progress, Human Skill Levels & Potential

Cobus Greyling
2 min read · Dec 10, 2024

The Claude AI Agent Computer Interface (ACI) currently performs about 80% worse than humans at using computers through a graphical user interface (GUI).

Humans typically achieve a proficiency level of 70-75%, while the Claude ACI framework scored 14.9% on the OSWorld benchmark, a test specifically designed to evaluate models' ability to interact with computers.
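The "about 80% worse" figure can be checked against these two numbers. Below is a minimal sketch, assuming the figure is a relative gap, i.e. (human score - Claude score) / human score; that reading is an assumption on my part, and the helper name relative_shortfall is hypothetical rather than anything from OSWorld or Anthropic.

```python
def relative_shortfall(score: float, baseline: float) -> float:
    """Return how far `score` falls below `baseline`, as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - score) / baseline


# Figures quoted in the text (percent of OSWorld tasks completed).
CLAUDE_OSWORLD = 14.9
HUMAN_RANGE = (70.0, 75.0)

# Assumption: "80% worse" means the gap relative to the human score.
low, high = (relative_shortfall(CLAUDE_OSWORLD, human) for human in HUMAN_RANGE)
print(f"Relative shortfall: {low:.1%} to {high:.1%}")
# prints "Relative shortfall: 78.7% to 80.1%"
```

Under that assumption, both ends of the human range land close to the quoted 80%.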

ACI is still in its early stages, and while the Claude ACI framework trails human proficiency for now, its rapid advancement signals a tipping point on the horizon where AI may again overtake human capabilities in this domain.

Despite this gap, Claude outpaces the next-best AI model, which scored just 7.7%, making it the state of the art in this emerging domain.

This trajectory resembles earlier AI milestones such as automatic speech recognition (ASR), chess, and sentiment analysis, where AI initially lagged behind human performance but eventually surpassed it.

Chief Evangelist @ Kore.ai | I'm passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.

I'm passionate about exploring the intersection of AI & language. www.cobusgreyling.com
