
AgentS2 Is An Open-Source Computer Use AI Agent

Framework for Computer Use Agents which you can install & run on your PC

7 min read · May 7, 2025


Introduction

I find it intriguing that there is still doubt regarding the power and potential of AI Agents. The best knowledge is firsthand knowledge and personal experience.

In a previous Medium post I gave some background on the AgentS2 framework. In this article I want to walk through how to install this AI Agent for Computer Use on your local machine and run it.

I asked the AI Agent to play music on my MacBook, to tell me the current weather in Cape Town, and to work out the square root of the year of birth of the man commonly regarded as the father of the iPhone.

All of these requests were completed successfully by the AI Agent.

I would say a definite bottleneck is the screen capture process and the processing of each captured image.

The boundary between human intention & computer execution will become increasingly seamless.

Sequence of Events

Consider the image below…it shows the sequence of events followed by the AI Agent. What is interesting is that it looks for similar task experiences from the past.

A high-level plan is then generated, followed by a graph data document. I find this part fascinating: the AI Agent creates a graph data flow which is essentially disposable.

The AI Agent also considers and balances successfully completed subtasks and future remaining subtasks.
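To make this bookkeeping concrete, here is a minimal Python sketch of how completed and remaining subtasks might be tracked. The `Subtask` and `Plan` names and their fields are hypothetical, chosen for illustration; they are not taken from the AgentS2 code base.

```python
from dataclasses import dataclass, field


@dataclass
class Subtask:
    """One step of a high-level plan (hypothetical structure)."""
    name: str
    done: bool = False


@dataclass
class Plan:
    """Tracks which subtasks are completed and which remain."""
    subtasks: list[Subtask] = field(default_factory=list)

    def completed(self) -> list[str]:
        return [s.name for s in self.subtasks if s.done]

    def remaining(self) -> list[str]:
        return [s.name for s in self.subtasks if not s.done]

    def mark_done(self, name: str) -> None:
        for s in self.subtasks:
            if s.name == name:
                s.done = True
                return
        raise KeyError(f"unknown subtask: {name}")


plan = Plan([
    Subtask("open a browser"),
    Subtask("navigate to a weather website"),
    Subtask("read the current weather"),
])
plan.mark_done("open a browser")
print("completed:", plan.completed())
print("remaining:", plan.remaining())
```

Keeping both lists available at every step is what would let a planner weigh what has already succeeded against what is still outstanding.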

How we interact with technology is changing while preserving human agency through transparent reasoning processes.

Installing AgentS2 on macOS

I opted to use Conda to create and manage the virtual environment, because Conda lets you define the Python version you want when you create the environment.

This is convenient when a certain Python version is demanded by the installation.

I first downloaded the Miniconda installer…

curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh

Then installed it by running the script with bash…

bash Miniconda3-latest-MacOSX-arm64.sh

Below I create a virtual environment named agents2

conda create -n agents2 python=3.10

This command lists all the existing virtual environments…

conda info --envs

Now we enter the virtual environment…

conda activate /Users/cobusgreyling/opt/anaconda3/envs/agents2
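Once the environment is active, it may be worth confirming that the interpreter really is the version requested above. The snippet below is a small sketch of such a check; the helper name `is_expected_python` is my own and not part of Conda or AgentS2.

```python
import sys


def is_expected_python(version_info, expected=(3, 10)):
    """Return True when the major.minor version matches `expected`."""
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    ok = is_expected_python(sys.version_info)
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor}",
        "matches 3.10" if ok else "does not match 3.10",
    )
```

Run inside the activated environment, this should report a match if the environment was created with python=3.10.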

With the virtual environment active, clone the AgentS2 repository…

git clone https://github.com/simular-ai/Agent-S.git

And run the installation from inside the cloned Agent-S directory…

pip install -r requirements.txt

Running the AI Agent

Next we set up our LLM API key by exporting it as an environment variable for the application to use…

 export ANTHROPIC_API_KEY="Your Anthropic API Key"
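A missing or empty key is an easy thing to overlook, so here is a small sketch of a check you could run from Python in the same shell session. The helper `api_key_is_set` is a hypothetical name of mine; it only reads the environment and does not call the Anthropic API.

```python
import os


def api_key_is_set(env=os.environ, name="ANTHROPIC_API_KEY"):
    """Return True when `name` is present and non-empty in `env`."""
    return bool(env.get(name, "").strip())


print("ANTHROPIC_API_KEY set:", api_key_is_set())
```

Note that the export only applies to the current terminal session, so the check needs to run in the same shell where the variable was exported.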

And we start the AI Agent’s command line by typing agent_s2 and pressing Enter…

Running The Agent

In the macOS Terminal window I posted the question:

What is the current weather in Cape Town?

In the image below, notice the planning step, the search for similar task experiences, and the initial plan to open a browser, navigate to a weather website, and more…

Below is the JSON-formatted graph data…

And here is a basic graphic representation of the graph data…it is interesting how we as humans used to hard-code flows as graph data with nodes and decision points.

These different flows then had to be categorised according to intent and selected for the correct flow.

Now with AI Agents, compound multi-intent user utterances can be put to the AI Agent, and it creates a complex flow of tasks to fulfil the request.

In addition, it selects the available tools within the user’s GUI to complete the tasks.
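As a rough illustration of the idea, the sketch below loads a small graph from JSON and derives an execution order from it. The node names and the nodes/edges layout are assumptions I made up for the weather request; the agent’s real JSON schema is not reproduced here.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical graph data for the weather request; the agent's real
# JSON schema is not reproduced here.
graph_json = """
{
  "nodes": ["open_browser", "go_to_weather_site", "search_cape_town", "read_weather"],
  "edges": [
    ["open_browser", "go_to_weather_site"],
    ["go_to_weather_site", "search_cape_town"],
    ["search_cape_town", "read_weather"]
  ]
}
"""


def execution_order(graph_text):
    """Return the nodes in an order where every edge's source precedes its target."""
    graph = json.loads(graph_text)
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    predecessors = {node: set() for node in graph["nodes"]}
    for source, target in graph["edges"]:
        predecessors[target].add(source)
    return list(TopologicalSorter(predecessors).static_order())


print(execution_order(graph_json))
```

The difference from the hard-coded flows of the past is that a graph like this is produced per request and discarded afterwards, rather than designed once and matched to an intent.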

The terminal below shows how the AI Agent is analysing the screenshots taken when navigating the web…

The AI Agent performs internal reasoning…

Based on the current desktop state, I can see that the task 
has been completed. The Google search results are already
displaying the current weather information for Cape Town.
The weather information shows:

- Current temperature: 16°C (61°F)
- Weather condition: Clear
- Current time in Cape Town: Tuesday 19:00
- Humidity: 89%
- Wind: 14 km/h
- Precipitation: 0%

The weather widget also displays the forecast for the coming
days.
No further steps are needed as the task of finding the
current weather in Cape Town has been successfully completed.
The information is clearly visible in the Google weather
widget on the screen.

My Litmus Test For AI Agents

My test for AI Agents is to see if they are able to answer this question:

What is the square root of the year of birth of the man who is commonly regarded as the father of the iPhone?

In order for an AI Agent to be able to answer this question, it needs access to two tools:

  1. Internet search to get the right person and the year of birth,
  2. and a math tool to calculate the square root of the year of birth.

The AI Agent should be able to use its internal reasoning capabilities to re-order the question into a logical sequence of steps, which is not the order in which the question is phrased.
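Here is a minimal sketch of that re-ordering with the two tools. Both tools are hypothetical stand-ins: the "search" is just a lookup table holding the facts quoted in this article, where a real agent would query the web.

```python
import math

# Stand-in for an internet search tool: a tiny lookup table holding
# only the facts quoted in this article. A real agent would query the web.
FAKE_SEARCH_INDEX = {
    "father of the iPhone": "Steve Jobs",
    "Steve Jobs year of birth": "1955",
}


def search_tool(query):
    """Hypothetical search tool backed by the lookup table above."""
    return FAKE_SEARCH_INDEX[query]


def math_tool(value):
    """Hypothetical math tool: square root of a number."""
    return math.sqrt(value)


def answer_litmus_question():
    # The question is phrased outside-in, but has to be solved inside-out.
    person = search_tool("father of the iPhone")        # step 1: who?
    year = int(search_tool(f"{person} year of birth"))  # step 2: born when?
    return person, year, math_tool(year)                # step 3: square root


person, year, root = answer_litmus_question()
print(f"{person}, born {year}: square root is about {root:.2f}")
```

Each step consumes the output of the one before it, which is why the agent cannot simply process the question from left to right.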

An interesting observation: where the AI Agent does not have access to a math library, it uses the internet to calculate the square root.

Also, I saw the AI Agent get Steve Jobs as the father of the iPhone from the Anthropic LLM, but then confirm this via the web.

Here is some of the internal reasoning from the AI Agent:

Based on the current state of the screen, I can see that 
the task is already partially addressed. The terminal shows
the question "What is the square root of the year of birth
of the man who is commonly regarded as the father of the
iPhone?" and I can also see that a browser tab with
"Cape Town weather" is already open.

And then the AI Agent comes up with a sequence of events:

To complete this task:
1. Determine who is commonly regarded as the father of the
iPhone (Steve Jobs)

2. Find Steve Jobs' year of birth (1955)

3. Calculate the square root of 1955

Here's my plan:
1. Click on the browser tab that's already open

2. Click on the address bar at the top of the browser

3. Clear the current search and type "Steve Jobs year of
birth"

4. Press Enter to search

5. Review the search results to confirm Steve Jobs'
birth year (1955)

6. Clear the address bar again

7. Type "square root of 1955" in the search bar

8. Press Enter to search

9. Read the result (approximately 44.22)

The Future of AI Agents and Computer Interaction

Internal reasoning is the key differentiator for effective AI Agents that automate computer use:

- AgentS2’s implementation of tree-of-thought reasoning enables it to consider multiple paths before action
- The ability to self-critique and revise approaches leads to more robust task completion
- Internal monologues provide transparency into the agent’s decision-making process, making errors more traceable and fixable

There is full observability, inspectability and discoverability via the fine-grained output from the AI Agent…

Graph-based flows enhance agent reliability:

- Task decomposition into nodes with clear dependencies creates verifiable execution paths
- Decision points in the graph allow for branching logic based on environmental feedback

Cyclical evaluation loops enable continuous improvement:

  1. Plan formation with initial graph construction
  2. Execution attempt and traversing the graph
  3. Outcome evaluation with success/failure of node objectives
  4. Refinement by graph restructuring based on results
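The loop of planning, executing, evaluating and refining can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not AgentS2’s implementation: `run_cycle` and `flaky_execute` are hypothetical names, and "refinement" here only means dropping completed nodes and retrying the rest, which is far simpler than genuine graph restructuring.

```python
from graphlib import TopologicalSorter


def run_cycle(graph, execute, max_rounds=3):
    """Plan, execute, evaluate and refine until every node succeeds.

    `graph` maps each node to the set of nodes it depends on.
    `execute` is a callable returning True on success, False on failure.
    Returns the set of completed nodes and the number of rounds used.
    """
    done = set()
    for round_number in range(1, max_rounds + 1):
        # Plan: keep only unfinished nodes and their unfinished dependencies.
        pending = {n: deps - done for n, deps in graph.items() if n not in done}
        failed = set()
        # Execute: traverse the graph in dependency order.
        for node in TopologicalSorter(pending).static_order():
            if pending[node] & failed:
                failed.add(node)  # a prerequisite failed, so skip this node
            elif execute(node):
                done.add(node)    # Evaluate: node objective met
            else:
                failed.add(node)  # Evaluate: node objective not met
        # Refine: stop if nothing failed, otherwise replan with what is left.
        if not failed:
            return done, round_number
    return done, max_rounds


attempts = {}


def flaky_execute(node):
    """Simulated environment feedback: 'search' fails on its first attempt."""
    attempts[node] = attempts.get(node, 0) + 1
    return not (node == "search" and attempts[node] == 1)


graph = {
    "open_browser": set(),
    "search": {"open_browser"},
    "read_result": {"search"},
}
done, rounds = run_cycle(graph, flaky_execute)
print(f"completed {sorted(done)} in {rounds} round(s)")
```

In this toy run the first round completes only the browser step, and the second round retries the search and then reads the result.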

In Conclusion

As AI Agents like AgentS2 continue to evolve, the boundary between human intention and computer execution will become increasingly seamless, fundamentally changing how we interact with technology while preserving human agency through transparent reasoning processes.

Chief Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.

Written by Cobus Greyling

I’m passionate about exploring the intersection of AI & language. www.cobusgreyling.com
