AWS lets agents drive its virtual cloudy desktops – which could cost 500,000 tokens per click

Vendor benchmark finds APIs let you do the job faster and cheaper

by · The Register

Amazon Web Services has let AI agents loose in its cloudy WorkSpaces virtual PCs.

The new service, currently in preview, allows users to assign agents an identity using Amazon’s Identity and Access Management service. Using those credentials, agents can access a WorkSpace at a unique pre-signed URL and drive any apps running there on the cloudy PC.

An AWS spokesperson told us the cloudy colossus recommends developers give each agent a unique identity, because doing so makes it easier to track their activities and to distinguish agentic actions from activity conducted by humans.

We’re also told that agents “connect through a managed MCP endpoint that provides governed access to desktop tools such as screenshots, mouse control, and text input.” This apparently “gives developers a controlled interface for agents to interact with the desktop while maintaining guardrails around what actions they can take.”
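For the curious, here is a rough idea of what a tool invocation over MCP looks like on the wire. MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The tool names and arguments below (`screenshot`, `mouse_click`, `x`, `y`) are hypothetical placeholders: AWS has not told us what its endpoint actually exposes, so treat this as a sketch of the message shape only.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects each request to carry one.
_ids = count(1)


def tool_call(name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request that uses MCP's `tools/call` method.

    `name` and `arguments` are illustrative; real tool names come from
    whatever the server advertises, which we are not assuming here.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Hypothetical tool names, for illustration only.
print(tool_call("screenshot", {}))
print(tool_call("mouse_click", {"x": 120, "y": 48}))
```

Sending the request, authenticating, and reading the response are left out, because they depend on details of the service that are not public in this article.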

The main reason to give an agent its own PC is so it can automatically use software to perform various tasks. Cloudy or virtual PCs are well-suited to this scenario because they can be ephemeral – you can run them long enough for an agent to accomplish a chore, then shut them down. Keeping agents in an isolated virtual private cloud may also be preferable to letting them loose on the LAN or in the datacenter. Organizations that rely entirely on physical PCs, or don’t fancy letting agents drive VMs on a local machine, may also prefer cloudy PCs to the complexity of setting up on-prem virtual PCs.

AWS will allow agentic access to any of the many instance types its WorkSpaces service offers – and they run from small instances that offer a single virtual CPU and 2GB of RAM all the way up to big boppers that pack a GPU, 32 vCPUs, and 256GB of RAM. Amazon rents all its WorkSpaces for either a monthly flat fee that allows non-stop access, or a smaller fee plus hourly access charges.
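Choosing between the two billing models comes down to simple arithmetic: the flat fee wins once hourly charges exceed the gap between the two monthly fees. The sketch below works out that break-even point. The prices used are invented for illustration and are not AWS's rates, and the function and parameter names are our own.

```python
def break_even_hours(flat_monthly: float, base_monthly: float, hourly: float) -> float:
    """Hours of use per month at which both billing models cost the same.

    Solves: flat_monthly = base_monthly + hourly * hours
    """
    if hourly <= 0:
        raise ValueError("hourly rate must be positive")
    return (flat_monthly - base_monthly) / hourly


# Made-up prices, NOT real AWS rates: $30 flat vs. $10 base + $0.25/hour.
hours = break_even_hours(flat_monthly=30.0, base_monthly=10.0, hourly=0.25)
print(f"Flat fee is cheaper above {hours:.0f} hours per month")
```

An agent that spins up a desktop for a few short chores a month would sit well below that threshold, while one that runs around the clock would sit well above it.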

Amazon is not alone in letting agents drive cloudy PCs: Microsoft has created a version of its Windows 365 service just for agents.

Agents drive PCs using computer vision – they typically take screenshots or video of a desktop, interpret what they “see” and then take action, assuming they’ve been given permission to click, type, and scroll.
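That observe, interpret, act cycle can be sketched in a few lines. Everything here is a stand-in: `FakeDesktop` records actions rather than driving a real machine, `decide` is a placeholder for a vision model, and the names are ours, not those of any vendor's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Action:
    kind: str         # "click", "type", or "scroll"
    detail: str = ""  # e.g. what to click on, or the text to type


@dataclass
class FakeDesktop:
    """Hypothetical desktop that logs actions instead of performing them."""
    log: list = field(default_factory=list)

    def screenshot(self) -> bytes:
        return b"fake-image-bytes"

    def perform(self, action: Action) -> None:
        self.log.append(action)


def run_agent(
    desktop: FakeDesktop,
    decide: Callable[[bytes, int], Optional[Action]],
    allowed: set,
    max_steps: int = 10,
) -> int:
    """Screenshot, ask `decide` for the next action, check permission, act."""
    steps = 0
    for _ in range(max_steps):
        image = desktop.screenshot()    # observe
        action = decide(image, steps)   # interpret (a model call in real life)
        if action is None:              # the model says the job is done
            break
        if action.kind not in allowed:  # only permitted action types may run
            raise PermissionError(f"agent may not {action.kind}")
        desktop.perform(action)         # act
        steps += 1
    return steps


# A scripted "model" that clicks, types, then stops.
script = [Action("click", "File menu"), Action("type", "report.txt"), None]
desk = FakeDesktop()
taken = run_agent(desk, lambda image, step: script[step], allowed={"click", "type"})
print(taken, [a.kind for a in desk.log])
```

Note that every pass through the loop sends a fresh screenshot to the model, which is where the token bill starts to climb.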

AI coding outfit Reflex thinks the work required to do so is non-trivial. The company recently published research claiming that a browser-use vision agent needed half a million tokens to click on a dropdown menu, and concluding that using an agent can be 45 times more expensive than using an API.
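To put those figures in money terms, here is a back-of-the-envelope calculation. The token count and the 45x ratio are Reflex's claims as reported above. The price per million tokens is an assumption we picked for illustration, not a quote from any model provider, and the sketch ignores any difference between input and output token pricing.

```python
def click_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of one action given its token count and a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


def api_cost_at_ratio(agent_cost_usd: float, ratio: float = 45.0) -> float:
    """Implied API cost if the agent route is `ratio` times more expensive."""
    return agent_cost_usd / ratio


# 500,000 tokens is the reported figure; $3.00 per million is an ASSUMED price.
agent_cost = click_cost_usd(500_000, usd_per_million_tokens=3.00)
api_cost = api_cost_at_ratio(agent_cost)  # 45x is the reported upper figure

print(f"agent: ${agent_cost:.2f} per click")
print(f"API at a 45x gap: ${api_cost:.4f}")
```

Swap in your own model pricing to see how the sums change. The ratio is reported as what an agent "can be", so the gap in practice may be smaller.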

The company has published its benchmark tools on GitHub so you can test its approach to see if you get the same results.

In its blog, Reflex’s head of growth Palash Awasthi allows that better AI models will eventually lower costs. But he insists that agents will always need more steps than APIs to complete a job.

So maybe check that out before rushing to rent a cloudy desktop? ®