The Age of the All-Access AI Agent Is Here

For years, the cost of using “free” services from Google, Facebook, Microsoft, and other Big Tech firms has been handing over your data. Uploading your life to the cloud and using free tech brings convenience, but it puts personal information in the hands of giant corporations that are often looking to monetize it. Now, the next wave of generative AI systems is likely to want more access to your data than ever before.

Over the past two years, generative AI tools—such as OpenAI’s ChatGPT and Google’s Gemini—have moved beyond the relatively straightforward, text-only chatbots the companies initially released. Instead, Big AI is increasingly building, and pushing people to adopt, agents and “assistants” that promise to take actions and complete tasks on your behalf. The problem? To get the most out of them, you’ll need to grant them access to your systems and data. While much of the initial controversy over large language models (LLMs) centered on the flagrant copying of copyrighted material online, AI agents’ access to your personal data is likely to create a host of new problems.

“AI agents, in order to have their full functionality, in order to be able to access applications, often need to access the operating system or the OS level of the device on which you’re running them,” says Harry Farmer, a senior researcher at the Ada Lovelace Institute, whose research into AI assistants has found that they may pose a “profound threat” to cybersecurity and privacy. Personalizing chatbots or assistants, Farmer says, comes with data trade-offs. “All those things, in order to work, need quite a lot of information about you,” he says.

While there’s no strict definition of what an AI agent actually is, they’re often best thought of as generative AI systems or LLMs that have been given some level of autonomy. At the moment, agents or assistants, including AI web browsers, can take control of your device and browse the web for you, booking flights, conducting research, or adding items to shopping carts. Some can complete tasks that involve dozens of individual steps.

While current AI agents are glitchy and often can’t complete the tasks they’re given, tech companies are betting the systems will fundamentally change millions of people’s jobs as they become more capable. A key part of their utility will likely come from access to data: if you want a system that can keep track of your schedule and tasks, it’ll need access to your calendar, messages, emails, and more.

Some more advanced AI products and features provide a glimpse into how much access agents and systems could be given. Certain agents being developed for businesses can read code, emails, databases, Slack messages, files stored in Google Drive, and more. Microsoft’s controversial Recall product takes screenshots of your desktop every few seconds, so that you can search everything you’ve done on your device. Tinder has created an AI feature that can search through photos on your phone “to better understand” users’ “interests and personality.”

Carissa Véliz, an author and associate professor at the University of Oxford, says most of the time consumers have no real way to check if AI or tech companies are handling their data in the ways they claim to. “These companies are very promiscuous with data,” Véliz says. “They have shown to not be very respectful of privacy.”

The modern AI industry has never really been respectful of data rights. After the machine-learning and deep-learning breakthroughs of the early 2010s showed that systems produce better results when trained on more data, the race to hoover up as much information as possible intensified. Facial recognition firms, such as Clearview, scraped millions of photos of people from across the web. Google paid people just $5 for facial scans; government agencies allegedly used images of exploited children, visa applicants, and dead people to test their systems.

Fast-forward a few years, and data-hungry AI firms scraped huge swaths of the web and copied millions of books—often without permission or payment—to build the LLMs and generative AI systems they’re now expanding into agents. Having exhausted much of the web, many companies made training on user data their default, requiring people to opt out rather than opt in.

While some privacy-focused AI systems are being developed, and some privacy protections are in place, much of the data processing done by agents will take place in the cloud, and data moving from one system to another can create problems. One study, commissioned by European data regulators, outlined a host of privacy risks linked to agents, including how sensitive data could be leaked, misused, or intercepted; how systems could transmit sensitive information to external systems without safeguards in place; and how agents’ data handling could run afoul of privacy regulations.

“Even if, let's say, you genuinely consent and you genuinely are informed about how your data is used, the people with whom you interact might not be consenting,” Véliz, the Oxford associate professor, says. “If the system has access to all of your contacts and your emails and your calendar and you’re calling me and you have my contact, they're accessing my data too, and I don't want them to.”

The behavior of agents can also undermine existing security practices. So-called prompt-injection attacks, in which malicious instructions are hidden in text an LLM reads or ingests, can trick a system into leaking data. And if agents are given deep access to devices, they pose a threat to all the data stored on them.

“The future of total infiltration and privacy nullification via agents on the operating system is not here yet, but that is what is being pushed by these companies without the ability for developers to opt out,” Meredith Whittaker, the president of the Signal Foundation, which runs the encrypted Signal messaging app, told WIRED earlier this year. Agents that can access everything on your device or operating system pose an “existential threat” to Signal and application-level privacy, Whittaker said. “What we’re calling for is very clear developer-level opt-outs to say, ‘Do not fucking touch us if you’re an agent.’”

For individuals, Farmer of the Ada Lovelace Institute says, many people have already built up intense relationships with existing chatbots and may have shared huge volumes of sensitive data with them in the process, making them different from systems that have come before. “Be very careful about the quid pro quo when it comes to your personal data with these sorts of systems,” Farmer says. “The business model these systems are operating on currently may well not be the business model that they adopt in the future.”