OpenAI rolls out GPT-5.4 model focused on automating complex office tasks
OpenAI has introduced GPT-5.4, a new AI model designed to handle complex professional work such as spreadsheets, documents and presentations. The company says the model can also operate a computer on behalf of users across different applications.
by Ankita Garg · India Today
In Short
- OpenAI's new GPT-5.4 model can control a computer using keyboard and mouse commands
- The GPT-5.4 Thinking version inside ChatGPT is built for queries that need multiple steps of reasoning
- OpenAI claims the model is 33% less likely to produce incorrect claims than GPT-5.2
AI tools are gradually moving beyond answering questions or generating text. Technology companies are now trying to build systems that can carry out tasks on a computer in the same way a person would. With this goal in mind, OpenAI has introduced GPT-5.4, a new AI model that the company says is designed to handle complex professional work involving spreadsheets, documents and presentations.
The company says GPT-5.4 combines improvements in reasoning, coding and common office tasks. The model is also the first from OpenAI to come with native computer-use capabilities. This allows the system to operate a computer and perform actions across different applications on behalf of a user.
According to the company, GPT-5.4 can generate code that helps it control a computer. It can send keyboard and mouse commands and read screenshots to understand what is happening on the screen. This allows the AI system to work out what needs to be done next and perform tasks across different software tools.
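The screenshot-and-command cycle described above can be pictured as a simple loop: look at the screen, pick an action, apply it, repeat. The sketch below is only an illustration of that idea and not OpenAI's implementation. The `Action` type, the `FakeDesktop` environment and the `choose_action` function are all hypothetical names, and a small scripted rule stands in for the model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    payload: str = ""  # target description or text to type


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a real screen; it only tracks text state."""
    focused: bool = False
    text: str = ""
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real system would return pixels; here the "screenshot" is a description.
        return f"focused={self.focused} text={self.text!r}"

    def apply(self, action: Action) -> None:
        # Record the action, then update the simulated screen.
        self.log.append(action.kind)
        if action.kind == "click":
            self.focused = True
        elif action.kind == "type" and self.focused:
            self.text += action.payload


def choose_action(screenshot: str, goal: str) -> Action:
    """Scripted rule standing in for the model (an assumption for illustration)."""
    if "focused=False" in screenshot:
        return Action("click", "text box")
    if repr(goal) not in screenshot:
        return Action("type", goal)
    return Action("done")


def run_agent(desktop: FakeDesktop, goal: str, max_steps: int = 10) -> bool:
    """Observe, decide, act; stop when the rule reports the goal is met."""
    for _ in range(max_steps):
        action = choose_action(desktop.screenshot(), goal)
        if action.kind == "done":
            return True
        desktop.apply(action)
    return False


if __name__ == "__main__":
    desktop = FakeDesktop()
    finished = run_agent(desktop, "hello")
    print(finished, desktop.log, desktop.text)
```

The step cap (`max_steps`) is a design choice in this sketch, so that a loop that never reaches its goal still stops.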
The release comes as AI companies are working on systems that can act like digital agents capable of completing tasks online or inside applications. Over the past year, several such tools have appeared, including systems that can take control of a computer to perform tasks such as searching for products online or buying ingredients for a meal.
OpenAI has also introduced a reasoning-focused version of the model called GPT-5.4 Thinking. This version is being rolled out inside ChatGPT. The company says the model is designed to handle queries that require multiple steps before arriving at an answer.
Inside ChatGPT, GPT-5.4 Thinking can show an outline of how it is approaching a complex task. Users will also be able to modify their request while the model is generating a response instead of restarting the process.
“This makes it easier to guide the model toward the exact outcome you want without starting over or requiring multiple additional turns,” OpenAI says.
OpenAI also says the model performs better when dealing with questions that require collecting information from different sources. According to the company, the model “can more persistently search across multiple rounds to identify the most relevant sources, particularly for ‘needle-in-a-haystack’ questions, and synthesize them into a clear, well-reasoned answer.”
The company claims GPT-5.4 improves factual reliability as well. OpenAI describes it as its “most factual model yet,” saying individual claims produced by the system are 33 per cent less likely to be incorrect compared with GPT-5.2.
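The 33 per cent figure is a relative reduction, so what it means in absolute terms depends on how often GPT-5.2 is wrong, a number the article does not give. The short calculation below shows the arithmetic under a purely hypothetical baseline; the 10 per cent starting rate is an assumption chosen for illustration, not a reported result.

```python
def reduced_error_rate(baseline: float, relative_reduction: float) -> float:
    """Error rate after cutting a baseline rate by a relative fraction."""
    return baseline * (1.0 - relative_reduction)


if __name__ == "__main__":
    hypothetical_baseline = 0.10  # assumed for illustration only, not from OpenAI
    new_rate = reduced_error_rate(hypothetical_baseline, 0.33)
    print(f"{new_rate:.3f}")
```

Under that assumed baseline, 10 incorrect claims per 100 would fall to roughly 6.7 per 100.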
The model is also said to perform better when working with web browsers and external tools. OpenAI says GPT-5.4 can call tools and APIs more accurately and efficiently while completing tasks.
The rollout of GPT-5.4 is taking place across several OpenAI products. Developers will be able to access the model through the company’s API and through its AI-powered coding tool Codex.
The GPT-5.4 Thinking model is being introduced to ChatGPT for users subscribed to Plus, Team and Pro plans. OpenAI is also releasing GPT-5.4 Pro, which the company says is intended for maximum performance while handling complex tasks. This version will be available through the API and for ChatGPT Enterprise and Edu users.
At the moment, GPT-5.4 Thinking is available in the ChatGPT web app and on Android devices. OpenAI says support for the iOS version of the app will arrive soon.
The new model shows how AI tools are increasingly being designed to perform tasks inside software rather than simply generating responses to prompts.