GOV.UK chatbot gets smarter but slower as LLMs improve
Accuracy jumps from 76% to 90% across public pilots, while users wait nearly 11 seconds for answers
by SA Mathieson · The Register
More powerful large language models (LLMs) are helping make the UK government's in-development chatbot more accurate but are also slowing it down, according to the Government Digital Service (GDS).
GDS has run two public pilots of its GOV.UK Chat service, the first on a few pages of the GOV.UK website in late 2024 and the second in the GOV.UK app in autumn 2025. It reckons these show answer accuracy increasing from 76 percent to 90 percent, partly due to advances in LLMs and partly due to its own work on data science.
It had previously run a private pilot of the chatbot in 2023, which it later said did not meet its required levels of accuracy and in a few cases produced outright mistakes.
GDS reckons the chatbot, which only uses material from GOV.UK and includes links to source material, now scores more highly than mass-market AI assistants when answering government-related questions. Recent research by the Open Data Institute tested 11 LLMs with questions on GOV.UK material and found they often waffled, went beyond official information, or made mistakes.
However, the GDS research found that users want answers faster than the service's 10.7-second average response time.
"This year, the latest versions of frontier models have been more powerful but slower than previous versions," write GDS staffers Sam Dub and Sharon McDonald in a GOV.UK blog post. "For us, accuracy is the most important thing, and consequently GOV.UK Chat responses are slower than we'd ideally like."
In response, GDS is considering breaking up answers so the chatbot provides the first part while working on the rest, although Dub and McDonald note this will require substantial work including on safety guardrails.
According to the blog, the public pilots included 508 attempts to fool the service into providing an inappropriate or harmful response, all of which failed. The system, which uses Amazon's Bedrock platform and Anthropic's Claude models, also coped well with demand.
As a result of the pilots, the chatbot can now request clarification when users ask ambiguous questions, rather than refusing to provide an answer. In future, it might also pass queries to specific government departments when users want to speak to someone about their own circumstances.
GDS plans to add the chatbot to its GOV.UK app – something it promised for early 2026 in a December blog post – then work on implementing the service across the vast GOV.UK website later this year. ®