NIST tried to pull the pin on NTP servers after blackout caused atomic clock drift
A rare case of deliberately trying to induce an outage
by Simon Sharwood · The Register
A staffer at the USA’s National Institute of Standards and Technology (NIST) tried to disable backup generators powering some of its Network Time Protocol infrastructure, after a power outage around Boulder, Colorado, led to errors.
As explained in a mailing list post by Jeffrey Sherman, a NIST supervisory physicist who maintains the institute’s atomic clocks, “The atomic ensemble time scale at our Boulder campus has failed due to a prolonged utility power outage.”
Sherman, whose LinkedIn bio proclaims he is “One of the few federal employee actually paid to watch the clocks all day,” says one impact of the incident “is that the Boulder Internet Time Services no longer have an accurate time reference.”
That’s bad because one of the things NIST uses its atomic clocks for is to provide a Network Time Protocol service, the authoritative source of timing information that the computing world relies on so that diverse systems can synchronize events. If NTP isn’t working, outcomes can include difficulties authenticating between systems, meaning applications can become unstable.
At this point, readers might wonder why NIST can’t just turn off the inaccurate service. Sherman said a backup generator kicked in and kept the servers running.
“I will attempt to disable them [the generators] to avoid disseminating incorrect time,” he wrote.
But the storms that caused the outage were so severe that only emergency services personnel are allowed to visit the site.
His post says he has seen “strong evidence one of the crucial generators has failed. In the downstream path is the primary signal distribution chain, including to the Boulder Internet Time Service.”
“Another campus building houses additional clocks backed up by a different power generator; if these survive it will allow us to re-align the primary time scale when site stability returns without making use of external clocks or reference signals,” he added.
Xcel Energy, the local utility, blamed the outage on strong winds and as of Saturday night local time (7:00PM MT Dec 20, 2:00AM UTC Dec 21) advised most customers would have power again within three hours.
However, at the time of writing – 00:15 MT Dec 21 – NIST’s status page states the Boulder site is experiencing “Facility outages” and a “< 4.8us clock error.” That’s just under five microseconds.
NIST told CBS News it warned users such as telcos and aerospace organizations that weather around Boulder could cause problems, and advised them to tap the org’s other sources of time information.
That’s sound advice, as best practice for using NTP is to specify multiple servers and fail over from troubled sources of time info to accurate ones. This incident therefore shouldn’t trouble the prudent, but may catch some NTP users unawares if they rely solely on the Boulder facility’s time feeds. ®
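For readers who want to see why multiple sources help, here is a minimal Python sketch of the idea. It is a simplified stand-in, not the selection algorithm that real NTP clients implement: it takes clock offsets already measured against several servers, treats the median as the likely truth, and ignores any server that disagrees with it by more than a tolerance. The server names, the offsets, and the `select_offset` helper are all hypothetical and chosen purely for illustration; they do not describe NIST's servers or the size of the Boulder error.

```python
from statistics import median_low


def select_offset(offsets, tolerance=0.001):
    """Pick a consensus clock offset from several time sources.

    offsets   -- dict mapping a server name to its measured offset in seconds
    tolerance -- maximum disagreement with the median (seconds) before a
                 server is ignored; 1 ms is an arbitrary illustrative value
    Returns (consensus_offset, sorted list of trusted server names).
    """
    if not offsets:
        raise ValueError("no time sources supplied")

    # median_low always returns one of the supplied values, so at least
    # one server is guaranteed to fall within the tolerance.
    mid = median_low(offsets.values())

    # Keep only the servers that agree with the median.
    trusted = {
        name: off for name, off in offsets.items()
        if abs(off - mid) <= tolerance
    }

    # Average the agreeing servers to get the correction to apply.
    consensus = sum(trusted.values()) / len(trusted)
    return consensus, sorted(trusted)


# Hypothetical measurements: three servers agree, one has drifted badly.
measured = {
    "time-a.example": 0.00002,
    "time-b.example": -0.00001,
    "time-c.example": 0.00003,
    "drifting.example": 0.250,
}

consensus, trusted = select_offset(measured)
print(trusted)
print(consensus)
```

A client configured with only `drifting.example` would have nothing to compare against, which is the situation the advice above is meant to avoid.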