Commentary: The internet is deciding what to forget
Which tweets, posts or websites matter to our collective cultural memory? Answering that is proving difficult, says the Financial Times’ Elaine Moore.
by Elaine Moore · CNA
LONDON: The internet is so vast and all-consuming that it’s easy to forget how fragile it can be. Do something embarrassing online and there’s a good chance it will live there forever, shared without your consent.
But not everything that’s posted is permanent. The last big study of web pages found that over a third of those available in 2013 are now inaccessible – leaving a trail of “link rot” in their wake.
Maybe you think this is a good thing. If you’ve ever scrolled back far enough to see your very first Facebook status update you’ll probably wish that link was broken.
Right now there’s a trend for AI-generated videos of Love Island starring cartoon fruit that regularly get millions of views. Do digital bananas in Hawaiian shirts chatting up pineapples need to be saved for posterity? Probably not.
But disentangling what will and will not matter to our collective cultural memory is proving difficult.
ARCHIVE EVERY TWEET?
Efforts to save absolutely everything haven’t gone very well. There’s too much and a lot of it is nonsense.
In 2010, the United States Library of Congress took the view that Twitter was a crucial source of modern history and decided to archive every single tweet. It “may prove to be one of this generation’s most significant legacies to future generations”, the library wrote.
That “may” seems over-optimistic. To most people, the repository is both unwieldy and uninteresting. As of 2017, the library seems to agree. It now opts to save just a few select posts.
The risk in being selective, of course, is missing something important. Dutch consultant Maurice de Kunder has been following the number of web pages indexed by search engines for over a decade and found that it has fallen from 4.7 billion to 3.98 billion.
Some deletions are more deliberate than others. Last year, Elon Musk’s “Department of Government Efficiency” launched a project to eliminate up to 20 per cent of US federal websites. Particular words, such as climate change, also evaporated. A couple of months later, large companies began rewriting their own sites to also remove references to climate change.
We know this only because third parties were keeping track – the organisations themselves did not flag the changes.
THE FATE OF DIGITAL CONTENT
Because online content is regularly overwritten, what the historian Abby Smith Rumsey calls modern memory technology has a significantly shorter lifespan than pre-digital versions.
There is neither a single record of everything posted online nor an agreed-upon way to save it. This has become more noticeable with the death of digital publications.
You can see newspaper editions printed in 1665, the year the Great Plague of London began, but you can no longer visit a modern news site like Wales’s The National, which launched in 2021 and was then taken offline. Some sites, like Gawker, have been archived while others have disappeared into 404 errors (the status code that indicates a server can’t find a webpage).
A few have entered into a strange afterlife. When cult site The Hairpin was shut down in 2018, its domain was purchased by a Serbian entrepreneur called Nebojsa Vujinovic, who specialises in buying old news sites and filling them with AI-generated clickbait. Now it just redirects readers to an online gambling site.
SAVE IT YOURSELF
Despite relying heavily on digital data, we have left its preservation to a mishmash of individual efforts.
The best known is the Wayback Machine, an initiative from the American non-profit Internet Archive. This takes snapshots of websites (it has preserved over 1 trillion so far) but it doesn’t have everything.
Copyright owners can seek content removal and some sites have begun to blacklist the Wayback Machine, suspecting that AI companies are using it as a way to scrape content without permission. A report by the Nieman Lab found that the volume of snapshots dipped in the second half of 2025.
A second popular option is archive.today, a mysterious site operating under multiple domain names. How long it will last is anyone’s guess. Last year the FBI subpoenaed the unknown registrar behind it and Wikipedia recently asked editors to stop linking to it “due to concerns about botnets, linkspamming, and how the site is run”.
There is, of course, a sort of immortality in the fact that much of what exists online has been used to train AI models. But this isn’t much help if you want to trace something’s original form. Even online snapshots of web pages may prove less durable than physical archives.
We treat the internet as if it is limitless and permanent, but transience is inbuilt. If you see something online worth saving, you’d better do it yourself.