Link Rot

We analyzed roughly two million externally facing links found in New York Times articles published since the paper went online in 1996. We found that 25% of deep links have rotted; among links from 1998, fully 72% are dead. More than half of all NYT articles that contain deep links have at least one rotted link.

The benefits of the internet and the web's flexibility (including the license they give to build walled app gardens on top of them that reject the idea of a URL entirely) now come at great risk and cost to the larger tectonic enterprise to, in Google's early words, "organize the world's information and make it universally accessible and useful."

What are we going to do about the crisis we're in? A complementary approach to "saving everything" through independent scraping is for whoever creates a link to ensure that a copy of the target is saved at the moment the link is made. Authors of enduring documents, including scholarly papers, newspaper articles, and judicial opinions, can ask Perma to convert the links included within them into permanent ones archived at perma.cc; participating libraries treat snapshots of what's found at those links as accessions to their collections and undertake to preserve them indefinitely.

A technical infrastructure through which authors and publishers can preserve the links they draw on is a necessary start. But the problem of digital malleability extends beyond the technical. The law should hesitate before allowing the scope of remedies for claimed infringements of rights, whether economic ones like copyright or more personal, dignitary ones like defamation, to expand naturally as the ease of changing what's already been published increases.
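The "save a copy when you make the link" idea can be sketched in a few lines. This is a minimal illustration, not Perma's actual API: it assumes the Internet Archive's public Save Page Now endpoint (`https://web.archive.org/save/<url>`) and the Wayback Machine's snapshot-URL scheme, and the helper names are our own.

```python
"""Sketch: archive an outbound link at the moment it is created."""

from datetime import datetime, timezone


def save_request_url(target: str) -> str:
    """URL that, when fetched, asks the Wayback Machine to snapshot `target`."""
    return f"https://web.archive.org/save/{target}"


def snapshot_url(target: str, when: datetime) -> str:
    """Permanent Wayback-style URL for a snapshot of `target` taken at `when`."""
    stamp = when.strftime("%Y%m%d%H%M%S")  # Wayback timestamps are YYYYMMDDhhmmss
    return f"https://web.archive.org/web/{stamp}/{target}"


# A publishing pipeline could fetch save_request_url() for every outbound
# link before an article goes live, then cite snapshot_url() alongside
# (or instead of) the original, mutable link.
link = "http://example.com/story"
made = datetime(1998, 1, 1, tzinfo=timezone.utc)
print(save_request_url(link))
print(snapshot_url(link, made))
```

The point of the design is that the archived copy is addressed by content *and* time, so later changes or disappearance of the original page cannot silently rewrite what the citing document vouched for.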
