Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

New research shows how many important links on the web get lost to time

The Verge [unpaywalled]: “A quarter of the deep links in The New York Times’ articles are now rotten, leading to completely inaccessible pages, according to a team of researchers from Harvard Law School, who worked with the Times’ digital team. They found that this problem affected over half of the articles containing links in the NYT’s catalog going back to 1996, illustrating the problem of link rot and how difficult it is for context to survive on the web. The study looked at over 550,000 articles, which contained over 2.2 million links to external websites. It found that 72 percent of those links were “deep,” or pointing to a specific page rather than a general website. Predictably, it found that, as time went on, links were more likely to be dead: 6 percent of links in 2018 articles were inaccessible, while a whopping 72 percent of links from 1998 were dead. For a recent, widespread example of link rot in practice, just look at what happened when Twitter banned Donald Trump: all of the articles that were embedded in his tweets were littered with gray boxes. The team chose The New York Times in part because the paper is known for its archiving practices, but it’s not suggesting the Times is all that unusual in its link rot problems. Rather, it’s using the paper of record as an example of a phenomenon that happens all across the internet. As time goes by, the websites that once provided valuable insight, important context, or proof of contentious claims through links will be bought and sold, or simply just stop existing, leaving the link to lead to an empty page — or worse…”
