DH Read: “Why We Need To Archive The Web In Order To Preserve Twitter”

Ever since I read an article on the digital dark age for an independent study in college (fittingly, I can’t find it now), I’ve been concerned about how much of what we see on the internet today will still be accessible to future historians of the 21st century. The article argued that although historians studying the information age might seem to have a wealth of websites, data, and digital media to use as sources, much of this information is actually in danger of disappearing or becoming unreadable, and not enough is being done to preserve it. I am certain that strides have been made since that article came out, but I still don’t think it is alarmist to treat the digital dark age as a real possibility. A recent Forbes article by Kalev Leetaru reveals a particular danger that I had not even considered: short links on Twitter. The Library of Congress archives all public tweets, but if a shortened link no longer works, how much will anyone really be able to understand about the tweet? From the article:

As part of a series of projects looking at how social media interacts with civil discourse and the public information environment in conflict environments, I’ve been spending a fair amount of time pouring over historical Twitter data looking at who said what when in the leadup to major conflicts. However, in doing so I’ve noticed that often a tweet will contain only a link or will say something along the lines of “This. http://xyz” or “Join us this afternoon – see http://xyz.” Yet, when I followed the links in many of these tweets my path ended in 404 Not Found error or an error saying the short link had expired, been deleted, the bespoke shortening service no longer existed, or, on some occasions, an error message that the short link had been disabled due to pointing to content that violated the service’s terms of use.

This raises the question – if we archive every single public tweet that crosses Twitter, are we truly preserving Twitter?

The issues involved in fully preserving these links are more complicated than I can summarize here. The article has a lot of acronyms and jargon, but it’s still worth reading for anyone concerned about the future of history. It’s backed up with some frightening statistics on the state of link preservation, and it points out a bias in which links get preserved, favoring those sent by users in European languages. As Leetaru says, this bias “mirrors that of web archives as a whole and reinforces the critical need for our major web archiving initiatives to move beyond their English language Western roots and look more holistically at the global web.” In my opinion, this is a crucial step for decolonizing DH and the internet at large.

Read the full article here.