The Cobweb

The Wayback Machine is humongous, and getting humongouser. You can’t search it the way you can search the Web, because it’s too big and what’s in there isn’t sorted, or indexed, or catalogued in any of the many ways in which a paper archive is organized; it’s not ordered in any way at all, except by URL and by date. To use it, all you can do is type in a URL, and choose the date for it that you’d like to look at. It’s more like a phone book than like an archive. Also, it’s riddled with errors. One kind is created when the dead Web grabs content from the live Web, sometimes because Web archives often crawl different parts of the same page at different times: text in one year, photographs in another. In October, 2012, if you asked the Wayback Machine to show you what cnn.com looked like on September 3, 2008, it would have shown you a page featuring stories about the 2008 McCain-Obama Presidential race, but the advertisement alongside it would have been for the 2012 Romney-Obama debate. Another problem is that there is no equivalent to what, in a physical archive, is a perfect provenance. Last July, when the computer scientist Michael Nelson tweeted the archived screenshots of Strelkov’s page, a man in St. Petersburg tweeted back, “Yep. Perfect tool to produce ‘evidence’ of any kind.” Kahle is careful on this point. “We can say, ‘This is what we know. This is what our records say. This is how we received this information, from which apparent Web site, at this IP address.’ That this happened in the past is something that we can’t say, in an ontological way.” Nevertheless, screenshots from Web archives have held up in court, repeatedly. And, as Kahle points out, “They turn out to be much more trustworthy than most of what people try to base court decisions on.”
