Blast From the Past: The Internet Archive

Category: Reference

The Internet Archive is a nonprofit digital library whose stated mission is to provide “universal access to all knowledge.” It provides permanent storage for Web pages, text files, music, videos, images, e-books, and other digitized artifacts. Its entire contents are open to the public, free of charge. It even has a time travel feature. Here's how it works...

What's in the Internet Archive?

Of course, you won’t find “all knowledge” in the Internet Archive. Most digital artifacts are someone’s intellectual property, and they aren’t given away freely. But the Archive contains over 3 petabytes of public domain material and items contributed by supporters under Creative Commons licenses. (Just for reference, you’d need three thousand 1-terabyte hard drives to hold all that data.) Most of the material is old, its copyrights having expired. Most of it is of interest to rather limited audiences. But as they say, one man’s junk is another man’s treasure.

You can find novels dating back to the 15th Century; old-time radio programs; cult films such as “Nosferatu” and “Attack of the Giant Leeches.” You’ll also find obscure podcasts that someone didn’t want to pay to host, and really bad writing that a deluded author thought should be preserved forever. Garage bands upload their amateur music, but you’ll also find old Muddy Waters albums and even a collection of Grateful Dead performances. In fact, the Archive has an entire section of fan recordings of live concerts.

The Internet Archive is sort of like a Goodwill Industries store. People drop off donations of all kinds; the donations get cleaned up a little bit, loosely categorized, and that’s about it. You’re welcome to come in and browse, but don’t expect to find everything. What do you want for free?

The Internet Archive got started in 1996, when founder Brewster Kahle decided to preserve a copy of every Web page ever published. Kahle and his partner wrote a Web crawler that searched for publicly available pages and copied them to the Archive’s tape drives. There they sat for over five years, occasionally visited by researchers, until a Web-based interface called The Wayback Machine allowed the public to surf back through time.

The Wayback Machine - Digital Time Travel

The Wayback Machine is simple enough. Just enter a URL in its address box and click the “take me back” button. Up pops a set of calendar pages. Dates highlighted in blue are dates when one or more snapshots of the URL’s page were taken. Click a date to see what that page looked like on the date in question. In addition to the general index, there are some interesting historical collections, such as Web Pioneers, World Trade Center Attack - September 11, 2001 and Hurricane Katrina.
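If you'd rather skip the calendar, a snapshot address can also be typed or generated directly. Here is a minimal sketch in Python, assuming the address pattern that Wayback Machine snapshot links follow today (web.archive.org/web/, then a 14-digit timestamp, then the original URL). The function name wayback_url and the example.com address are made up for illustration, and the Archive may well serve the nearest snapshot it has rather than the exact moment requested.

```python
from datetime import datetime

# Assumption: snapshot links follow the pattern
#   https://web.archive.org/web/<YYYYMMDDhhmmss>/<original URL>
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, when: datetime) -> str:
    """Build a Wayback Machine address for page_url near the moment 'when'.

    This only assembles a string. It does not contact the Archive, so it
    cannot tell you whether a snapshot actually exists for that date.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")  # e.g. 20060115000000
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


if __name__ == "__main__":
    # Hypothetical example: example.com as it looked around mid-January 2006.
    print(wayback_url("http://example.com/", datetime(2006, 1, 15)))
```

Paste the printed address into a browser and you should land on whatever snapshot sits closest to that date, assuming one exists.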

The Wayback Machine has not captured every Web page that has existed since 1996, of course. It didn’t scan every day; often weeks or years passed between scans of the same site. But there are lots and lots of snapshots in the Wayback Machine archive. According to the website, there are over 150 billion web pages archived, dating from 1996 to “a few months ago.” But I can’t find any updates since July 2011. The site’s FAQ says to expect at least 6 (and up to 24) months before new web pages appear in the Wayback Machine.

If you're interested in the history of the Internet, and how the World Wide Web came to be, check out my related articles “History of the Internet” and “The First Internet Celebrity.”

Brewster Kahle is honored by many as an Internet librarian and preservationist. Others think he has a hoarding problem. Kahle’s latest project is to preserve one copy of every paper book ever printed. Personally, I think the Archive and the Wayback Machine are pretty cool, and appreciate the ability to see how lame my websites looked in the past. :-)

Your thoughts on this topic are welcome. Post your comment or question below...


This article was posted on 14 Aug 2012


Most recent comments on "Blast From the Past: The Internet Archive"

Posted by:

14 Aug 2012

'Internet Archive' is a wonderful site, check it out, you won't regret it.

Posted by:

14 Aug 2012

I have noticed that somehow folks can block the Wayback Machine from bringing up old, archived pages. I'm not sure how it happens; however, once I wanted to access a site that was no longer active and had no problems. When I went back a few months later to access it again, it was blocked by the author of the site. Thus it appears that not everything will be available, depending on the author's desires.

Posted by:

Bette Ide
15 Aug 2012

You missed Project Gutenberg, a wonderful source of ebooks that are free of copyright.

Posted by:

15 Aug 2012

Thanks for bringing this to our attention.
The Wayback Machine is a wonderful resource, but like Charlie wrote, some sites are blocked by the owners of the site.
Another problem is that there's a long delay before sites or updates get posted.
Sometimes 8 or 9 months or more.
I know it's a tremendous job and all done by volunteers, which they seem to be short of.
But I had the good fortune to pick up some old programs that I couldn't find anywhere else.

Posted by:

Dr. Rohan H. Wickramasinghe
15 Aug 2012

Thank you for drawing my attention to The Internet Archive and The Wayback Machine. I deeply appreciate the help you have given me over the years.

Posted by:

24 Sep 2012

Thank you Bob, it's really interesting to know that such sites exist... Thanks again.


Copyright © 2005 - Bob Rankin - All Rights Reserved

Article information: AskBobRankin -- Blast From the Past: The Internet Archive (Posted: 14 Aug 2012)