What's Inside the Wayback Machine?


The Internet Archive... is it a massive collection of electrons, or a non-profit organization whose mission is to provide “universal access to all knowledge”? Actually, it's both. As part of that mission, the Internet Archive has been crawling the Web for over 20 years, making copies of Web pages and preserving them for posterity. Today, the IA’s servers store approximately 440 billion Web pages, 25 million books, and millions of images, audio recordings, and videos, along with 550,000 games and software programs. Read on to learn how you can access this amazing resource, which offers a window into both the history and the present of the Internet...

What Are the Internet Archive and the Wayback Machine?

The Internet Archive was founded by Brewster Kahle, an Internet pioneer and computer programmer who helped to develop WAIS (Wide Area Information Server), a text-based precursor to the World Wide Web. Kahle and others founded WAIS, Inc., to commercialize the text-searching technology; their clients included Ross Perot’s 1992 Presidential campaign, the EPA, the Library of Congress, the Dept. of Energy, the Wall Street Journal, and Encyclopedia Britannica.

WAIS, Inc., was sold to AOL in 1995 (which is why you've probably never heard of it) and Kahle went on to found The Internet Archive and the Alexa web stats service (not to be confused with Amazon’s Alexa virtual assistant).

The front-end to this massive library is the Wayback Machine (which fans of Dr. Peabody and Sherman will recognize). It allows journalists, researchers, and the nostalgically curious to search for older versions of Web pages, even if the pages no longer exist on the Web. If you want to see what Yahoo.com looked like in October 1996, or view snapshots of WhiteHouse.gov over time, it's in there.
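For readers who like to tinker, here is a minimal sketch in Python of how a lookup address for an old snapshot can be put together. It assumes the commonly seen link layout web.archive.org/web/(timestamp)/(original URL), where the timestamp is a 14-digit YYYYMMDDhhmmss value. The function name snapshot_url is made up for illustration and is not part of any official IA tool.

```python
from datetime import datetime

# Assumed prefix of Wayback Machine snapshot links.
WAYBACK_PREFIX = "https://web.archive.org/web"


def snapshot_url(page_url: str, when: datetime) -> str:
    """Build a Wayback Machine address asking for a capture near `when`."""
    # Wayback timestamps are 14 digits: YYYYMMDDhhmmss.
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


# Example: ask for Yahoo.com as it looked in October 1996.
print(snapshot_url("http://yahoo.com/", datetime(1996, 10, 20)))
```

Pasting the printed address into a browser should take you to the capture nearest that date, assuming one exists.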


The Wayback Machine also lets you submit a web page’s URL for archiving and get back a URL that will work even if the page is deleted or moved from its original site. These permanent links are increasingly important. Web URLs have gained widespread acceptance as citations in students’ term papers, Ph.D. dissertations, scientific research publications, and even court filings and opinions. A “404 - not found” error is a big deal in a legal document, and the Wayback Machine helps avoid such problems. It can also search for archived copies of a missing page, given its now-defunct URL.
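If you'd rather check for an archived copy from a script, the rough Python sketch below shows one way it might look. It assumes the Internet Archive's public availability endpoint at archive.org/wayback/available, which replies with a small JSON document. The helper names (availability_query, closest_snapshot) are hypothetical, and the sample reply is hand-written for illustration, not real archive data. No network request is made here; you would fetch the query address yourself.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Assumed address of the Wayback Machine availability lookup.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str, timestamp: str = "") -> str:
    """Build the lookup address for a page, optionally near a YYYYMMDD date."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the archived copy's address from a JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Hand-written sample reply (illustrative only, not real archive data).
SAMPLE_REPLY = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20200101000000/http://example.com/",
            "timestamp": "20200101000000",
            "status": "200",
        }
    }
})

print(availability_query("http://example.com/page"))
print(closest_snapshot(SAMPLE_REPLY))
```

When no copy exists, the reply is assumed to carry an empty "archived_snapshots" entry, in which case closest_snapshot returns None.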

To make finding a lost page even easier, a browser extension is available for Chrome, and an add-on for Firefox. Once installed, it automatically searches the IA whenever your browser runs into a "page not found" message or one of several other errors while trying to fetch a web page. (In tech terms, that would be an HTTP error number 404, 408, 410, 451, 500, 502, 503, 504, 509, 520, 521, 523, 524, 525, or 526.) If archived copies of the page are found, a notification window lets you choose whether to explore them.
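To picture the decision the extension makes, here is a tiny Python sketch. The list of error numbers comes straight from the paragraph above. The helper name should_check_archive is hypothetical, and the real extension's code may well work differently.

```python
# Error numbers listed in the article as triggering an archive lookup.
LOOKUP_STATUS_CODES = frozenset({
    404, 408, 410, 451,
    500, 502, 503, 504, 509,
    520, 521, 523, 524, 525, 526,
})


def should_check_archive(status_code: int) -> bool:
    """Return True when a failed page load is worth looking up in the IA."""
    return status_code in LOOKUP_STATUS_CODES


print(should_check_archive(404))  # a classic "page not found"
print(should_check_archive(200))  # a page that loaded normally
```

Note that the list covers more than missing pages: the 5xx numbers signal trouble on the server's side, where an archived copy can be just as handy.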

But Wait, There's More!


The Internet Archive isn't just about Web pages, though. As part of its lofty goal "to provide Universal Access to All Knowledge," the IA is busily scanning books into its databases, much like Google Books does. It also preserves copies of old video games (and the emulators needed to play an Atari game on a PC), software, music, movies, videos (including TV news broadcasts and live concerts), and even animated GIFs. The Internet Archive is headquartered in the former Fourth Church of Christ, Scientist, a neoclassical building with Greek columns on Funston Avenue in the Richmond District of San Francisco, California. As of May 2019, the IA held over 45 petabytes of data. A petabyte is one million gigabytes. Wow.
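To put that figure in more familiar units, here is a quick back-of-the-envelope conversion in Python. It assumes decimal units, the same convention behind "a petabyte is one million gigabytes," and the function name is made up for illustration.

```python
# Decimal (SI) units: 1 petabyte = 10**15 bytes, 1 gigabyte = 10**9 bytes.
GIGABYTES_PER_PETABYTE = 1_000_000


def petabytes_to_gigabytes(petabytes: float) -> float:
    """Convert petabytes to gigabytes using decimal units."""
    return petabytes * GIGABYTES_PER_PETABYTE


# The article's May 2019 figure of 45 petabytes.
print(f"{petabytes_to_gigabytes(45):,.0f} GB")
```

In other words, 45 petabytes works out to 45 million gigabytes.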

It's worth your time to browse the "Top Collections at the Archive," where you'll find curated collections related to a wide variety of interests including Old Time Radio, MS-DOS Games, old magazines, and dozens of esoteric topics. One thing that was new since my last visit was Electric Sheep, a collection of animated and evolving fractal flames that make great screensavers. Let me know what you find there!

Your thoughts on this topic are welcome. Post your comment or question below.

 

This article was posted on 29 May 2020



Most recent comments on "What's Inside the Wayback Machine?"

Posted by: Dan, 29 May 2020

I'm actually a little disappointed that it wasn't about Sherman and Peabody's Way Back Machine. :) However, still fascinating. Thanks Bob.


Posted by: Bill C, 29 May 2020

OMG Bob, this one link will keep you on my Christmas list FOREVER!!


Posted by: PurplePenny, 29 May 2020

The Wayback Machine has recently been big news in the UK. Dominic Cummings, Special Advisor to the PM, claimed that he had warned of coronaviruses a year ago. Sure enough, a blog post from 2019 did mention coronaviruses.

Sadly for him, someone checked the Wayback Machine and discovered that prior to April 2020 that reference was not in his blog post at all.


Posted by: SteveD, 29 May 2020

I don't believe that Mr. Peabody got his doctorate.


Posted by: Otto, 29 May 2020

All I get from Recent version and First version is 'An error has occurred'. Tried several times.

Perhaps I don't know how to use it.


Posted by: Kathleen Dombrowski, 29 May 2020

Hi Bob, Last time you posted this I was up all night and had to drink 2 pots of coffee at work the next day (eye drops too). TGIF, so I can really binge to see what's new. This time I pinned it to my Taskbar. I have followed you for many years and always enjoy forgotten repeats.


Posted by: Stuart Berg, 29 May 2020

Hi Bob,
The Wayback Machine was very helpful for me when I found the website for my genealogy software (ScionPC) was gone! I assume they had folded. I not only found the website on the Wayback Machine but was also able to give a relative the Wayback link to download the software. It still works. For anyone curious, the ScionPC genealogy software is now here:
https://web.archive.org/web/20181130011145/http://homepages.paradise.net.nz/scionpc/download.html#


Posted by: Renaud Olgiati, 29 May 2020

I often find the Wayback Machine very useful for reading, a few days after the newspaper's publication date, articles that had been kept behind a paywall.


Posted by: Donald R. Snow, 29 May 2020

Great article, Bob. I just gave three family history classes on Internet Archive within the last month since it's got such great stuff saved. The videos of a couple of them are already posted on the UVTAGG Facebook page. Besides genealogy, I use the IA for music, both old sheet music (fake books) and digitizations of old 78 RPM jazz phonograph records. But, like someone said, you lose track of time when you start looking there.
BTW, I just clicked on my Chrome extension to tell The Wayback Machine to save your article, since it said it had not been saved.
Don


Posted by: Ted King, 30 May 2020

So Bob, do you mean that after I have read all the current pages on the Internet... I can then find more to read?


Posted by: Ed. Shorer, 31 May 2020

Talk about a wormhole! I have enjoyed going down it from time to time. I owned an Internet Cybercafe in the '90s where I taught Internet classes (using your Roadmap!) and wrote the cafe's webpages. It's nice to be able to go back and enjoy those pages that I thought were long gone.


Posted by: Mike Davies, 16 Jun 2020

Full of out-of-copyright stuff too, even old films.



Copyright © 2005 - Bob Rankin - All Rights Reserved


Article information: AskBobRankin -- What's Inside the Wayback Machine? (Posted: 29 May 2020)
Source: https://askbobrankin.com/whats_inside_the_wayback_machine.html