Internet Archive and the Wayback Machine

Category: Reference

The Internet Archive is either one HUGE pile of data, or a non-profit organization whose mission is to provide "universal access to all knowledge." Or both. As part of that mission, the Internet Archive has been crawling the Web for over 20 years, making copies of Web pages and preserving them for posterity. Today, approximately 280 billion Web pages from 1.5 billion sites are stored on the IA’s servers. Read on to learn how you can access this amazing resource that offers a window into the history of the Internet...

What is the Internet Archive?

The Internet Archive was founded by Brewster Kahle, a computer engineer who helped to develop WAIS (Wide Area Information Server), a command-line driven precursor to the World Wide Web. Kahle and others founded WAIS, Inc., to commercialize the text-searching technology; their clients included Ross Perot’s 1992 Presidential campaign, the EPA, the Library of Congress, the Dept. of Energy, the Wall Street Journal, and Encyclopedia Britannica.

WAIS, Inc., was sold to AOL in 1995 (which is why you've probably never heard of it), and Kahle went on to found the Internet Archive and the Alexa search engine (not to be confused with Amazon’s Alexa).

The front end to this massive library is the Wayback Machine (a name that fans of Mr. Peabody and Sherman will recognize). It allows journalists, researchers, and the nostalgically curious to search for older versions of Web pages, even if the pages no longer exist on the Web. If you want to see what Yahoo.com looked like in October 1996, or view snapshots of WhiteHouse.gov over time, it's in there.
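For the curious, here is a minimal Python sketch of how an address for an old snapshot can be put together. It assumes the commonly seen Wayback Machine address pattern of web.archive.org/web/ followed by a date stamp and the original URL, and it assumes the service hands back the capture nearest to the date you ask for. The function name snapshot_url is made up for this illustration.

```python
from datetime import date

# Assumed address pattern: https://web.archive.org/web/<YYYYMMDD>/<original URL>
WAYBACK_PREFIX = "https://web.archive.org/web"


def snapshot_url(page_url: str, when: date) -> str:
    """Build a Wayback Machine address for a page as of a given date."""
    timestamp = when.strftime("%Y%m%d")  # e.g. 19961031
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


# Yahoo.com near the end of October 1996
print(snapshot_url("yahoo.com", date(1996, 10, 31)))
```

Paste the printed address into a browser and, if a capture exists, you should land on the copy closest to that date.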


The Wayback Machine also lets you submit a page’s URL for archiving, and get back a URL that will work even if the page is deleted or moved from its original site. These permanent links are increasingly important. Web URLs have gained widespread acceptance as citations in students’ term papers, Ph.D. dissertations, scientific research publications, even court filings and opinions. A “404 - not found” error is a big deal in a legal document, and the Wayback Machine helps avoid such problems. It can also search for archived copies of a missing page, given the page's now-dead URL.
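If you would rather look up a missing page from a script, the sketch below shows one way it might be done in Python. It assumes the Internet Archive's public "availability" lookup at archive.org/wayback/available, which answers with a small JSON document describing the closest archived snapshot. The function names are invented for this example, and the reply format is described as I understand it, so treat this as a starting point rather than a definitive recipe.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed lookup address for the Wayback Machine availability service
AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_query(page_url: str) -> str:
    """Build the lookup address for a possibly missing page."""
    return f"{AVAILABILITY_API}?{urlencode({'url': page_url})}"


def closest_snapshot(response_text: str) -> Optional[str]:
    """Pull the archived copy's address out of the JSON reply, if there is one."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None  # nothing archived for this URL


def find_archived_copy(page_url: str, timeout: float = 10.0) -> Optional[str]:
    """Ask the service for an archived copy (this makes a network request)."""
    with urlopen(availability_query(page_url), timeout=timeout) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling find_archived_copy("example.com/old-page") would return the address of an archived copy if one exists, or None if the Archive has nothing for that URL.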

To make finding a lost page even easier, a browser extension is available for Chrome, and an add-on for Firefox. Once installed, it automatically searches the IA every time you run into one of the various "page not found" errors your browser may return when you try to fetch a web page. (In tech terms, that would be error number 404, 408, 410, 451, 500, 502, 503, 504, 509, 520, 521, 523, 524, 525, or 526.) If archived copies of the page are found, a notification window lets you choose whether to explore them.
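To give a feel for the kind of check involved, here is a small Python sketch. It is not the extension's actual code, just an illustration that uses the list of error numbers above; the names ARCHIVE_LOOKUP_STATUSES and should_check_archive are made up for this example.

```python
# Error numbers listed above that would prompt a search of the archive
ARCHIVE_LOOKUP_STATUSES = frozenset(
    {404, 408, 410, 451, 500, 502, 503, 504, 509, 520, 521, 523, 524, 525, 526}
)


def should_check_archive(status_code: int) -> bool:
    """Return True when a failed page load is worth looking up in the archive."""
    return status_code in ARCHIVE_LOOKUP_STATUSES


for code in (200, 404, 503):
    print(code, should_check_archive(code))
```

A normal page load (status 200) is left alone, while a 404 or 503 would lead to a lookup.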

But Wait, There's More!


In addition to Web pages, the Internet Archive is busily scanning books into its databases, much like Google Books does. It also preserves copies of old video games (and the emulators needed to play an Atari game on a PC), software, music, movies, videos, and animated GIFs. The Internet Archive's headquarters is located in the former Fourth Church of Christ, Scientist, a neoclassical building with Greek columns on Funston Avenue, in the Richmond District of San Francisco, California. As of October 2016, the IA held over 15 petabytes of data. A petabyte is one million gigabytes. Wow.

It's worth your time to browse the "Top Collections at the Archive," where you'll find curated collections related to a wide variety of interests including Old Time Radio, MS-DOS Games, The Grateful Dead, old magazines, and dozens of esoteric topics. Let me know what you find there!

Your thoughts on this topic are welcome. Post your comment or question below...

 
This article was posted on 13 Feb 2017



Most recent comments on "Internet Archive and the Wayback Machine"

Posted by:

Wild Bill
13 Feb 2017

Bob, I installed "No More 404s" and my Avast is telling me that this software has a very poor reputation with their users. I generally expect you not to give bum steers. What up? A quick search did not turn up any obvious complaints.


Posted by:

GuitarRebel
13 Feb 2017

Sorry to be cynical, but the "Internet of Things" has a huge red bulls-eye on it.
It is no longer needed in today's 'alternative facts' world and will soon be targeted for destruction.
Our entire internet experience will soon be only things approved in advance for our eyes to see.
Don't think it can happen? Time will tell, sooner rather than later.


Posted by:

Butch
13 Feb 2017

Bob, unlike Wild Bill, I had/have no problem reaching the Wayback Machine via Chrome. No message from Avast, etc.

Thanks for this interesting article.


Posted by:

Paul
13 Feb 2017

Hi Bob, great article. I have used the Internet Archive to download many hours of old time radio. A great resource! Nice alternative to radio or recorded books on long car trips. I especially like the old mystery/adventure programs like CBS Radio Mystery Theater, Adventures of Sam Spade, Yours Truly Johnny Dollar, etc. to name just a few. The old Gunsmoke programs starring William Conrad are very good also; I prefer them to the TV show.


Posted by:

Clairvaux
13 Feb 2017

Very interesting article. Learned a few things on Wayback Machine I didn't know. Including the fact you can ask them to keep URLs for you.


Posted by:

Clairvaux
13 Feb 2017

And here is how to feed the Wayback Machine for your own needs. Straight from the horse's mouth:

https://blog.archive.org/2017/01/25/see-something-save-something/


Posted by:

Craig T
13 Feb 2017

The Firefox extension did not install on my computer. No error message... it just did not work.


Posted by:

Clairvaux
13 Feb 2017

Ironically, Ask Bob Rankin seems to disallow crawlers, and therefore can't be permanently linked to through Wayback Machine... Not complaining, mind you, just experimenting.


Posted by:

Darryl
13 Feb 2017

Thanks for this! I used to listen to Chickenman every night at work, and I've never been able to find any episodes on the internet. Now I have the whole series to listen to again. :-)


Posted by:

Jay R
13 Feb 2017

It would not install. (The install bar kept showing moving slash lines. For half an hour.) I couldn't seem to find this on the Firefox site. It seemed like a good idea. Frankly, I get tired of seeing the 404.


Posted by:

Ken Heikkila
13 Feb 2017

I just installed mine on Chrome no problem. Great little icon in the corner that allows you to save a page, see a recent version or the first version with a couple of clicks, very cool. Great tool in this post-fact/alt-fact world.


Posted by:

Bob K
13 Feb 2017

I tried the Firefox 'no-404' addon, several times - and it did not work.


Posted by:

Sarah L
13 Feb 2017

Wikipedia uses this when "link rot" leads to dead links. A bot is set up to find the original page, which then asks people to check that it got the correct page. I did not know I could use the Wayback Machine for myself. Cool


Posted by:

RichF
14 Feb 2017

Bob, I have enough trouble keeping up with the current internet!!!!


Posted by:

Old Man
16 Feb 2017

I tried the link you provided and found a whole bunch of old Sci-Fi movies. I downloaded several already, and will probably get more. When I get my fill of old Sci-Fi movies, I'll check out some of their other offerings - other type movies, old books from my childhood, etc.
This was a very good suggestion - Thanks a whole bunch.


Copyright © 2005 - Bob Rankin - All Rights Reserved


Article information: AskBobRankin -- Internet Archive and the Wayback Machine (Posted: 13 Feb 2017)
Source: http://askbobrankin.com/internet_archive_and_the_wayback_machine.html