[HOWTO] Searching The Deep Web
There are over 60 trillion individual Web pages as of this writing. But that's only the very tip of the iceberg. Beyond what popular search engines offer up, there is a universe of information that's online and discoverable -- if you have the right skills and tools. Here's how to gain access to the rest of the Web...
What's Out There on the Deep Web?
Sixty trillion is a big number. According to Google's "How Search Works" page, the number of Web pages has doubled since 2013, and in 2008 there were "only" one trillion Web pages. Clearly, none of us will ever lack for reading material. And that's only the beginning of what's available online.
General search engines like Google, Bing, and Yahoo! index only the "surface Web," pages that have unique URLs such as the one that appears in your browser's address box right now. They don't even index all of the surface Web. Some website owners prefer not to have their sites (or portions of them) indexed, so they use a file called robots.txt that tells search engines, "Don't index this."
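For the curious, here is a minimal sketch in Python of how a well-behaved crawler might consult robots.txt rules before fetching a page. The rules, the bot name "ExampleBot," and the example.com addresses are made up for illustration; only Python's standard urllib.robotparser module is real. Strictly speaking, the file tells a crawler which pages it may fetch, and a page that is never fetched generally cannot be indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path below is invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for page in ("https://example.com/index.html",
                 "https://example.com/private/notes.html"):
        print(page, "->", can_crawl(ROBOTS_TXT, "ExampleBot", page))
```

Compliance is voluntary: the file is a request to crawlers, not a lock on the door.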
Search engines choose to exclude many other surface Web pages from their indexes for a variety of reasons including relevance, legality, and violations of search optimization policies. Other pages are locked behind passwords, intended only for those who are granted access.
Beneath the surface Web lies the "deep Web," a mass of information estimated to be 500 times greater than the 60 trillion surface pages discovered by Google. By their very nature, deep Web resources cannot be found by the Web-crawling software that search engines use to find and index pages.
Note that the "deep" Web is not the same as the "dark" Web where criminals lurk. On the dark Web, everyone tries to hide their identities as well as what they're doing. The deep Web consists of perfectly legitimate information and its users.
Some of these deep Web pages can be accessed only by a user clicking or manually typing a link that has not been indexed by a search engine. Other deep Web pages can be accessed only by a user who directly enters a query in a search form. The desired data exist in a database, not on a Web page that a crawler can find by following links from other pages. The data retrieved in response to a user query are displayed as an ephemeral "dynamic" Web page that lasts only until the user moves on.
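To make the idea of a "dynamic" page concrete, here is a small Python sketch. It assumes a tiny in-memory SQLite database standing in for a real catalog; the table name, the titles, and the function names are all invented for illustration. The point is that the results page does not exist anywhere until someone submits a search term.

```python
import sqlite3


def build_catalog() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a few made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (title TEXT, year INTEGER)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        [
            ("A History of Lighthouses", 1987),
            ("Lighthouse Keeping", 2001),
            ("Railway Maps", 1954),
        ],
    )
    conn.commit()
    return conn


def render_results(conn: sqlite3.Connection, term: str) -> str:
    """Build a results page on the fly for one search term."""
    rows = conn.execute(
        "SELECT title, year FROM records WHERE title LIKE ? ORDER BY year",
        (f"%{term}%",),  # parameterized query; the term is never pasted into the SQL
    ).fetchall()
    lines = [f"Results for '{term}':"]
    lines += [f"- {title} ({year})" for title, year in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    catalog = build_catalog()
    print(render_results(catalog, "lighthouse"))
```

A crawler that only follows links never types "lighthouse" into the search box, so it never sees this page.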
Some Deep Web Search Tools
The Library of Congress Online Catalog is a good example of a deep Web resource. It's a database containing over 18 million records of books, periodicals, audio recordings, photographs, and more. None of its records can be retrieved directly through Google. You need to visit the LOC's Online Catalog page and enter search terms in the appropriate boxes.
Web archives such as the Wayback Machine store copies of websites that have been modified or deleted. Such archived pages are not indexed by search engines, which strive to index the current version. I've often found the Wayback Machine handy when I want to see what a website looked like in the past. (Want to see what Yahoo.com looked like in October of 1996? It's in there.)
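If you'd rather script a lookup than click through the calendar, here is a rough Python sketch that builds Wayback Machine addresses. It relies on my understanding of the archive's URL layout (web.archive.org/web/ followed by a timestamp and the original address) and of its availability endpoint; the function names are my own, and the timestamp is simply a guess at a date in October 1996, not a known snapshot. The archive normally steers you to the capture nearest the date you ask for.

```python
from urllib.parse import urlencode


def wayback_replay_url(url: str, timestamp: str) -> str:
    """Address of the archived copy nearest to timestamp (YYYYMMDD)."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


def wayback_availability_query(url: str, timestamp: str) -> str:
    """Address of the availability lookup, which answers in JSON."""
    params = urlencode({"url": url, "timestamp": timestamp})
    return f"https://archive.org/wayback/available?{params}"


if __name__ == "__main__":
    # Only builds the addresses; paste them into a browser to try them.
    print(wayback_replay_url("yahoo.com", "19961001"))
    print(wayback_availability_query("yahoo.com", "19961001"))
```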
To find deep Web material via Google et al., try adding the term "database" to your search query. "Plane crash database," "drug interaction database," "government grants database," and so on, will often lead to the home page of a database where you can enter search terms specific to that resource.
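The trick is simple enough to automate. This short Python sketch appends "database" to a topic and builds an ordinary Google search address from it; the function names are hypothetical, and the q= parameter is just the familiar query field you see in the address bar after a search.

```python
from urllib.parse import urlencode


def database_query(topic: str) -> str:
    """Append the word 'database' to a search topic."""
    return f"{topic.strip()} database"


def google_search_url(topic: str) -> str:
    """Build a Google search address for '<topic> database'."""
    return "https://www.google.com/search?" + urlencode({"q": database_query(topic)})


if __name__ == "__main__":
    for topic in ("plane crash", "drug interaction", "government grants"):
        print(google_search_url(topic))
```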
There are also paid tools such as LexisNexis and Factiva, which professional researchers use to find information about legal and business topics. Genealogy researchers can find a wealth of free information online, but often the best sources require payment. Ancestry.com is one such example. It's also becoming more common for online newspapers to limit free content and erect paywalls that require a subscription to view more than current headlines.
General search engines suffice for most needs. Scholars, journalists, and other serious researchers often must resort to the deep Web. Do you use any of these tools to access online information that's not available with a quick Google search? Got any search tips of your own to share? Your thoughts on this topic are welcome. Post your comment or question below...
This article was posted by Bob Rankin on 8 Feb 2016
Copyright © 2005 - Bob Rankin - All Rights Reserved